time align using sphinx-3

Anonymous
2003-03-07
2012-09-22
  • Anonymous

    Anonymous - 2003-03-07

    How can I time-align a given audio file using sphinx-3? I was using sphinx-2 before, and it had sphinx2-align to do this. What is the equivalent in sphinx-3?

    I have the audio files and the corresponding text. I want to time align the words.

    thanks
    mj

     
    • Anonymous

      Anonymous - 2003-03-07

      I have successfully used s3align for force-aligning; is that the same (or close enough) to what you mean by time-aligning?  See my 2003-02-21 posting under "Force-aligning for Sphinx2 models?" in this forum.

      Alternatively, I believe there is a timealign program in one of the Sphinx3 decoders, but I don't know details.

       
      • robert b

        robert b - 2003-03-11

        > Alternatively, I believe there is a timealign program in one of the Sphinx3 decoders, but I don't know details.

        I tried using time-align with Sphinx2 models.  Carl Quillen provided me with some Sphinx3 scripts which I was hoping to use to build Sphinx2-format models, but I couldn't get it to work.  I also tried hacking some of the code and scripts that build the Sphinx2 models to get it to work with time-align, but I couldn't get that to work either.  (It had to do with matrix dimension problems.  I could never get the dimensions just right and probably would actually need to understand the algorithms to do that :-(.)

        I'm hoping to find time to try the s3align thing.

         
    • Anonymous

      Anonymous - 2003-03-07

      I don't need to do any recognition. I have a bunch of sound files and their corresponding transcripts (the sentences uttered in those files), and I need to get the time stamps for each word in a sound file.

      sphinx2-batch had options for specifying the file names in a control file (-ctlfn) and also the corresponding transcripts in another file (-tactlfn).

      madan.

       
    • Anonymous

      Anonymous - 2003-03-07

      See the archive_s3/et94-align.csh file for arguments for running s3align.  Use -insentfn for the transcriptions and -ctlfn for the list of utterances.  Use -wdsegdir to specify a directory in which to write word segmentations, which look like:

               SFrm  EFrm    SegAScr Word
                  0    15    1375896 <s>
                 16    55    3918595 WE
                 56   101    4015687 WE
                102   143    5338447 <sil>
                144   165     304749 HAVE
                166   218    2888821 FARM
                219   242     537819 TWO
                243   286    5335998 HOURS
                287   339    5461385 AWAY
                340   356    1970228 </s>
      Total score:    31147625
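
      If what you want is per-word time stamps, a word segmentation file like the one above can be post-processed with a short script. What follows is only a rough sketch in Python, not part of Sphinx: the names `parse_wdseg` and `word_times` are made up for illustration, and the conversion assumes 100 frames per second (a 10 ms frame shift), so adjust `FRAMES_PER_SECOND` if your front end is configured differently. It also assumes that filler entries are the ones wrapped in angle brackets, as in the sample.

```python
from typing import List, NamedTuple, Tuple

# Assumption: 100 frames per second (10 ms frame shift). Change this if
# your feature extraction used a different frame rate.
FRAMES_PER_SECOND = 100


class WordSegment(NamedTuple):
    start_frame: int
    end_frame: int
    score: int
    word: str


def parse_wdseg(text: str) -> List[WordSegment]:
    """Parse the 'SFrm EFrm SegAScr Word' rows of a word segmentation file."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields and start with a frame number; this
        # skips the column header, blank lines, and the "Total score" line.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        start, end, score = (int(f) for f in fields[:3])
        segments.append(WordSegment(start, end, score, fields[3]))
    return segments


def word_times(
    segments: List[WordSegment],
    fps: int = FRAMES_PER_SECOND,
    keep_fillers: bool = False,
) -> List[Tuple[str, float, float]]:
    """Return (word, start_seconds, end_seconds) for each segment.

    The end frame is treated as inclusive, so the end time is the start
    of the following frame.
    """
    result = []
    for seg in segments:
        is_filler = seg.word.startswith("<") and seg.word.endswith(">")
        if is_filler and not keep_fillers:
            continue
        result.append((seg.word, seg.start_frame / fps, (seg.end_frame + 1) / fps))
    return result


SAMPLE = """\
         SFrm  EFrm    SegAScr Word
            0    15    1375896 <s>
           16    55    3918595 WE
           56   101    4015687 WE
          102   143    5338447 <sil>
          144   165     304749 HAVE
Total score:    31147625
"""

if __name__ == "__main__":
    for word, start, end in word_times(parse_wdseg(SAMPLE)):
        print(f"{word}\t{start:.2f}\t{end:.2f}")
```

      With `keep_fillers=True` the `<s>`, `<sil>` and `</s>` entries are kept as well, which can be handy if you also want the silence boundaries.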

       
