Forced Alignment

Help
Diwakar.G
2016-12-05
2018-06-04
  • Diwakar.G

    Diwakar.G - 2016-12-05

    If information like the following is available, we have the forced alignment directly, i.e. samples 2210 to 5080 correspond to 'she' and samples 0 to 2209 correspond to 'SIL'.
    2210 5080 she
    5080 9370 had
    9370 10760 your
    10760 15840 dark
    15840 19258 suit
    19258 21360 in
    21360 27864 greasy
    27864 34464 wash
    34464 38642 water
    39477 43180 all
    43180 48569 year
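For readers who want to work with a listing like this, here is a minimal Python sketch that parses the "start end word" lines, fills the unlabeled gaps with a silence label, and converts sample numbers to seconds. The 16 kHz sample rate and the helper names (`parse_segments`, `fill_silences`, `to_seconds`) are assumptions for illustration; the post does not state the sample rate. Segment ends are treated as exclusive, which matches the listing, where each end equals the next start.

```python
# Word segments in "start_sample end_sample word" form, copied from the post.
SEGMENTS = """\
2210 5080 she
5080 9370 had
9370 10760 your
10760 15840 dark
15840 19258 suit
19258 21360 in
21360 27864 greasy
27864 34464 wash
34464 38642 water
39477 43180 all
43180 48569 year
"""

SAMPLE_RATE_HZ = 16000  # assumption: the sample rate is not given in the post


def parse_segments(text):
    """Return a list of (start_sample, end_sample, word) tuples."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue
        start, end, word = line.split()
        segments.append((int(start), int(end), word))
    return segments


def fill_silences(segments, label="SIL"):
    """Insert a silence segment wherever consecutive words leave a gap."""
    filled = []
    cursor = 0  # first sample not yet covered by any segment
    for start, end, word in segments:
        if start > cursor:
            filled.append((cursor, start, label))
        filled.append((start, end, word))
        cursor = end
    return filled


def to_seconds(sample, rate=SAMPLE_RATE_HZ):
    """Convert a sample index to a time in seconds."""
    return sample / rate


if __name__ == "__main__":
    for start, end, word in fill_silences(parse_segments(SEGMENTS)):
        print(f"{to_seconds(start):7.3f} {to_seconds(end):7.3f} {word}")
```

Besides the leading silence, this also exposes the unlabeled gap between 'water' and 'all' (samples 38642 to 39477).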

    After training and testing, the results we get look like this:
    she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
    she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
    Words: 11 Correct: 11 Errors: 0 Percent correct = 100.00% Error = 0.00% Accuracy = 100.00%

    How can this be a speech-to-text alignment? Is there any way to get sample-number or timing information?
    Please clarify.

     
  • Diwakar.G

    Diwakar.G - 2016-12-05

    I looked at the configuration file and found an option for forced alignment. It was initially set to 'no', and I have now set it to 'yes'. But I am getting the following error. Please help me.
    `
    sitecsp@acl-pg-06:~/DYSARTHRIC/an4$ sphinxtrain run
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Extracting features from segments starting at (part 1 of 1)
    Extracting features from segments starting at (part 1 of 1)
    Feature extraction is done
    MODULE: 00 verify training files
    Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
    Found 30 words using 25 phones
    Phase 2: Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
    Phase 4: Checking number of lines in the transcript file should match lines in fileids file
    Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
    Estimated Total Hours Training: 0.647533333333333
    This is a small amount of data, no comment at this time
    Phase 6: Checking that all the words in the transcript are in the dictionary
    Words in dictionary: 27
    Words in filler dictionary: 3
    Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 02 Train MLLT transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    MODULE: 10 Training Context Independent models for forced alignment and VTLN
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -161.308512646282
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -158.718598785133
    Convergence Ratio = 2.58991386114866
    Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -155.527943649405
    Convergence Ratio = 3.19065513572841
    Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -153.912110916641
    Convergence Ratio = 1.61583273276406
    Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -153.477470057312
    Convergence Ratio = 0.434640859329505
    Baum welch starting for 1 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -153.290349703147
    Convergence Ratio = 0.187120354165017
    Baum welch starting for 1 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -153.185850578263
    Convergence Ratio = 0.1044991248842
    Baum welch starting for 1 Gaussian(s), iteration: 8 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 8
    Current Overall Likelihood Per Frame = -153.134587666015
    Training completed after 8 iterations
    MODULE: 11 Force-aligning transcripts
    Skipped: No sphinx3_align(.exe) found in /usr/local/libexec/sphinxtrain
    If you wish to do force-alignment, please copy or link the
    sphinx3_align binary from Sphinx 3 to /usr/local/libexec/sphinxtrain
    and either define $CFG_MODEL_DIR in sphinx_train.cfg or
    run context-independent training first.

    `
    After that I made the following changes in the configuration file:

    # (yes/no) Train multiple-gaussian context-independent models (useful
    # for alignment, use 'no' otherwise) in the models created
    # specifically for forced alignment
    $CFG_FALIGN_CI_MGAU = 'yes';
    # (yes/no) Train multiple-gaussian context-independent models (useful
    # for alignment, use 'no' otherwise)
    $CFG_CI_MGAU = 'yes';
    # (yes/no) Train context-dependent models
    $CFG_CD_TRAIN = 'yes';
    # Number of tied states (senones) to create in decision-tree clustering
    $CFG_N_TIED_STATES = 200;
    # How many parts to run Forward-Backward estimatinon in
    $CFG_NPART = 1;
    
    # (yes/no) Train a single decision tree for all phones (actually one
    # per state) (useful for grapheme-based models, use 'no' otherwise)
    $CFG_CROSS_PHONE_TREES = 'no';
    
    # Use force-aligned transcripts (if available) as input to training
    $CFG_FORCEDALIGN = 'yes';
    

    Even then I get the same error. Please help me. What should the output of forced alignment look like?

    sitecsp@acl-pg-06:~/DYSARTHRIC/an4$ sphinxtrain run
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Extracting features from  segments starting at  (part 1 of 1) 
    Extracting features from  segments starting at  (part 1 of 1) 
    Feature extraction is done
    MODULE: 00 verify training files
        Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
            Found 30 words using 25 phones
        Phase 2: Checking to make sure there are not duplicate entries in the dictionary
        Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
        Phase 4: Checking number of lines in the transcript file should match lines in fileids file
        Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
            Estimated Total Hours Training: 0.647533333333333
            This is a small amount of data, no comment at this time
        Phase 6: Checking that all the words in the transcript are in the dictionary
            Words in dictionary: 27
            Words in filler dictionary: 3
        Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 02 Train MLLT transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    MODULE: 10 Training Context Independent models for forced alignment and VTLN
        Phase 1: Cleaning up directories:
        accumulator...logs...qmanager...models...
        Phase 2: Flat initialize
        Phase 3: Forward-Backward
            Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 1
            Current Overall Likelihood Per Frame = -161.308512646282
            Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 2
            Current Overall Likelihood Per Frame = -158.718598785133
            Convergence Ratio = 2.58991386114866
            Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 3
            Current Overall Likelihood Per Frame = -155.527943649405
            Convergence Ratio = 3.19065513572841
            Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 4
            Current Overall Likelihood Per Frame = -153.912110916641
            Convergence Ratio = 1.61583273276406
            Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 5
            Current Overall Likelihood Per Frame = -153.477470057312
            Convergence Ratio = 0.434640859329505
            Baum welch starting for 1 Gaussian(s), iteration: 6 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 6
            Current Overall Likelihood Per Frame = -153.290349703147
            Convergence Ratio = 0.187120354165017
            Baum welch starting for 1 Gaussian(s), iteration: 7 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 7
            Current Overall Likelihood Per Frame = -153.185850578263
            Convergence Ratio = 0.1044991248842
            Baum welch starting for 1 Gaussian(s), iteration: 8 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 8
            Current Overall Likelihood Per Frame = -153.134587666015
            Split Gaussians, increase by 1
            Current Overall Likelihood Per Frame = -153.134587666015
            Convergence Ratio = 0.0512629122483759
            Baum welch starting for 2 Gaussian(s), iteration: 1 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 1
            Current Overall Likelihood Per Frame = -153.629242595834
            Baum welch starting for 2 Gaussian(s), iteration: 2 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 2
            Current Overall Likelihood Per Frame = -152.910189093655
            Convergence Ratio = 0.719053502179435
            Baum welch starting for 2 Gaussian(s), iteration: 3 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 3
            Current Overall Likelihood Per Frame = -152.24939085075
            Convergence Ratio = 0.660798242905145
            Baum welch starting for 2 Gaussian(s), iteration: 4 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 4
            Current Overall Likelihood Per Frame = -151.657314938742
            Convergence Ratio = 0.592075912008085
            Baum welch starting for 2 Gaussian(s), iteration: 5 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 5
            Current Overall Likelihood Per Frame = -151.33661072789
            Convergence Ratio = 0.320704210851545
            Baum welch starting for 2 Gaussian(s), iteration: 6 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 6
            Current Overall Likelihood Per Frame = -151.181363464772
            Convergence Ratio = 0.155247263117701
            Baum welch starting for 2 Gaussian(s), iteration: 7 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 7
            Current Overall Likelihood Per Frame = -151.101787981743
            Split Gaussians, increase by 2
            Current Overall Likelihood Per Frame = -151.101787981743
            Convergence Ratio = 0.0795754830293163
            Baum welch starting for 4 Gaussian(s), iteration: 1 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 1
            Current Overall Likelihood Per Frame = -151.568516421291
            Baum welch starting for 4 Gaussian(s), iteration: 2 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 2
            Current Overall Likelihood Per Frame = -150.884724939085
            Convergence Ratio = 0.683791482205919
            Baum welch starting for 4 Gaussian(s), iteration: 3 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 3
            Current Overall Likelihood Per Frame = -150.405470331858
            Convergence Ratio = 0.479254607227347
            Baum welch starting for 4 Gaussian(s), iteration: 4 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 4
            Current Overall Likelihood Per Frame = -149.926129928961
            Convergence Ratio = 0.47934040289681
            Baum welch starting for 4 Gaussian(s), iteration: 5 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 5
            Current Overall Likelihood Per Frame = -149.672775318302
            Convergence Ratio = 0.253354610659045
            Baum welch starting for 4 Gaussian(s), iteration: 6 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 6
            Current Overall Likelihood Per Frame = -149.544682384433
            Convergence Ratio = 0.128092933868771
            Baum welch starting for 4 Gaussian(s), iteration: 7 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 7
            Current Overall Likelihood Per Frame = -149.463734170699
            Split Gaussians, increase by 4
            Current Overall Likelihood Per Frame = -149.463734170699
            Convergence Ratio = 0.080948213733933
            Baum welch starting for 8 Gaussian(s), iteration: 1 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 1
            Current Overall Likelihood Per Frame = -149.922097532517
            Baum welch starting for 8 Gaussian(s), iteration: 2 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 2
            Current Overall Likelihood Per Frame = -149.273310683277
            Convergence Ratio = 0.648786849240281
            Baum welch starting for 8 Gaussian(s), iteration: 3 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 3
            Current Overall Likelihood Per Frame = -148.915242458561
            Convergence Ratio = 0.358068224716305
            Baum welch starting for 8 Gaussian(s), iteration: 4 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 4
            Current Overall Likelihood Per Frame = -148.523242046741
            Convergence Ratio = 0.392000411819538
            Baum welch starting for 8 Gaussian(s), iteration: 5 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 5
            Current Overall Likelihood Per Frame = -148.303819623185
            Convergence Ratio = 0.219422423555557
            Baum welch starting for 8 Gaussian(s), iteration: 6 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 6
            Current Overall Likelihood Per Frame = -148.192242355606
            Convergence Ratio = 0.111577267579122
            Baum welch starting for 8 Gaussian(s), iteration: 7 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
            Normalization for iteration: 7
            Current Overall Likelihood Per Frame = -148.125021448917
    Training for 8 Gaussian(s) completed after 7 iterations
    MODULE: 11 Force-aligning transcripts
    Skipped: No sphinx3_align(.exe) found in /usr/local/libexec/sphinxtrain
    If you wish to do force-alignment, please copy or link the
    sphinx3_align binary from Sphinx 3 to /usr/local/libexec/sphinxtrain
    and either define $CFG_MODEL_DIR in sphinx_train.cfg or
    run context-independent training first.
    
     
    • Arseniy Gorin

      Arseniy Gorin - 2016-12-05

      It seems self-explanatory: you are missing the sphinx3_align tool, which you want to use for forced alignment.
      You should try to download and install it from https://github.com/skerit/cmusphinx/tree/master/sphinx3
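As a quick way to confirm the diagnosis before rerunning the training, the following Python sketch checks whether an executable named `sphinx3_align` (or `sphinx3_align.exe`) exists in the directory the log mentions. The function name `find_aligner` and the fallback search of `PATH` are assumptions for illustration only; according to the log, SphinxTrain itself looks in its binaries directory.

```python
import os
import shutil


def find_aligner(libexec_dir, name="sphinx3_align"):
    """Return the path to the aligner binary, or None if it cannot be found.

    Looks in libexec_dir first (the directory named in the log message),
    then falls back to searching PATH as a diagnostic aid.
    """
    for candidate in (name, name + ".exe"):
        path = os.path.join(libexec_dir, candidate)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return shutil.which(name)


if __name__ == "__main__":
    found = find_aligner("/usr/local/libexec/sphinxtrain")
    print(found or "sphinx3_align not found; copy or link it as the log suggests")
```

If the binary turns up only on `PATH`, copying or linking it into the directory from the log message is what the message asks for.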

       
      • sheharyar masood

        Sir, how can I download it?

         
  • Diwakar.G

    Diwakar.G - 2016-12-05

    Sir, should forced alignment give me sample-number information?
    If information like the following is available, we have the forced alignment directly, i.e. samples 2210 to 5080 correspond to 'she' and samples 0 to 2209 correspond to 'SIL'.
    2210 5080 she
    5080 9370 had
    9370 10760 your
    10760 15840 dark
    15840 19258 suit
    19258 21360 in
    21360 27864 greasy
    27864 34464 wash
    34464 38642 water
    39477 43180 all
    43180 48569 year
    After training and testing, the results we get look like this:
    she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
    she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
    Words: 11 Correct: 11 Errors: 0 Percent correct = 100.00% Error = 0.00% Accuracy = 100.00%
    How can this be a speech-to-text alignment? Is there any way to get sample-number or timing information?
    Please clarify.

     
    • Arseniy Gorin

      Arseniy Gorin - 2016-12-05

      In fact, I do not understand your question. Forced alignment is used when you do not have time information (you only have the sentence transcript).

      In your example the times are written out explicitly, so why do you need the alignment at all? Perhaps try to reformulate your question.

       
  • Diwakar.G

    Diwakar.G - 2016-12-05

    To interpret the speech-to-text alignment results, the sample numbers of the word boundaries (onsets and offsets) are required. But CMU Sphinx does not give the alignment results in the form shown above. Instead it gives the following result, which is not very intuitive:
    she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
    she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
    I am unable to interpret this result.

    Is there any way to get timing information from the alignment result?

     
    • Arseniy Gorin

      Arseniy Gorin - 2016-12-05

      OK, now it's clear: you seem to be talking about the result/an4.align file. This is not a text-to-audio alignment but a reference-to-hypothesis string alignment, used to compute the error rate.
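To illustrate what a reference-to-hypothesis string alignment measures, here is a minimal Python sketch that counts word errors with a standard edit distance. It is not the scoring code Sphinx uses; the function name `word_errors` is hypothetical, and the sketch only shows why identical reference and hypothesis strings give 0 errors and carry no timing information.

```python
def word_errors(reference, hypothesis):
    """Return (errors, reference_word_count) using word-level edit distance.

    Errors are the minimum number of substitutions, insertions and
    deletions needed to turn the reference into the hypothesis.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] is the distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


if __name__ == "__main__":
    sentence = "she had your dark suit in greasy wash water all year"
    errors, words = word_errors(sentence, sentence)
    print(f"Words: {words} Errors: {errors} Error = {100.0 * errors / words:.2f}%")
```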

      What you seem to need is just the time information from the decoder. In pocketsphinx this is achieved with the -ctm option. The CTM format gives you the time of each word.

      Note that forced alignment is a totally different procedure. Forced alignment means you have a ground-truth transcript and your goal is to get the time information. You do not use the decoder for that, but the aligner.
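As a rough illustration of the CTM output mentioned above, the sketch below parses lines in the commonly used CTM layout (utterance id, channel, start time, duration, word, optional confidence). The sample lines are made up for the example, and the exact columns written by a given pocketsphinx version should be checked against its real output; `CtmWord` and `parse_ctm` are hypothetical names.

```python
from typing import NamedTuple


class CtmWord(NamedTuple):
    utterance: str
    channel: str
    start: float     # seconds
    duration: float  # seconds
    word: str

    @property
    def end(self):
        """End time in seconds."""
        return self.start + self.duration


# Hypothetical example lines, not real decoder output.
EXAMPLE = """\
;; comment lines start with two semicolons
utt1 1 0.26 0.70 she
utt1 1 0.96 0.30 had 0.98
"""


def parse_ctm(text):
    """Parse CTM text into a list of CtmWord records."""
    words = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;"):
            continue
        # An optional sixth field (confidence) is ignored here.
        utt, channel, start, duration, word = line.split()[:5]
        words.append(CtmWord(utt, channel, float(start), float(duration), word))
    return words


if __name__ == "__main__":
    for w in parse_ctm(EXAMPLE):
        print(f"{w.word}: {w.start:.2f}s to {w.end:.2f}s")
```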

       
  • Diwakar.G

    Diwakar.G - 2016-12-05

    Yes sir, that is what I need: for a given transcription, which part of the signal corresponds to 'she', and so on. If I get timing information from the forced alignment, I can easily do this. Sir, can you please explain in detail how I can do this? Where should I set the ctm option? Please tell me.

     
    • Nickolay V. Shmyrev

      The sphinx3_align tool has a -wdsegdir option, not a ctm option, to dump word times.

      The command line is:

      sphinx3_align -hmm <acoustic_model> -dict <dictionary> -fdict <filler_dictionary> -ctl <fileids_file> -insent <transcription_file> -cepdir <mfc_dir>  -wdsegdir <output_wdseg_dir>
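If the command needs to be run from a script, one possible way is to assemble the argument list in Python, using only the options shown in the command line above. The function name `build_align_command` and all paths in the usage example are placeholders, not values from this thread.

```python
import shlex


def build_align_command(hmm, dictionary, filler_dict, ctl, transcript, cepdir, wdsegdir):
    """Build the sphinx3_align argument list from the options quoted above."""
    return [
        "sphinx3_align",
        "-hmm", hmm,
        "-dict", dictionary,
        "-fdict", filler_dict,
        "-ctl", ctl,
        "-insent", transcript,
        "-cepdir", cepdir,
        "-wdsegdir", wdsegdir,
    ]


if __name__ == "__main__":
    # Placeholder paths; replace them with your own model and data.
    cmd = build_align_command(
        hmm="model_dir",
        dictionary="my.dic",
        filler_dict="my.filler",
        ctl="my.fileids",
        transcript="my.transcription",
        cepdir="feat",
        wdsegdir="wdseg_out",
    )
    print(shlex.join(cmd))
    # To actually run it (requires sphinx3_align to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids quoting problems when paths contain spaces.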
      
       
  • Diwakar.G

    Diwakar.G - 2016-12-06

    Sir, I am currently using pocketsphinx, and sphinx3_align(.exe) is not there. However, when I first used Sphinx 3, its build directory contained a sphinx3_align executable. Can I use that one, or can you please provide a link so that I can download it?
    Thanks in advance

     

    Last edit: Diwakar.G 2016-12-06
  • Diwakar.G

    Diwakar.G - 2016-12-06

    Sir, I finally got this:

         SFrm  EFrm    SegAScr Word
            0     4     -13424 <s>
            5    25     -35534 <s>
           26    95    -631131 RUBOUT
           96   164    -459649 <sil>
          165   197    -243354 T
          198   200     -81896 <sil>
          201   244    -139962 G
          245   256    -141954 <sil>
          257   306    -338001 J
          307   314    -104440 <sil>
          315   357    -375252 W
          358   400    -239127 B
          401   468    -220867 <sil>
          469   516    -277984 SEVENTY
          517   554    -230317 NINE
          555   587    -157324 FIFTY
          588   630    -263817 NINE
          631   635     -35982 </s>
          636   638     -62163 </s>
     Total score:    -4052178
    

    But I am getting frame-number information, i.e. RUBOUT spans frames 26 to 95. Is there any way to get either sample-number or timing information instead, e.g. RUBOUT corresponds to samples 16000 to 36000, or to 1.5 to 2.5 sec in the audio signal?

    sphinx3_align -hmm /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd -dict an4.dic -fdict an4.filler -ctl an4_train.fileids -insent an4_train.transcription -cepdir /home/sitecsp/Documents/FORCE/an4/feat -wdsegdir /home/sitecsp/Documents/FORCE/an4
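One way to turn the frame numbers into times and sample numbers is sketched below in Python. It assumes 100 frames per second (the -frate default that sphinx3_align prints in its configuration dump) and a 16 kHz sample rate, which is an assumption about the audio and is not stated in the post. EFrm is treated as inclusive, since each segment starts one frame after the previous one ends. The results are approximate, because a frame covers an analysis window rather than a sharp boundary.

```python
FRAME_RATE = 100        # frames per second (sphinx3_align -frate default)
SAMPLE_RATE_HZ = 16000  # assumption: sample rate of the audio

# A few rows copied from the word segmentation output in the post.
WDSEG = """\
     SFrm  EFrm    SegAScr Word
        0     4     -13424 <s>
        5    25     -35534 <s>
       26    95    -631131 RUBOUT
       96   164    -459649 <sil>
 Total score:    -4052178
"""


def parse_wdseg(text):
    """Return (start_frame, end_frame, word) for every data row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields and begin with two frame numbers;
        # this skips the header and the "Total score" line.
        if len(fields) != 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        rows.append((int(fields[0]), int(fields[1]), fields[3]))
    return rows


def frames_to_times(start_frame, end_frame, frame_rate=FRAME_RATE):
    """Start and end time in seconds; the end frame is inclusive."""
    return start_frame / frame_rate, (end_frame + 1) / frame_rate


def frames_to_samples(start_frame, end_frame,
                      frame_rate=FRAME_RATE, sample_rate=SAMPLE_RATE_HZ):
    """Start and end sample index; the end frame is inclusive."""
    samples_per_frame = sample_rate // frame_rate
    return start_frame * samples_per_frame, (end_frame + 1) * samples_per_frame


if __name__ == "__main__":
    for sfrm, efrm, word in parse_wdseg(WDSEG):
        t0, t1 = frames_to_times(sfrm, efrm)
        s0, s1 = frames_to_samples(sfrm, efrm)
        print(f"{word:8s} {t0:6.2f}s to {t1:6.2f}s  samples {s0} to {s1}")
```

Under these assumptions, RUBOUT (frames 26 to 95) covers roughly 0.26 s to 0.96 s, or samples 4160 to 15360.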
    

    Thank you.

     
  • Diwakar.G

    Diwakar.G - 2016-12-06

    Sir, when I tried to run it on some other mfc files, I got the following error. Please help me.

    INFO: feat.c(1205): At directory /home/sitecsp/DR/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR/an4/feat/an4_clstk/DR8/MRRE0/MRRE0-SA1.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 42.44  2.21 -4.83  4.28 -12.19 -4.37 -24.89  9.50 -4.19 -7.13  3.09 -4.73  0.46 
    INFO: main_align.c(1009): MRRE0-SA1: 404 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MRRE0-SA1
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   404 frames,    0.04 sec CPU,   0.01 xRT;    0.04 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MRRE0-SA1:    0.0 sec CPU,    0.0 sec Clk;  TOT:     21.3 sec CPU,     21.9 sec Clk
    
    INFO: feat.c(1205): At directory /home/sitecsp/DR/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR/an4/feat/an4_clstk/DR8/MRRE0/MRRE0-SA2.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 43.67  2.70 -7.54 -0.29 -13.10 -6.50 -29.84 11.79 -6.49 -1.61  4.74 -7.29  5.05 
    INFO: main_align.c(1009): MRRE0-SA2: 306 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MRRE0-SA2
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   306 frames,    0.03 sec CPU,   0.01 xRT;    0.03 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MRRE0-SA2:    0.0 sec CPU,    0.0 sec Clk;  TOT:     21.3 sec CPU,     21.9 sec Clk
    
    INFO: feat.c(1205): At directory /home/sitecsp/DR/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR/an4/feat/an4_clstk/DR8/MTCS0/MTCS0-SA1.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 39.68  5.80 -11.96  9.20 -13.26 -5.05 -8.06 -8.04  8.89 -12.84  2.64 -2.76  0.66 
    INFO: main_align.c(1009): MTCS0-SA1: 302 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MTCS0-SA1
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   302 frames,    0.03 sec CPU,   0.01 xRT;    0.03 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MTCS0-SA1:    0.0 sec CPU,    0.0 sec Clk;  TOT:     21.4 sec CPU,     21.9 sec Clk
    
    INFO: feat.c(1205): At directory /home/sitecsp/DR/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR/an4/feat/an4_clstk/DR8/MTCS0/MTCS0-SA2.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 42.33 10.22 -18.50  5.65 -16.41 -3.37 -13.11 -11.32  7.78 -11.17  5.06 -1.35  3.80 
    INFO: main_align.c(1009): MTCS0-SA2: 233 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MTCS0-SA2
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   233 frames,    0.02 sec CPU,   0.01 xRT;    0.02 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MTCS0-SA2:    0.0 sec CPU,    0.0 sec Clk;  TOT:     21.4 sec CPU,     21.9 sec Clk
    
    TOTAL FRAMES:         233112
    TOTAL CPU TIME:           21.22 sec,    0.01 xRT
    TOTAL ELAPSED TIME:       21.26 sec,    0.01 xRT
    sitecsp  14081  0.0  0.0   4448   764 pts/0    S+   10:15   0:00 sh -c ps aguxwww | grep s3align
    sitecsp  14083  0.0  0.0  15944  2228 pts/0    R+   10:15   0:00 grep s3align
    
     
  • Diwakar.G

    Diwakar.G - 2016-12-07

    sphinx3_align gives frame numbers only for the an4 database. When I tried rm1 or some other database, it threw an error. Can somebody please help me?

     
    • Nickolay V. Shmyrev

      Sure, as soon as you provide error details.

       
  • Diwakar.G

    Diwakar.G - 2016-12-07

    For the rm1 database, I am getting the following error:

    sitecsp@acl-pg-06:~/Documents/ALIGN/an4$ sphinx3_align -hmm /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd  -dict an4.dic -fdict an4.filler -ctl an4_train.fileids -insent an4_train.transcription -cepdir /home/sitecsp/Documents/ALIGN/an4/feat -phsegdir /home/sitecsp/Documents/ALIGN/an4/pdsegd -wdsegdir /home/sitecsp/Documents/ALIGN/an4/wdsegd
    INFO: info.c(65): Host: 'acl-pg-06'
    INFO: info.c(69): Directory: '/home/sitecsp/Documents/ALIGN/an4'
    INFO: info.c(73): sphinx3_align Compiled on: Dec 22 2013, AT: 15:13:45
    
    INFO: cmd_ln.c(691): Parsing command line:
    sphinx3_align \
        -hmm /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd \
        -dict an4.dic \
        -fdict an4.filler \
        -ctl an4_train.fileids \
        -insent an4_train.transcription \
        -cepdir /home/sitecsp/Documents/ALIGN/an4/feat \
        -phsegdir /home/sitecsp/Documents/ALIGN/an4/pdsegd \
        -wdsegdir /home/sitecsp/Documents/ALIGN/an4/wdsegd 
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -adchdr     0       0
    -adcin      no      no
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -beam       1e-64       1.000000e-64
    -cb2mllr    .1cls.      .1cls.
    -cepdir             /home/sitecsp/Documents/ALIGN/an4/feat
    -cepext     .mfc        .mfc
    -ceplen     13      13
    -ci_pbeam   1e-80       1.000000e-80
    -cmn        current     current
    -cmninit    8.0     8.0
    -cond_ds    no      no
    -ctl                an4_train.fileids
    -ctlcount   1000000000  1000000000
    -ctloffset  0       0
    -ctl_mllr           
    -dict               an4.dic
    -dist_ds    no      no
    -ds     1       1
    -fdict              an4.filler
    -feat       1s_c_d_dd   1s_c_d_dd
    -featparams         
    -frate      100     100
    -gs             
    -gs4gs      yes     yes
    -hmm                /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd
    -hyp                
    -hypseg             
    -insent             an4_train.transcription
    -insert_sil 1       1
    -kdmaxbbi   -1      -1
    -kdmaxdepth 0       0
    -kdtree             
    -lambda             
    -lda                
    -ldadim     0       0
    -log3table  yes     yes
    -logbase    1.0003      1.000300e+00
    -logfn              
    -lts_mismatch   no      no
    -maxcdsenpf 100000      100000
    -mdef               
    -mean               
    -mixw               
    -mixwfloor  0.0000001   1.000000e-07
    -mllr               
    -outsent            
    -phlabdir           
    -phsegdir           /home/sitecsp/Documents/ALIGN/an4/pdsegd
    -s2cdsen    no      no
    -s2stsegdir         
    -senmgau    .cont.      .cont.
    -stsegdir           
    -subvq              
    -subvqbeam  3.0e-3      3.000000e-03
    -svq4svq    no      no
    -svspec             
    -tighten_factor 0.5     5.000000e-01
    -tmat               
    -tmatfloor  0.0001      1.000000e-04
    -topn       4       4
    -var                
    -varfloor   0.0001      1.000000e-04
    -varnorm    no      no
    -vqeval     3       3
    -wdsegdir           /home/sitecsp/Documents/ALIGN/an4/wdsegd
    
    INFO:   Initialization of the log add table
    INFO:   Log-Add table size = 29350 x 2 >> 0
    INFO:   
    INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO:   Reading HMM in Sphinx 3 Model format
    INFO:   Model Definition File: /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mdef
    INFO:   Mean File: /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means
    INFO:   Variance File: /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances
    INFO:   Mixture Weight File: /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mixture_weights
    INFO:   Transition Matrices File: /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/transition_matrices
    INFO: mdef.c(682): Reading model definition: /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mdef
    INFO:   Initialization of mdef_t, report:
    INFO:   48 CI-phone, 133500 CD-phone, 3 emitstate/phone, 144 CI-sen, 6144 Sen, 32639 Sen-Seq
    INFO:   
    INFO: kbcore.c(288): Using optimized GMM computation for Continuous HMM, -topn will be ignored
    INFO: cont_mgau.c(163): Reading mixture gaussian file '/home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means'
    INFO: cont_mgau.c(422): 6144 mixture Gaussians, 8 components, 1 streams, veclen 39
    INFO: cont_mgau.c(163): Reading mixture gaussian file '/home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances'
    INFO: cont_mgau.c(422): 6144 mixture Gaussians, 8 components, 1 streams, veclen 39
    INFO: cont_mgau.c(510): Reading mixture weights file '/home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mixture_weights'
    ERROR: "cont_mgau.c", line 653: Weight normalization failed for 3 senones
    INFO: cont_mgau.c(665): Read 6144 x 8 mixture weights
    INFO: cont_mgau.c(693): Removing uninitialized Gaussian densities
     6 7 8
    WARNING: "cont_mgau.c", line 767: 24 densities removed (3 mixtures removed entirely)
    INFO: cont_mgau.c(783): Applying variance floor
    INFO: cont_mgau.c(801): 0 variance values floored
    INFO: cont_mgau.c(849): Precomputing Mahalanobis distance invariants
    INFO: tmat.c(169): Reading HMM transition probability matrices: /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/transition_matrices
    WARNING: "tmat.c", line 242: Normalization failed for tmat 2 from state 0
    WARNING: "tmat.c", line 242: Normalization failed for tmat 2 from state 1
    WARNING: "tmat.c", line 242: Normalization failed for tmat 2 from state 2
    INFO:   Initialization of tmat_t, report:
    INFO:   Read 48 transition matrices of size 3x4
    INFO:   
    INFO: dict.c(475): Reading main dictionary: an4.dic
    ERROR: "dict.c", line 263: Line 2: Bad ciphone: AX; word A(2) ignored
    ERROR: "dict.c", line 263: Line 5: Bad ciphone: AX; word AAW ignored
    ERROR: "dict.c", line 263: Line 6: Bad ciphone: AXR; word ABERDEEN ignored
    ERROR: "dict.c", line 263: Line 7: Bad ciphone: AX; word ABOARD ignored
    ERROR: "dict.c", line 263: Line 8: Bad ciphone: AX; word ABOVE ignored
    ERROR: "dict.c", line 263: Line 10: Bad ciphone: DX; word ADDED ignored
    ERROR: "dict.c", line 263: Line 11: Bad ciphone: DX; word ADDING ignored
    ERROR: "dict.c", line 263: Line 12: Bad ciphone: AX; word AFFECT ignored
    ERROR: "dict.c", line 263: Line 13: Bad ciphone: AXR; word AFTER ignored
    ERROR: "dict.c", line 263: Line 14: Bad ciphone: AX; word AGAIN ignored
    ERROR: "dict.c", line 263: Line 16: Bad ciphone: IX; word AJAX'S ignored
    ERROR: "dict.c", line 263: Line 17: Bad ciphone: AX; word ALASKA ignored
    ERROR: "dict.c", line 263: Line 18: Bad ciphone: AX; word ALERT ignored
    ERROR: "dict.c", line 263: Line 19: Bad ciphone: AX; word ALERTS ignored
    ERROR: "dict.c", line 263: Line 20: Bad ciphone: IX; word ALEXANDRIA ignored
    ERROR: "dict.c", line 263: Line 25: Bad ciphone: AX; word AN(4) ignored
    ERROR: "dict.c", line 263: Line 26: Bad ciphone: AXR; word ANCHORAGE ignored
    ERROR: "dict.c", line 263: Line 27: Bad ciphone: AX; word AND ignored
    ERROR: "dict.c", line 263: Line 29: Bad ciphone: DX; word ANYBODY ignored
    ERROR: "dict.c", line 263: Line 30: Bad ciphone: DX; word ANYBODY(2) ignored
    ERROR: "dict.c", line 263: Line 31: Bad ciphone: AX; word APALACHICOLA ignored
    ERROR: "dict.c", line 263: Line 32: Bad ciphone: AX; word APALACHICOLA'S ignored
    ERROR: "dict.c", line 263: Line 33: Bad ciphone: AX; word APRIL ignored
    ERROR: "dict.c", line 263: Line 34: Bad ciphone: AXR; word ARABIAN ignored
    ERROR: "dict.c", line 263: Line 35: Bad ciphone: DX; word ARCTIC ignored
    ERROR: "dict.c", line 263: Line 36: Bad ciphone: IX; word ARCTIC(2) ignored
    ERROR: "dict.c", line 263: Line 38: Bad ciphone: AXR; word ARE(2) ignored
    ERROR: "dict.c", line 263: Line 39: Bad ciphone: AX; word AREA ignored
    ERROR: "dict.c", line 263: Line 40: Bad ciphone: AX; word AREAS ignored
    ERROR: "dict.c", line 263: Line 41: Bad ciphone: AX; word AREN'T ignored
    FATAL_ERROR: "dict.c", line 208: Missing base word for: AREN'T(2)
    
     
    • Nickolay V. Shmyrev

      The dictionary should match the acoustic model and contain all reference words. Your dictionary does not match the model, and the word AREN'T is missing.
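
A quick way to see this kind of mismatch before running the aligner is to compare the phones used in the dictionary with the phone set of the acoustic model. The following is only a minimal sketch: the function name `check_dictionary` is hypothetical, the phone set and dictionary lines are toy data (not the real hub4 phone list), and it assumes one entry per line in the form `WORD PHONE PHONE ...` with alternate pronunciations written as `WORD(2)`.

```python
import re

# Alternate pronunciations are assumed to look like WORD(2), WORD(3), ...
ALT = re.compile(r"^(.+)\((\d+)\)$")

def check_dictionary(dict_lines, model_phones):
    """Return (bad_phone, orphans).

    bad_phone: {word: [phones not present in model_phones]}
    orphans:   alternate entries such as WORD(2) whose own phones are fine
               but whose base word WORD is absent or unusable.
    """
    usable = set()
    bad_phone = {}
    for line in dict_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, phones = parts[0], parts[1:]
        unknown = [p for p in phones if p not in model_phones]
        if unknown:
            bad_phone[word] = unknown
        else:
            usable.add(word)
    orphans = []
    for word in sorted(usable):
        m = ALT.match(word)
        if m and m.group(1) not in usable:
            orphans.append(word)
    return bad_phone, orphans

# Toy data for illustration only (hypothetical phone set and entries).
model_phones = {"AA", "AE", "N", "R", "T"}
dict_lines = [
    "AREN'T AA R AX N T",
    "AREN'T(2) AA R N T",
    "AT AE T",
]
bad_phone, orphans = check_dictionary(dict_lines, model_phones)
print(bad_phone)
print(orphans)
```

With the toy data, the base entry is rejected because of a phone the model does not have, which leaves its alternate pronunciation without a base word. That is the same pattern as the "Bad ciphone" errors followed by "Missing base word" in the log above.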

       
  • Diwakar.G

    Diwakar.G - 2016-12-07

    When I tried the DR2 folder of the TIMIT database, I got the following error:

    INFO: feat.c(1205): At directory /home/sitecsp/DR2/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR2/an4/feat/an4_clstk/DR2/MZMB0/MZMB0-SI1796.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 46.46  1.78 -7.01 -4.41 -4.52  3.84 -22.28 -0.83 -0.66 -3.31  2.51 -2.98  0.02 
    INFO: main_align.c(1009): MZMB0-SI1796: 190 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MZMB0-SI1796
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   190 frames,    0.02 sec CPU,   0.01 xRT;    0.02 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MZMB0-SI1796:    0.0 sec CPU,    0.0 sec Clk;  TOT:      7.8 sec CPU,      7.8 sec Clk
    
    ERROR: "main_align.c", line 974: Uttid mismatch: ctlfile = "MZMB0-SI536"; transcript = "MZMB0-SI1796"
    INFO: feat.c(1205): At directory /home/sitecsp/DR2/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR2/an4/feat/an4_clstk/DR2/MZMB0/MZMB0-SI536.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 40.52  1.26 -7.57  0.84 -9.61  0.68 -18.57  3.43 -1.65 -6.09  7.16 -0.13  0.87 
    INFO: main_align.c(1009): MZMB0-SI536: 354 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MZMB0-SI536
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   354 frames,    0.03 sec CPU,   0.01 xRT;    0.03 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MZMB0-SI536:    0.0 sec CPU,    0.0 sec Clk;  TOT:      7.8 sec CPU,      7.8 sec Clk
    
    ERROR: "main_align.c", line 974: Uttid mismatch: ctlfile = "MZMB0-SX176"; transcript = "MZMB0-SI536"
    INFO: feat.c(1205): At directory /home/sitecsp/DR2/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR2/an4/feat/an4_clstk/DR2/MZMB0/MZMB0-SX176.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 41.94  1.52 -5.41 -3.23 -7.75  1.32 -17.09 -1.08  5.15 -6.77  5.87 -5.76  4.13 
    INFO: main_align.c(1009): MZMB0-SX176: 367 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MZMB0-SX176
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   367 frames,    0.04 sec CPU,   0.01 xRT;    0.03 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MZMB0-SX176:    0.0 sec CPU,    0.0 sec Clk;  TOT:      7.8 sec CPU,      7.9 sec Clk
    
    ERROR: "main_align.c", line 974: Uttid mismatch: ctlfile = "MZMB0-SX266"; transcript = "MZMB0-SX176"
    INFO: feat.c(1205): At directory /home/sitecsp/DR2/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR2/an4/feat/an4_clstk/DR2/MZMB0/MZMB0-SX266.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 40.42  0.12 -3.11  6.92 -10.41 -5.97 -14.89  3.30  4.63 -5.77  5.67 -1.00  4.92 
    INFO: main_align.c(1009): MZMB0-SX266: 266 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MZMB0-SX266
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   266 frames,    0.02 sec CPU,   0.01 xRT;    0.02 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MZMB0-SX266:    0.0 sec CPU,    0.0 sec Clk;  TOT:      7.9 sec CPU,      7.9 sec Clk
    
    ERROR: "main_align.c", line 974: Uttid mismatch: ctlfile = "MZMB0-SX356"; transcript = "MZMB0-SX266"
    INFO: feat.c(1205): At directory /home/sitecsp/DR2/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR2/an4/feat/an4_clstk/DR2/MZMB0/MZMB0-SX356.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 37.97 -0.21 -7.37 -2.20 -7.97  1.87 -16.89  3.42  0.30 -4.44  6.58 -3.76  3.35 
    INFO: main_align.c(1009): MZMB0-SX356: 315 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MZMB0-SX356
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   315 frames,    0.03 sec CPU,   0.01 xRT;    0.03 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MZMB0-SX356:    0.0 sec CPU,    0.0 sec Clk;  TOT:      7.9 sec CPU,      7.9 sec Clk
    
    ERROR: "main_align.c", line 974: Uttid mismatch: ctlfile = "MZMB0-SX446"; transcript = "MZMB0-SX356"
    INFO: feat.c(1205): At directory /home/sitecsp/DR2/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR2/an4/feat/an4_clstk/DR2/MZMB0/MZMB0-SX446.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 41.12  1.51 -4.42 -11.68 -17.91  3.99 -13.75  8.52 -6.43  2.67  5.88 -0.78  0.31 
    INFO: main_align.c(1009): MZMB0-SX446: 274 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MZMB0-SX446
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   274 frames,    0.02 sec CPU,   0.01 xRT;    0.02 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MZMB0-SX446:    0.0 sec CPU,    0.0 sec Clk;  TOT:      7.9 sec CPU,      7.9 sec Clk
    
    ERROR: "main_align.c", line 974: Uttid mismatch: ctlfile = "MZMB0-SX86"; transcript = "MZMB0-SX446"
    INFO: feat.c(1205): At directory /home/sitecsp/DR2/an4/feat
    INFO: feat.c(1022): Reading mfc file: '/home/sitecsp/DR2/an4/feat/an4_clstk/DR2/MZMB0/MZMB0-SX86.mfc'[0..-1]
    INFO: cmn.c(175): CMN: 39.14 -0.07 -6.72 -6.75 -12.41 -4.73 -16.51  3.99 -0.82 -3.01  4.67 -3.06  0.09 
    INFO: main_align.c(1009): MZMB0-SX86: 223 input frames
    
    ERROR: "main_align.c", line 891: Final state not reached; no alignment for MZMB0-SX86
    
        0.00x U    0.00x G    0.00x S    0.00x AEXECTIME:   223 frames,    0.02 sec CPU,   0.01 xRT;    0.02 sec elapsed,   0.01 xRT
    INFO: corpus.c(661): MZMB0-SX86:    0.0 sec CPU,    0.0 sec Clk;  TOT:      7.9 sec CPU,      8.0 sec Clk
    
    TOTAL FRAMES:          86595
    TOTAL CPU TIME:            7.91 sec,    0.01 xRT
    TOTAL ELAPSED TIME:        7.90 sec,    0.01 xRT
    sitecsp   3013  0.0  0.0   4448   684 pts/0    S+   16:20   0:00 sh -c ps aguxwww | grep s3align
    sitecsp   3015  0.0  0.0  15944  2264 pts/0    S+   16:20   0:00 grep s3align
    

    Please help me.

     
    • Nickolay V. Shmyrev

      This means you made a mistake extracting the features.
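
Apart from the features, the log above also reports "Uttid mismatch" between the control file and the transcript, with the transcript one utterance behind. A small check like the one below can show where the two files go out of step. This is a sketch under assumptions: the function name `find_uttid_mismatches` is hypothetical, each control-file entry is assumed to be a path whose last component is the utterance ID, and each transcript line is assumed to end with the utterance ID in parentheses. The sample lines are made up for illustration.

```python
import os
import re

# Transcript lines are assumed to end with "(uttid)".
UTTID = re.compile(r"\(([^()\s]+)\)\s*$")

def find_uttid_mismatches(ctl_lines, transcript_lines):
    """Compare the two files line by line.

    Returns a list of (line_number, ctl_id, transcript_id) for every
    position where the utterance IDs differ.
    """
    mismatches = []
    pairs = zip(ctl_lines, transcript_lines)
    for lineno, (ctl, trans) in enumerate(pairs, start=1):
        ctl_id = os.path.basename(ctl.strip())
        m = UTTID.search(trans)
        trans_id = m.group(1) if m else None
        if ctl_id != trans_id:
            mismatches.append((lineno, ctl_id, trans_id))
    return mismatches

# Illustrative data only; the transcript text is a placeholder.
ctl_lines = [
    "an4_clstk/DR2/MZMB0/MZMB0-SI1796",
    "an4_clstk/DR2/MZMB0/MZMB0-SI536",
]
transcript_ok = [
    "FIRST PLACEHOLDER SENTENCE (MZMB0-SI1796)",
    "SECOND PLACEHOLDER SENTENCE (MZMB0-SI536)",
]
transcript_shifted = list(reversed(transcript_ok))

print(find_uttid_mismatches(ctl_lines, transcript_ok))
print(find_uttid_mismatches(ctl_lines, transcript_shifted))
```

It is also worth comparing `len(ctl_lines)` with `len(transcript_lines)`, since `zip` stops at the shorter file and a missing line is a common cause of this kind of shift.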

       
  • Diwakar.G

    Diwakar.G - 2016-12-07

    Sir, for every database I am giving the same HMM directory path (i.e. sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd). Is this the right path?

     
  • Diwakar.G

    Diwakar.G - 2016-12-07

    Sir, how does it give frame numbers without any training? Can the same means and variances in that HMM model be used to segment any words, and how does it actually produce the frame information? Is there any material that explains this clearly?
    At present I have to convert the frame numbers manually to get sample numbers or time information. Is there any option in sphinx3_align to get time information directly?
    Thank you
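
The frame-to-time conversion asked about here is plain arithmetic once the frame rate and sampling rate are known. The sketch below assumes 100 frames per second (a 10 ms frame shift) and 16000 Hz audio; both values are assumptions and should be replaced with whatever was actually used for feature extraction. The function names are hypothetical helpers, not part of sphinx3_align.

```python
def frame_to_time(frame, frame_rate=100):
    """Convert a frame index to seconds (assumes frame_rate frames/sec)."""
    return frame / frame_rate

def frame_to_sample(frame, frame_rate=100, sample_rate=16000):
    """Convert a frame index to a sample index at the given rates."""
    return frame * sample_rate // frame_rate

def segment_to_samples(start_frame, end_frame, frame_rate=100, sample_rate=16000):
    """Convert an inclusive [start_frame, end_frame] segment to
    a (start_sample, end_sample) pair, where end_sample is the first
    sample after the segment."""
    start = frame_to_sample(start_frame, frame_rate, sample_rate)
    end = frame_to_sample(end_frame + 1, frame_rate, sample_rate)
    return start, end

# Example: the 190-frame utterance from the log, under the assumed rates.
print(frame_to_time(190))          # seconds
print(frame_to_sample(190))        # sample index
print(segment_to_samples(0, 189))  # whole utterance as a sample range
```

Under these assumptions one frame covers 160 samples, so word boundaries obtained this way are only accurate to about 10 ms.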

     
