If the following information is there, we can directly do forced alignment, i.e., samples 2210 to 5080 correspond to 'she' and samples 0 to 2209 correspond to 'SIL'.
2210 5080 she
5080 9370 had
9370 10760 your
10760 15840 dark
15840 19258 suit
19258 21360 in
21360 27864 greasy
27864 34464 wash
34464 38642 water
39477 43180 all
43180 48569 year
After training and testing, the results we get look like this:
she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
Words: 11 Correct: 11 Errors: 0 Percent correct = 100.00% Error = 0.00% Accuracy = 100.00%
How can this be a speech-to-text alignment? Is there any way to get sample-number or timing information?
Please clarify this for me.
I have seen that the configuration file has an option for forced alignment; initially it was set to 'no', but I have now set it to 'yes'. However, I am getting the following error. Please help me.
```
sitecsp@acl-pg-06:~/DYSARTHRIC/an4$ sphinxtrain run
Sphinxtrain path: /usr/local/lib/sphinxtrain
Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
Running the training
MODULE: 000 Computing feature from audio files
Extracting features from segments starting at (part 1 of 1)
Extracting features from segments starting at (part 1 of 1)
Feature extraction is done
MODULE: 00 verify training files
Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
Found 30 words using 25 phones
Phase 2: Checking to make sure there are not duplicate entries in the dictionary
Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
Phase 4: Checking number of lines in the transcript file should match lines in fileids file
Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
Estimated Total Hours Training: 0.647533333333333
This is a small amount of data, no comment at this time
Phase 6: Checking that all the words in the transcript are in the dictionary
Words in dictionary: 27
Words in filler dictionary: 3
Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
MODULE: 0000 train grapheme-to-phoneme model
Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
MODULE: 01 Train LDA transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 02 Train MLLT transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 05 Vector Quantization
Skipped for continuous models
MODULE: 10 Training Context Independent models for forced alignment and VTLN
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...models...
Phase 2: Flat initialize
Phase 3: Forward-Backward
Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Normalization for iteration: 1
Current Overall Likelihood Per Frame = -161.308512646282
Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Normalization for iteration: 2
Current Overall Likelihood Per Frame = -158.718598785133
Convergence Ratio = 2.58991386114866
Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Normalization for iteration: 3
Current Overall Likelihood Per Frame = -155.527943649405
Convergence Ratio = 3.19065513572841
Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Normalization for iteration: 4
Current Overall Likelihood Per Frame = -153.912110916641
Convergence Ratio = 1.61583273276406
Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Normalization for iteration: 5
Current Overall Likelihood Per Frame = -153.477470057312
Convergence Ratio = 0.434640859329505
Baum welch starting for 1 Gaussian(s), iteration: 6 (1 of 1)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Normalization for iteration: 6
Current Overall Likelihood Per Frame = -153.290349703147
Convergence Ratio = 0.187120354165017
Baum welch starting for 1 Gaussian(s), iteration: 7 (1 of 1)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Normalization for iteration: 7
Current Overall Likelihood Per Frame = -153.185850578263
Convergence Ratio = 0.1044991248842
Baum welch starting for 1 Gaussian(s), iteration: 8 (1 of 1)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Normalization for iteration: 8
Current Overall Likelihood Per Frame = -153.134587666015
Training completed after 8 iterations
MODULE: 11 Force-aligning transcripts
Skipped: No sphinx3_align(.exe) found in /usr/local/libexec/sphinxtrain
If you wish to do force-alignment, please copy or link the
sphinx3_align binary from Sphinx 3 to /usr/local/libexec/sphinxtrain
and either define $CFG_MODEL_DIR in sphinx_train.cfg or
run context-independent training first.
```
After that, I made the following changes in the configuration file:
# (yes/no) Train multiple-gaussian context-independent models (useful
# for alignment, use 'no' otherwise) in the models created
# specifically for forced alignment
$CFG_FALIGN_CI_MGAU = 'yes';
# (yes/no) Train multiple-gaussian context-independent models (useful
# for alignment, use 'no' otherwise)
$CFG_CI_MGAU = 'yes';
# (yes/no) Train context-dependent models
$CFG_CD_TRAIN = 'yes';
# Number of tied states (senones) to create in decision-tree clustering
$CFG_N_TIED_STATES = 200;
# How many parts to run Forward-Backward estimatinon in
$CFG_NPART = 1;
# (yes/no) Train a single decision tree for all phones (actually one
# per state) (useful for grapheme-based models, use 'no' otherwise)
$CFG_CROSS_PHONE_TREES = 'no';
# Use force-aligned transcripts (if available) as input to training
$CFG_FORCEDALIGN = 'yes';
Even then, I am getting the same error. Please help me. Also, what would the output of forced alignment look like?
It seems self-explanatory: you are missing the sphinx3_align tool that you want to use for forced alignment.
You should try to download and install https://github.com/skerit/cmusphinx/tree/master/sphinx3
Sir, how can I download it?
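For reference, one way to fetch and build it from that repository, assuming the usual autogen.sh/configure layout there and that sphinxbase is already installed (the link step is the one the sphinxtrain error message above asks for):

```
# Clone the mirror linked above and build sphinx3
git clone https://github.com/skerit/cmusphinx.git
cd cmusphinx/sphinx3
./autogen.sh && ./configure && make && sudo make install

# sphinxtrain looks for the binary here (path taken from the error message above)
sudo ln -s /usr/local/bin/sphinx3_align /usr/local/libexec/sphinxtrain/sphinx3_align
```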
Sir, from the forced alignment, should I get sample-number information?
In fact, I do not understand your question. Forced alignment is used when you do not have time information (you only have the sentence transcript).
In your example the times are explicitly written, so why do you need the alignment at all? Try to reformulate your question.
To infer speech-to-text alignment results, the sample numbers of the word boundaries (onsets and offsets) are required. But CMU Sphinx is not giving the alignment results in the form mentioned above; instead, it gives the following result, which is not very intuitive:
she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
she had your dark suit in greasy wash water all year (FAKS0-FAKS0-SA1)
I am unable to interpret this result.
Is there any way to get timing information from the alignment result?
OK, now it's clear: you seem to be speaking about the result/an4.align file. This is not the speech-to-audio alignment but the reference-to-hypothesis string alignment used to compute the error rate.
What you seem to need is just the time information from the decoder. In pocketsphinx this is achieved with the -ctm option; the CTM format gives you a time for each word.
Note that forced alignment is a totally different procedure. Forced alignment means you have a ground-truth transcript and your goal is to get the time information. You do not use the decoder for that, but the aligner.
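For the decoder route, a sketch with pocketsphinx_batch, assuming the usual an4 layout with placeholder model and file names (-ctm is the option named above; CTM output has roughly one line per word: file, channel, start time, duration, word):

```
# Decode a control list of utterances and dump per-word times in CTM format
pocketsphinx_batch \
    -hmm en-us \
    -lm an4.lm \
    -dict an4.dic \
    -ctl an4_test.fileids \
    -adcin yes -cepdir wav -cepext .wav \
    -ctm an4_test.ctm
```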
Yes sir, that is exactly what I require: for a given transcription, knowing which part of the audio corresponds to 'she', and so on. If I get timing information from the forced alignment, I can easily do this. Sir, can you please explain in detail how I can do this? Where should I set the ctm option? Please tell me.
The sphinx3_align tool has a -wdsegdir option, not a ctm option, to dump word times.
The command line is:
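A representative invocation (directory layout and model are placeholders; the flags are the ones sphinx3_align echoes in its own configuration dump, as in the log further down this thread):

```
# Align each control-file utterance against its transcript; per-utterance
# .wdseg files (word segmentations in frames) are written to -wdsegdir
sphinx3_align \
    -hmm model/hub4_cd_continuous_8gau_1s_c_d_dd \
    -dict an4.dic \
    -fdict an4.filler \
    -ctl an4_train.fileids \
    -insent an4_train.transcription \
    -cepdir feat \
    -phsegdir phseg \
    -wdsegdir wdseg
```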
Sir, I am currently using pocketsphinx, and sphinx3_align(.exe) is not there. But I have now used Sphinx 3 for the first time, and inside its build directory there is a sphinx3_align(.exe) file. Can I use that, or can you please provide a link from which I can download it?
Thanks in advance
Sir, I finally got it working.
But I am getting frame-wise information, i.e., RUBOUT corresponds to frames 26 to 95. Is there any way to get sample-number or timing information instead, e.g., that RUBOUT corresponds to samples 16000 to 36000, or 1.5 to 2.5 s, in the audio signal?
Thank you.
Sir, when I tried to run it on some other mfc files, I got the following error. Please help me.
sphinx3_align gives frame numbers only for the an4 database. When I try rm1 or some other database, it throws an error. Can somebody please help me?
Sure, as soon as you provide error details.
For the rm1 database, I am getting the following error:
sitecsp@acl-pg-06:~/Documents/ALIGN/an4$ sphinx3_align \
    -hmm /home/sitecsp/Documents/FORCE/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd \
    -dict an4.dic \
    -fdict an4.filler \
    -ctl an4_train.fileids \
    -insent an4_train.transcription \
    -cepdir /home/sitecsp/Documents/ALIGN/an4/feat \
    -phsegdir /home/sitecsp/Documents/ALIGN/an4/pdsegd \
    -wdsegdir /home/sitecsp/Documents/ALIGN/an4/wdsegd
INFO: info.c(65): Host: 'acl-pg-06'
INFO: info.c(69): Directory: '/home/sitecsp/Documents/ALIGN/an4'
INFO: info.c(73): sphinx3_align Compiled on: Dec 22 2013, AT: 15:13:45
[...]
ERROR: "cont_mgau.c", line 653: Weight normalization failed for 3 senones
WARNING: "cont_mgau.c", line 767: 24 densities removed (3 mixtures removed entirely)
WARNING: "tmat.c", line 242: Normalization failed for tmat 2 from state 0
WARNING: "tmat.c", line 242: Normalization failed for tmat 2 from state 1
WARNING: "tmat.c", line 242: Normalization failed for tmat 2 from state 2
INFO: dict.c(475): Reading main dictionary: an4.dic
ERROR: "dict.c", line 263: Line 2: Bad ciphone: AX; word A(2) ignored
ERROR: "dict.c", line 263: Line 5: Bad ciphone: AX; word AAW ignored
ERROR: "dict.c", line 263: Line 6: Bad ciphone: AXR; word ABERDEEN ignored
ERROR: "dict.c", line 263: Line 7: Bad ciphone: AX; word ABOARD ignored
ERROR: "dict.c", line 263: Line 8: Bad ciphone: AX; word ABOVE ignored
ERROR: "dict.c", line 263: Line 10: Bad ciphone: DX; word ADDED ignored
ERROR: "dict.c", line 263: Line 11: Bad ciphone: DX; word ADDING ignored
ERROR: "dict.c", line 263: Line 12: Bad ciphone: AX; word AFFECT ignored
ERROR: "dict.c", line 263: Line 13: Bad ciphone: AXR; word AFTER ignored
ERROR: "dict.c", line 263: Line 14: Bad ciphone: AX; word AGAIN ignored
ERROR: "dict.c", line 263: Line 16: Bad ciphone: IX; word AJAX'S ignored
ERROR: "dict.c", line 263: Line 17: Bad ciphone: AX; word ALASKA ignored
ERROR: "dict.c", line 263: Line 18: Bad ciphone: AX; word ALERT ignored
ERROR: "dict.c", line 263: Line 19: Bad ciphone: AX; word ALERTS ignored
ERROR: "dict.c", line 263: Line 20: Bad ciphone: IX; word ALEXANDRIA ignored
ERROR: "dict.c", line 263: Line 25: Bad ciphone: AX; word AN(4) ignored
ERROR: "dict.c", line 263: Line 26: Bad ciphone: AXR; word ANCHORAGE ignored
ERROR: "dict.c", line 263: Line 27: Bad ciphone: AX; word AND ignored
ERROR: "dict.c", line 263: Line 29: Bad ciphone: DX; word ANYBODY ignored
ERROR: "dict.c", line 263: Line 30: Bad ciphone: DX; word ANYBODY(2) ignored
ERROR: "dict.c", line 263: Line 31: Bad ciphone: AX; word APALACHICOLA ignored
ERROR: "dict.c", line 263: Line 32: Bad ciphone: AX; word APALACHICOLA'S ignored
ERROR: "dict.c", line 263: Line 33: Bad ciphone: AX; word APRIL ignored
ERROR: "dict.c", line 263: Line 34: Bad ciphone: AXR; word ARABIAN ignored
ERROR: "dict.c", line 263: Line 35: Bad ciphone: DX; word ARCTIC ignored
ERROR: "dict.c", line 263: Line 36: Bad ciphone: IX; word ARCTIC(2) ignored
ERROR: "dict.c", line 263: Line 38: Bad ciphone: AXR; word ARE(2) ignored
ERROR: "dict.c", line 263: Line 39: Bad ciphone: AX; word AREA ignored
ERROR: "dict.c", line 263: Line 40: Bad ciphone: AX; word AREAS ignored
ERROR: "dict.c", line 263: Line 41: Bad ciphone: AX; word AREN'T ignored
FATAL_ERROR: "dict.c", line 208: Missing base word for: AREN'T(2)
The dictionary should match the acoustic model and contain all reference words. Your dictionary mismatches the model's phone set, and the word AREN'T is missing.
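One rough way to see the mismatch, assuming the usual file formats (dictionary: one word per line followed by its phones; mdef: phone rows whose first column is the base phone) and hypothetical output file names:

```
# Phones used by the dictionary
awk '{for (i = 2; i <= NF; i++) print $i}' an4.dic | sort -u > dict_phones

# Base phones the acoustic model knows (mdef phone rows have many columns;
# the exact column layout may vary between model formats)
awk '$1 !~ /^#/ && NF >= 9 {print $1}' \
    hub4_cd_continuous_8gau_1s_c_d_dd/mdef | sort -u > model_phones

# Dictionary phones the model lacks (here: AX, AXR, DX, IX)
comm -23 dict_phones model_phones
```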
When I tried the TIMIT database (the DR2 folder), I got the following error.
Please help me.
This means you made a mistake extracting the features.
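If the features were extracted with the wrong parameters, recomputing them with sphinx_fe usually fixes this. A sketch under the assumption of 16 kHz TIMIT data (TIMIT ships NIST SPHERE files, hence -nist yes; file names and directories are placeholders):

```
# Recompute 13-dimensional MFCC features at 100 frames/s (sphinx_fe defaults)
# to match what the acoustic model expects
sphinx_fe \
    -c timit_train.fileids \
    -nist yes \
    -di wav -ei wav \
    -do feat -eo mfc \
    -samprate 16000
```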
Sir, what I have done is to give the same hmm directory path (i.e., sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd) for every database. Is this the right path?
Sir, how is it actually giving frame numbers without any training? In that HMM model, how can the same means and variances be used to segment any words? How does it actually produce the frame information? Is there any material to understand this clearly?
With this, I have to convert manually from frame numbers to sample numbers or time information. Is there any option in sphinx3_align to get time information directly?
Thank you
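For reference, the conversion itself is simple: at sphinx3_align's default frame rate of 100 frames per second (-frate 100), time in seconds is frame/100, and the sample index is frame * (sample_rate/100). A minimal sketch over a .wdseg file, assuming its usual four columns (start frame, end frame, acoustic score, word), 16 kHz audio, and a hypothetical file name:

```
# Frames 26-95 for RUBOUT become 0.26-0.95 s, i.e. samples 4160-15200 at 16 kHz
awk 'NR > 1 && NF == 4 {
    printf "%-12s %6.2f s %6.2f s %8d %8d\n", $4, $1/100, $2/100, $1*160, $2*160
}' wdsegd/an4_utt01.wdseg    # file name is a placeholder
```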