
Training Completed with Errors Reported in Modules 20, 30 and 50

Help
Izza
2016-06-18
2016-06-24
  • Izza

    Izza - 2016-06-18

    Hi,

    I carried out an acoustic model training run with about 0.5 hours of recordings. I intentionally disabled the recording length check in verify_all.pl, just to see whether the training would complete. The training log is at [1].

    Even with the errors reported, can I assume the training has completed? Or do I need to fix all the errors reported in the intermediate steps? I tried with forced alignment set to both 'true' and 'false', and got similar errors in the intermediate steps, as shown at [1].

    I have attached the log directory for the run.

    Thank you!

    [1].
    MODULE: 000 Computing feature from audio files
    Extracting features from segments starting at (part 1 of 1)
    Extracting features from segments starting at (part 1 of 1)
    Feature extraction is done
    MODULE: 00 verify training files
    Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
    Found 1075 words using 41 phones
    Phase 2: Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
    Phase 4: Checking number of lines in the transcript file should match lines in fileids file
    Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
    Estimated Total Hours Training: 0.413611111111111
    This is a small amount of data, no comment at this time
    Phase 6: Checking that all the words in the transcript are in the dictionary
    Words in dictionary: 1072
    Words in filler dictionary: 3
    Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 02 Train MLLT transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    MODULE: 10 Training Context Independent models for forced alignment and VTLN
    Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
    Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 11 Force-aligning transcripts
    Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
    MODULE: 12 Force-aligning data for VTLN
    Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 20 Training Context Independent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -172.23200701764
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -171.572867696441
    Convergence Ratio = 0.659139321199433
    Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 2 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -169.28253007552
    Convergence Ratio = 2.29033762092141
    Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 2 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -167.596332263351
    Convergence Ratio = 1.6861978121689
    Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 2 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -167.060806914471
    Convergence Ratio = 0.535525348879588
    Baum welch starting for 1 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 6 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -166.904653126774
    Convergence Ratio = 0.156153787697178
    Baum welch starting for 1 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 8 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -166.851970970313
    Training completed after 7 iterations
    MODULE: 30 Training Context Dependent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...
    Phase 2: Initialization
    Phase 3: Forward-Backward
    Baum welch starting for iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 8 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -166.822955266875
    Baum welch starting for iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 12 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -160.506480899903
    Convergence Ratio = 6.31647436697207
    Baum welch starting for iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 16 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -156.50110117702
    Convergence Ratio = 4.00537972288333
    Baum welch starting for iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 20 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -155.370894888269
    Convergence Ratio = 1.13020628875057
    Baum welch starting for iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 20 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -155.048150743131
    Convergence Ratio = 0.322744145137563
    Baum welch starting for iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 22 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -154.875081910414
    Convergence Ratio = 0.173068832716524
    Baum welch starting for iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 24 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -154.779529358008
    Training completed after 7 iterations
    MODULE: 40 Build Trees
    Phase 1: Cleaning up old log files...
    Phase 2: Make Questions
    Phase 3: Tree building
    Processing each phone with each state
    A 0
    A 1
    A 2
    AA 0
    AA 1
    AA 2
    AE 0
    AE 1
    AE 2
    AN 0
    AN 1
    AN 2
    B 0
    B 1
    B 2
    BH 0
    BH 1
    BH 2
    CH 0
    CH 1
    CH 2
    D 0
    D 1
    D 2
    DH 0
    DH 1
    DH 2
    E 0
    E 1
    E 2
    EA 0
    EA 1
    EA 2
    EE 0
    EE 1
    EE 2
    G 0
    G 1
    G 2
    GH 0
    GH 1
    GH 2
    GN 0
    GN 1
    GN 2
    H 0
    H 1
    H 2
    I 0
    I 1
    I 2
    II 0
    II 1
    II 2
    J 0
    J 1
    J 2
    K 0
    K 1
    K 2
    KH 0
    KH 1
    KH 2
    KN 0
    KN 1
    KN 2
    L 0
    L 1
    L 2
    M 0
    M 1
    M 2
    N 0
    N 1
    N 2
    NNDH 0
    NNDH 1
    NNDH 2
    NNG 0
    NNG 1
    NNG 2
    O 0
    O 1
    O 2
    OE 0
    OE 1
    OE 2
    OO 0
    OO 1
    OO 2
    P 0
    P 1
    P 2
    R 0
    R 1
    R 2
    RU 0
    RU 1
    RU 2
    S 0
    S 1
    S 2
    SH 0
    SH 1
    SH 2
    T 0
    T 1
    T 2
    TH 0
    TH 1
    TH 2
    U 0
    U 1
    U 2
    V 0
    V 1
    V 2
    Y 0
    Y 1
    Y 2
    Skipping SIL
    MODULE: 45 Prune Trees
    Phase 1: Tree Pruning
    Phase 2: State Tying
    MODULE: 50 Training Context dependent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...
    Phase 2: Copy CI to CD initialize
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 8 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -166.822955266875
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 8 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -165.739934445477
    Convergence Ratio = 1.08302082139778
    Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 10 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -165.577558688099
    Convergence Ratio = 0.16237575737847
    Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 10 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -165.511666996243
    Split Gaussians, increase by 1
    Current Overall Likelihood Per Frame = -165.511666996243
    Convergence Ratio = 0.0658916918561658
    Baum welch starting for 2 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 12 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -165.982857183639
    Baum welch starting for 2 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 10 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -165.344273412499
    Convergence Ratio = 0.638583771139622
    Baum welch starting for 2 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 10 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -164.949808913598
    Convergence Ratio = 0.394464498901129
    Baum welch starting for 2 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 10 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -164.515926221575
    Convergence Ratio = 0.433882692022536
    Baum welch starting for 2 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 10 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -164.282737233238
    Convergence Ratio = 0.233188988336565
    Baum welch starting for 2 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 10 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -164.160032212717
    Convergence Ratio = 0.122705020521039
    Baum welch starting for 2 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -164.133910589267
    Split Gaussians, increase by 2
    Current Overall Likelihood Per Frame = -164.133910589267
    Convergence Ratio = 0.0261216234501092
    Baum welch starting for 4 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -164.575278466618
    Baum welch starting for 4 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -163.898601159079
    Convergence Ratio = 0.676677307539222
    Baum welch starting for 4 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -163.531765060489
    Convergence Ratio = 0.366836098590426
    Baum welch starting for 4 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -163.135472468454
    Convergence Ratio = 0.396292592034854
    Baum welch starting for 4 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -162.892833166642
    Convergence Ratio = 0.24263930181246
    Baum welch starting for 4 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -162.748290838345
    Convergence Ratio = 0.144542328296495
    Baum welch starting for 4 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -162.65875679897
    Split Gaussians, increase by 4
    Current Overall Likelihood Per Frame = -162.65875679897
    Convergence Ratio = 0.0895340393753088
    Baum welch starting for 8 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -163.101152228418
    Baum welch starting for 8 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -162.383715354368
    Convergence Ratio = 0.717436874049525
    Baum welch starting for 8 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -161.974270095494
    Convergence Ratio = 0.409445258874172
    Baum welch starting for 8 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -161.583320774363
    Convergence Ratio = 0.390949321130734
    Baum welch starting for 8 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -161.353080601186
    Convergence Ratio = 0.230240173176497
    Baum welch starting for 8 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -161.224431078656
    Convergence Ratio = 0.12864952253031
    Baum welch starting for 8 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -161.139486771979
    Training for 8 Gaussian(s) completed after 7 iterations
    MODULE: 60 Lattice Generation
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 61 Lattice Pruning
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 62 Lattice Format Conversion
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 65 MMIE Training
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 90 deleted interpolation
    Skipped for continuous models
    MODULE: DECODE Decoding using models previously trained
    Decoding 240 segments starting at 0 (part 1 of 1)
    0%
    Aligning results to find error rate
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training

     

    Last edit: Izza 2016-06-18
  • Izza

    Izza - 2016-06-20

    Hi,

    Any input/feedback is much appreciated!

     
    • Nickolay V. Shmyrev

      This issue is covered in the troubleshooting section of the acoustic model training tutorial.

      ERROR: "backward.c", line 430: Failed to align audio to transcript: final state of the search is not reached.

      Sometimes the audio in your database doesn't match the transcription properly. For example, the transcription file has the line “Hello world” but in the audio “Hello hello world” is actually pronounced. The training process usually detects that and emits this message in the logs. If there are too many such errors, it most likely means you misconfigured something; for example, you had a mismatch between the audio and the text caused by transcription reordering. If there are few errors, you can ignore them. You might want to edit the transcription file to put there the exact words which were pronounced; in the case above, you need to edit the transcription file and put “Hello hello world” on the corresponding line. You might want to filter such prompts because they affect acoustic model quality. In that case you need to enable the forced alignment stage in training. To do that, edit the sphinx_train.cfg line

      $CFG_FORCEDALIGN = 'yes';
      and run the training again. It will execute stages 10 and 11 and filter your database.
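To judge whether there are "few" or "too many" such errors, it can help to total the per-step error counts that the console log prints for each module. The following is a minimal sketch, assuming the console output has been saved as text in the same format as the log quoted in this thread; the function name `tally_errors` and the `SAMPLE` excerpt are illustrative and not part of SphinxTrain.

```python
import re
from collections import defaultdict

# Patterns taken from the console output quoted in this thread.
MODULE_RE = re.compile(r"^\s*MODULE:\s*(\S+)")
ERROR_RE = re.compile(r"ERROR: This step had (\d+) ERROR messages")


def tally_errors(log_text):
    """Return {module_id: (steps_with_errors, total_error_messages)}."""
    totals = defaultdict(lambda: [0, 0])
    current = None  # module the current line belongs to
    for line in log_text.splitlines():
        module = MODULE_RE.match(line)
        if module:
            current = module.group(1)
            continue
        error = ERROR_RE.search(line)
        if error and current is not None:
            totals[current][0] += 1
            totals[current][1] += int(error.group(1))
    return {key: tuple(value) for key, value in totals.items()}


# Shortened excerpt in the same shape as the log above (illustrative only).
SAMPLE = """\
MODULE: 20 Training Context Independent models
Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
ERROR: This step had 2 ERROR messages and 0 WARNING messages. Please check the log file for details.
MODULE: 40 Build Trees
"""

if __name__ == "__main__":
    for module_id, (steps, messages) in tally_errors(SAMPLE).items():
        print(f"module {module_id}: {steps} step(s) with errors, {messages} message(s)")
```

The totals only say how often a step reported errors; the detailed messages still have to be read from the per-step log files.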

       

      Last edit: Nickolay V. Shmyrev 2016-06-20
  • Izza

    Izza - 2016-06-22

    Hi Nickolay,

    Thanks for the input.

    Actually, I tried forced alignment as well, as I mentioned above. But it still gives me the "Final state not reached; no alignment for xxx" error [1]. Can forced alignment fail for some reason?

    [1]. "main_align.c", line 762: Final state not reached; no alignment for xxx

    I have attached the logdir for the forced alignment training run.

    Thank you!
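The utterances named in messages like [1] can be collected into a list, so that each one can be checked by listening to its audio. This is a rough sketch that relies only on the message text quoted above; the assumption that the log files end in `.log` under the log directory, and the names `failed_utterances` and `scan_logdir`, are hypothetical.

```python
import re
from pathlib import Path

# Message text as quoted in [1]; the utterance id follows "no alignment for".
NO_ALIGN_RE = re.compile(r"Final state not reached; no alignment for (\S+)")


def failed_utterances(log_text):
    """Return utterance ids that failed alignment, in first-seen order."""
    seen = []
    for match in NO_ALIGN_RE.finditer(log_text):
        utt_id = match.group(1)
        if utt_id not in seen:
            seen.append(utt_id)
    return seen


def scan_logdir(logdir):
    """Scan every *.log file under logdir (assumed layout, adjust as needed)."""
    failed = []
    for path in sorted(Path(logdir).rglob("*.log")):
        text = path.read_text(errors="replace")
        for utt_id in failed_utterances(text):
            if utt_id not in failed:
                failed.append(utt_id)
    return failed


# Illustrative log lines; the utterance ids are made up.
SAMPLE_LOG = """\
ERROR: "main_align.c", line 762: Final state not reached; no alignment for utt_007
INFO: some unrelated line
ERROR: "main_align.c", line 762: Final state not reached; no alignment for utt_012
ERROR: "main_align.c", line 762: Final state not reached; no alignment for utt_007
"""

if __name__ == "__main__":
    print(failed_utterances(SAMPLE_LOG))
```

If nearly every utterance appears in the list, that points to a systematic mismatch rather than a few bad prompts.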

     
    • Nickolay V. Shmyrev

      Can forced alignment fail for some reason?

      The reason is provided in the tutorial:

      you had a mismatch between the audio and the text caused by transcription reordering. You might want to edit the transcription file to put there the exact words which were pronounced; in the case above, you need to edit the transcription file and put “Hello hello world” on the corresponding line
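A reordering mismatch of this kind can be checked mechanically, since Phase 4 of the verification step in the log only compares line counts. The sketch below assumes the common layout where each transcription line ends with the utterance id in parentheses and each fileids line is a path whose last component is that same id; both the layout and the name `find_order_mismatches` are assumptions to adapt to the actual database.

```python
import re

# Assumed transcription layout: "<s> words </s> (utterance_id)".
UTT_ID_RE = re.compile(r"\(([^()\s]+)\)\s*$")


def find_order_mismatches(fileids_lines, transcript_lines):
    """Return (line_number, id_from_fileids, id_from_transcript) per mismatch."""
    mismatches = []
    pairs = zip(fileids_lines, transcript_lines)
    for line_number, (fileid, transcript) in enumerate(pairs, start=1):
        # Last path component of the fileids entry, e.g. "spk1/utt_001" -> "utt_001".
        expected = fileid.strip().replace("\\", "/").rsplit("/", 1)[-1]
        match = UTT_ID_RE.search(transcript)
        found = match.group(1) if match else None
        if found != expected:
            mismatches.append((line_number, expected, found))
    return mismatches


# Made-up data: lines 2 and 3 of the transcription are swapped.
FILEIDS = ["spk1/utt_001", "spk1/utt_002", "spk1/utt_003"]
TRANSCRIPTS = [
    "<s> hello world </s> (utt_001)",
    "<s> good morning </s> (utt_003)",
    "<s> hello hello world </s> (utt_002)",
]

if __name__ == "__main__":
    for line_number, expected, found in find_order_mismatches(FILEIDS, TRANSCRIPTS):
        print(f"line {line_number}: fileids has {expected}, transcription has {found}")
```

A clean result only shows that the ids line up; it cannot tell whether the words in a transcription line are the ones actually spoken.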

       
  • Izza

    Izza - 2016-06-23

    Thank you. But I went through the training wav files several times, making sure the utterances match the audio. I will try with only a couple of sentences to start with and check whether I can build from there.

     
    • Nickolay V. Shmyrev

      I trust the computer's judgment that the utterance audio does not match the text more than I trust yours.

      If you want more detailed help on this issue, you can provide the model training folder.

       
