Acoustic Model with Forced Alignment Failing

Help
Izza
2016-06-11
2016-06-13
  • Izza

    Izza - 2016-06-11

    Hi,

    I'm trying to train an acoustic model with forced alignment. However, the training fails with the errors shown in [1]. I think forced alignment is failing for some reason. Any help on what is going wrong here is very much appreciated!

    I have attached the log directory as well.

    [1].
    MODULE: 000 Computing feature from audio files
    Extracting features from segments starting at (part 1 of 1)
    Extracting features from segments starting at (part 1 of 1)
    Feature extraction is done
    MODULE: 00 verify training files
    Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
    Found 2652 words using 202 phones
    Phase 2: Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
    Phase 4: Checking number of lines in the transcript file should match lines in fileids file
    Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
    Estimated Total Hours Training: 0.980125
    This is a small amount of data, no comment at this time
    Phase 6: Checking that all the words in the transcript are in the dictionary
    Words in dictionary: 2649
    Words in filler dictionary: 3
    Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 02 Train MLLT transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    MODULE: 10 Training Context Independent models for forced alignment and VTLN
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -172.660686703793
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -171.773356572999
    Convergence Ratio = 0.887330130793515
    Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -168.973713670308
    Convergence Ratio = 2.79964290269052
    Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -167.020164661537
    Convergence Ratio = 1.95354900877106
    Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 2 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -166.388843837051
    Convergence Ratio = 0.631320824486181
    Baum welch starting for 1 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -166.173955024339
    Convergence Ratio = 0.214888812712047
    Baum welch starting for 1 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -166.080741367096
    Training completed after 7 iterations
    MODULE: 11 Force-aligning transcripts
    Phase 1: Cleaning up directories:
    logs...output...qmanager...
    Phase 3: Creating dictionary for alignment...
    Phase 4: Creating transcript for alignment...
    Phase 5: Running force alignment in 1 parts
    Force alignment starting: (1 of 1)
    0%
    ERROR: This step had 25 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Failed in part 1
    MODULE: 12 Force-aligning data for VTLN
    Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 20 Training Context Independent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Copy initialize from falign model
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0%
    ERROR: This step had 1 ERROR messages and 0 WARNING messages. Please check the log file for details.
    ERROR: Training failed in iteration 1
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training

     

    Last edit: Izza 2016-06-11
    • Nickolay V. Shmyrev

      Forced alignment failed on utterance 207, you need to review and probably remove that utterance from training.
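A minimal sketch of removing an utterance from the training set, assuming the usual SphinxTrain layout where the fileids file and the transcription file are parallel (one utterance per line, each transcript line ending in `(utterance_id)`). The function name `drop_utterances` and the sample IDs are hypothetical. Whether "utterance 207" refers to an ID or to a line position should be checked against the log first.

```python
import re

# Trailing "(utterance_id)" that closes each line of a Sphinx-style transcription file.
_UTT_ID = re.compile(r"\(([^()\s]+)\)\s*$")


def drop_utterances(fileids, transcripts, bad_ids):
    """Return (fileids, transcripts) with the listed utterances removed from both."""
    bad = set(bad_ids)
    kept_ids, kept_trans = [], []
    # The two files are assumed to be line-aligned, so they are filtered together.
    for file_id, line in zip(fileids, transcripts):
        match = _UTT_ID.search(line)
        # Fall back to the basename of the file ID if the transcript has no "(id)".
        utt = match.group(1) if match else file_id.strip().split("/")[-1]
        if utt in bad or file_id.strip() in bad:
            continue
        kept_ids.append(file_id)
        kept_trans.append(line)
    return kept_ids, kept_trans


if __name__ == "__main__":
    ids = ["speaker_1/206", "speaker_1/207", "speaker_1/208"]
    trans = ["<s> first </s> (206)", "<s> second </s> (207)", "<s> third </s> (208)"]
    print(drop_utterances(ids, trans, {"207"}))
```

Filtering both lists in one pass keeps the line counts equal, which the "verify training files" module checks in its Phase 4.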

       
      • Izza

        Izza - 2016-06-12

        Hi Nickolay,

        Thank you for the input.

        After removing utterance 207, training progressed a bit further but failed again with errors [1, 2]. A grep for ERROR in logdir/10.falign_ci_hmm revealed errors [3], and the same grep in logdir/11.force_align revealed errors [4]. Is this due to an issue in the training wav files? What should I do to find the actual issue? Could the last error, "ERROR: FATAL: "main.c", line 167: Unable to open /home/isuru/mine/work/sinhala_speech_to_text/training/trees/xxxx.unpruned/WAA-0.dtree for reading: No such file or directory", be a result of those earlier issues?

        I have attached the zipped logdir.

        Many thanks!

        [1].
        MODULE: 10 Training Context Independent models for forced alignment and VTLN
        Phase 1: Cleaning up directories:
        accumulator...logs...qmanager...models...
        Phase 2: Flat initialize
        Phase 3: Forward-Backward
        Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
        Normalization for iteration: 1
        Current Overall Likelihood Per Frame = -172.663171705615
        Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
        Normalization for iteration: 2
        Current Overall Likelihood Per Frame = -171.776481872024
        Convergence Ratio = 0.886689833590935
        Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
        Normalization for iteration: 3
        Current Overall Likelihood Per Frame = -168.97396709386
        Convergence Ratio = 2.80251477816381
        Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
        Normalization for iteration: 4
        Current Overall Likelihood Per Frame = -167.006571551359
        Convergence Ratio = 1.96739554250144
        Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
        Normalization for iteration: 5
        Current Overall Likelihood Per Frame = -166.373607921153
        Convergence Ratio = 0.632963630206234
        Baum welch starting for 1 Gaussian(s), iteration: 6 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
        ERROR: This step had 4 ERROR messages and 0 WARNING messages. Please check the log file for details.
        Normalization for iteration: 6
        Current Overall Likelihood Per Frame = -166.166409428127
        Convergence Ratio = 0.207198493025828
        Baum welch starting for 1 Gaussian(s), iteration: 7 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
        ERROR: This step had 6 ERROR messages and 0 WARNING messages. Please check the log file for details.
        Normalization for iteration: 7
        Current Overall Likelihood Per Frame = -166.088183787271
        Training completed after 7 iterations
        MODULE: 11 Force-aligning transcripts
        Phase 1: Cleaning up directories:
        logs...output...qmanager...
        Phase 3: Creating dictionary for alignment...
        Phase 4: Creating transcript for alignment...
        Phase 5: Running force alignment in 1 parts
        Force alignment starting: (1 of 1)
        0%
        ERROR: This step had 30 ERROR messages and 0 WARNING messages. Please check the log file for details.

        [2].
        MODULE: 45 Prune Trees
        Phase 1: Tree Pruning
        ERROR: FATAL: "main.c", line 167: Unable to open /home/isuru/mine/work/xxx/training/trees/xxx.unpruned/WAA-0.dtree for reading: No such file or directory
        MODULE: 50 Training Context dependent models
        Phase 1: Cleaning up directories:
        accumulator...logs...qmanager...
        Phase 2: Copy CI to CD initialize
        ERROR: This step had 1 ERROR messages and 0 WARNING messages. Please check the log file for details.
        Phase 3: Forward-Backward
        Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
        0% ERROR: FATAL: "main.c", line 1839: initialization failed

        ERROR: This step had 1 ERROR messages and 0 WARNING messages. Please check the log file for details.
        ERROR: Failed to start bw
        ERROR: Only 0 parts of 1 of Baum Welch were successfully completed
        ERROR: Parts 1 failed to run!
        Training failed in iteration 1
        Sphinxtrain path: /usr/local/lib/sphinxtrain
        Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
        Running the training

        [3].
        ./xxx.1.6-1.bw.log:217:ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
        ./xxx.1.6-1.bw.log:218:ERROR: "baum_welch.c", line 324: speaker_1/053 ignored
        ./xxx.1.6-1.bw.log:487:ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
        ./xxx.1.6-1.bw.log:488:ERROR: "baum_welch.c", line 324: speaker_1/192 ignored
        ./sinhala_buddhism.1.7-1.bw.log:217:ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
        ./xxx.1.7-1.bw.log:218:ERROR: "baum_welch.c", line 324: speaker_1/053 ignored
        ./xxx.1.7-1.bw.log:487:ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
        ./xxx.1.7-1.bw.log:488:ERROR: "baum_welch.c", line 324: speaker_1/192 ignored
        ./xxx.1.7-1.bw.log:519:ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
        ./xxx.1.7-1.bw.log:520:ERROR: "baum_welch.c", line 324: speaker_1/208 ignored

        [4].
        ./xxx.1.falign.log:55:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 003
        ./xxx.1.falign.log:159:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 020
        ./xxx.1.falign.log:315:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 035
        ./xxx.1.falign.log:395:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 048
        ./xxx.1.falign.log:415:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 051
        ./xxx.1.falign.log:429:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 053
        ./xxx.1.falign.log:449:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 056
        ./xxx.1.falign.log:653:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 081
        ./xxx.1.falign.log:673:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 084
        ./xxx.1.falign.log:699:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 088
        ./xxx.1.falign.log:725:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 093
        ./xxx.1.falign.log:861:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 105
        ./xxx.1.falign.log:899:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 111
        ./xxx.1.falign.log:913:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 113
        ./xxx.1.falign.log:945:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 118
        ./xxx.1.falign.log:953:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 119
        ./xxx.1.falign.log:1009:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 130
        ./xxx.1.falign.log:1159:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 144
        ./xxx.1.falign.log:1185:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 148
        ./xxx.1.falign.log:1253:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 159
        ./xxx.1.falign.log:1261:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 160
        ./xxx.1.falign.log:1367:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 167
        ./xxx.1.falign.log:1375:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 168
        ./xxx.1.falign.log:1383:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 169
        ./xxx.1.falign.log:1391:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 170
        ./xxx.1.falign.log:1525:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 192
        ./xxx.1.falign.log:1687:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 209
        ./xxx.1.falign.log:1877:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 230
        ./xxx.1.falign.log:1891:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 232
        ./xxx.1.falign.log:1905:ERROR: "main_align.c", line 762: Final state not reached; no alignment for 234
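The grep output in [3] and [4] can be reduced to a list of failing utterance IDs with a short script. This is only a sketch: the two regular expressions are written against the message wording visible in the lines above, and the function name `failed_utterances` is hypothetical.

```python
import re

# Patterns taken from the message wording in the pasted logs.
_FALIGN = re.compile(r"no alignment for (\S+)")   # main_align.c messages
_BW = re.compile(r"line \d+: (\S+) ignored")      # baum_welch.c messages


def failed_utterances(log_lines):
    """Collect the IDs of utterances that the aligner or bw reported as failed."""
    failed = set()
    for line in log_lines:
        match = _FALIGN.search(line) or _BW.search(line)
        if match:
            # "speaker_1/053" and "053" refer to the same utterance; keep the basename.
            failed.add(match.group(1).split("/")[-1])
    return sorted(failed)


if __name__ == "__main__":
    sample = [
        './xxx.1.falign.log:55:ERROR: "main_align.c", line 762: '
        "Final state not reached; no alignment for 003",
        './xxx.1.6-1.bw.log:218:ERROR: "baum_welch.c", line 324: speaker_1/053 ignored',
    ]
    print(failed_utterances(sample))
```

The resulting IDs are the utterances whose audio and transcript are worth reviewing by hand before deciding whether to remove them.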

         

        Last edit: Izza 2016-06-12
        • Nickolay V. Shmyrev

          Phone waa is too rare in your training set; you probably need to remove it from your phoneset. It seems you are trying to build a syllable-based acoustic model, which is not advised. CMUSphinx is designed to work with phoneme-based acoustic models; if you want to experiment with syllables, you need another toolkit.
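One way to spot phones that are too rare is to count how often each phone occurs in the training transcripts. The sketch below assumes a plain-text dictionary with one `word phone phone ...` entry per line and `word(2)`-style alternative pronunciations, of which only the first is counted. The function names, the sample words and phones, and the threshold are all hypothetical.

```python
import re
from collections import Counter

_ALT = re.compile(r"\(\d+\)$")  # "(2)" suffix marking an alternative pronunciation


def load_dictionary(dict_lines):
    """Map each word to the phones of its first listed pronunciation."""
    pronunciations = {}
    for line in dict_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        word = _ALT.sub("", parts[0])
        pronunciations.setdefault(word, parts[1:])
    return pronunciations


def phone_counts(dict_lines, transcript_lines):
    """Count phone occurrences over all transcript words found in the dictionary."""
    pron = load_dictionary(dict_lines)
    # Start every dictionary phone at zero so unused phones are reported too.
    counts = Counter({p: 0 for phones in pron.values() for p in phones})
    for line in transcript_lines:
        for word in line.split():
            # Tokens such as <s>, </s> and "(id)" are not in the dictionary and are skipped.
            counts.update(pron.get(word, []))
    return counts


def rare_phones(counts, min_count):
    """Return phones seen fewer than min_count times (threshold is a judgment call)."""
    return sorted(p for p, n in counts.items() if n < min_count)


if __name__ == "__main__":
    dictionary = ["amma A M M A", "mama M A M A", "waa WAA"]
    transcripts = ["<s> amma mama </s> (001)", "<s> amma waa </s> (002)"]
    print(rare_phones(phone_counts(dictionary, transcripts), min_count=2))
```

A phoneset of 202 units over roughly one hour of audio, as reported in the first log, leaves many units with very few examples, which is consistent with the missing tree file for WAA.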

           
  • Izza

    Izza - 2016-06-13

    Hi Nickolay,

    Thank you for the reply.

    I was actually following the sequence described in [1] to build a speech-to-text converter for a native language: create a mapping between the Unicode words and their transliteration, create the language model, and create the acoustic model. I'm not sure what is meant by a syllable-based acoustic model. Can you please explain what a syllable-based acoustic model is, what I'm doing wrong here, and how to correct it? I can share the dictionary file, language model, and any other information required.

    Thank you.

    [1]. http://stackoverflow.com/questions/31050003/build-new-acoustic-model-dictionary-language-model-for-uncommon-language-spee

     
  • Izza

    Izza - 2016-06-13

    Thank you.

    So, extracting out the phones from the training data set is a manual procedure?
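Once the dictionary is written with phoneme-level pronunciations, the phone list itself does not have to be typed by hand; it can be derived from the dictionary. The sketch below assumes the same `word phone phone ...` line format for both the main and filler dictionaries and always includes `SIL`. The function name `phone_list` and the sample entries are hypothetical. Deciding which units count as phonemes for the language is still a manual, linguistic step.

```python
def phone_list(dict_lines, filler_lines=()):
    """Return the sorted set of phones used in the dictionary and filler dictionary."""
    phones = {"SIL"}  # the silence phone is included even if no entry lists it
    for line in list(dict_lines) + list(filler_lines):
        # The first token is the word; everything after it is the pronunciation.
        phones.update(line.split()[1:])
    return sorted(phones)


if __name__ == "__main__":
    dictionary = ["amma A M M A", "waa W AA"]
    filler = ["<s> SIL", "</s> SIL", "<sil> SIL"]
    # Print one phone per line, which is the layout expected for a phone list file.
    print("\n".join(phone_list(dictionary, filler)))
```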

     
