Keep failing in iteration 1

Dan H
2018-01-03
2018-01-07
  • Dan H

    Dan H - 2018-01-03

    Training has succeeded twice for me before. On this third run (after adding new data), however, I keep getting this error:

    $ sphinxtrain run
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Feature extraction is done
    MODULE: 00 verify training files
        Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
            Found 18800 words using 29 phones
        Phase 2: Checking to make sure there are not duplicate entries in the dictionary
        Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
        Phase 4: Checking number of lines in the transcript file should match lines in fileids file
        Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
            Estimated Total Hours Training: 5.81149444444445
            This is a small amount of data, no comment at this time
        Phase 6: Checking that all the words in the transcript are in the dictionary
            Words in dictionary: 18794
            Words in filler dictionary: 6
        Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 02 Train MLLT transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    MODULE: 10 Training Context Independent models for forced alignment and VTLN
    Skipped:  $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
    Skipped:  $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 11 Force-aligning transcripts
    Skipped:  $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
    MODULE: 12 Force-aligning data for VTLN
    Skipped:  $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 20 Training Context Independent models
        Phase 1: Cleaning up directories:
            accumulator...logs...qmanager...models...
        Phase 2: Flat initialize
        Phase 3: Forward-Backward
    ERROR: Training failed in iteration 1
    

    This is what I found in the log files:

    utt>     0          01-audio-01-1  468    0INFO: cvt2triphone.c(199): no multiphones defined, no conversion done
    ERROR: "forward.c", line 594: All 2 active states, 63 64, zero at time 1
    ERROR: "baum_welch.c", line 324: male1/01-audio-01-1 ignored
       168 0 
    INFO: cmn.c(133): CMN: -387.67 64.52 -253.42 -235.65 83.31 -24.09 -30.89 94.15 303.87 -380.87 275.48 252.06 397.23 
    ERROR: "forward.c", line 594: All 2 active states, 63 64, zero at time 1
    ERROR: "baum_welch.c", line 324: male1/01-audio-01-2 ignored
    utt>     1          01-audio-01-2 1213    0   412 0 
    INFO: cmn.c(133): CMN: 627.87 -367.40 -776.83 -829.70 -417.78 763.99 -26.27 675.54 -540.51 -28.63 97.21 472.52 -733.19 
    ERROR: "forward.c", line 594: All 2 active states, 63 64, zero at time 1
    ERROR: "baum_welch.c", line 324: male1/01-audio-01-3 ignored
    utt>     2          01-audio-01-3  617    0   196 0 
    

    The audio samples above even trained successfully in the previous runs. I confirmed that all audio files (including the new data) are 16 kHz, mono, 16-bit.

    What could be the problem?

     

    Last edit: Dan H 2018-01-03
    • Nickolay V. Shmyrev

      The CMN values suggest that the input data format is wrong for those files. Check their format carefully; they are certainly not 16 kHz.
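
One way to act on this advice is to read the header of every audio file and compare it with the expected 16 kHz, mono, 16-bit format. The following is only a minimal sketch, not part of SphinxTrain: it assumes the training audio is stored as PCM WAV files, and the helper names `check_wav` and `check_tree` and the default directory `wav` are hypothetical. If the audio is stored as raw or NIST files, this check does not apply.

```python
import sys
import wave
from pathlib import Path

# Expected format, taken from the thread: 16 kHz, mono, 16-bit.
EXPECTED_RATE = 16000
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16 bits


def check_wav(path):
    """Return a list of format problems for one WAV file (empty if none)."""
    problems = []
    try:
        with wave.open(str(path), "rb") as wav:
            if wav.getframerate() != EXPECTED_RATE:
                problems.append(f"sample rate {wav.getframerate()} Hz")
            if wav.getnchannels() != EXPECTED_CHANNELS:
                problems.append(f"{wav.getnchannels()} channels")
            if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
                problems.append(f"{8 * wav.getsampwidth()}-bit samples")
    except (wave.Error, EOFError) as exc:
        # The wave module only understands PCM WAV; anything else lands here.
        problems.append(f"not a readable PCM WAV file ({exc})")
    return problems


def check_tree(root):
    """Map each badly formatted .wav file under root to its problems."""
    report = {}
    for path in sorted(Path(root).rglob("*.wav")):
        problems = check_wav(path)
        if problems:
            report[path] = problems
    return report


if __name__ == "__main__":
    # Hypothetical default: a "wav" directory inside the training setup.
    root = sys.argv[1] if len(sys.argv) > 1 else "wav"
    for path, problems in check_tree(root).items():
        print(f"{path}: {', '.join(problems)}")
```

Running it over the audio directory prints one line per file whose header does not match. A clean header is not proof that the samples are correct (a file can be mislabeled, or decoded with the wrong type or extension setting), so files that pass this check but still produce extreme CMN values deserve a closer look.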

       
