I've run the training successfully twice before. On the third run, after adding new data, I keep getting this error:
$ sphinxtrain run
Sphinxtrain path: /usr/local/lib/sphinxtrain
Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
Running the training
MODULE: 000 Computing feature from audio files
Feature extraction is done
MODULE: 00 verify training files
Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
Found 18800 words using 29 phones
Phase 2: Checking to make sure there are not duplicate entries in the dictionary
Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
Phase 4: Checking number of lines in the transcript file should match lines in fileids file
Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
Estimated Total Hours Training: 5.81149444444445
This is a small amount of data, no comment at this time
Phase 6: Checking that all the words in the transcript are in the dictionary
Words in dictionary: 18794
Words in filler dictionary: 6
Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
MODULE: 0000 train grapheme-to-phoneme model
Skipped (set$CFG_G2P_MODEL='yes' to enable)
MODULE: 01 Train LDA transformation
Skipped (set$CFG_LDA_MLLT='yes' to enable)
MODULE: 02 Train MLLT transformation
Skipped (set$CFG_LDA_MLLT='yes' to enable)
MODULE: 05 Vector Quantization
Skipped for continuous models
MODULE: 10 Training Context Independent models for forced alignment and VTLN
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 11 Force-aligning transcripts
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
MODULE: 12 Force-aligning data for VTLN
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 20 Training Context Independent models
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...models...
Phase 2: Flat initialize
Phase 3: Forward-Backward
ERROR: Training failed in iteration 1
The log files show the following:
utt> 0 01-audio-01-1 468 0INFO: cvt2triphone.c(199): no multiphones defined, no conversion done
ERROR: "forward.c", line 594: All 2 active states, 63 64, zero at time 1
ERROR: "baum_welch.c", line 324: male1/01-audio-01-1 ignored
168 0
INFO: cmn.c(133): CMN: -387.67 64.52 -253.42 -235.65 83.31 -24.09 -30.89 94.15 303.87 -380.87 275.48 252.06 397.23
ERROR: "forward.c", line 594: All 2 active states, 63 64, zero at time 1
ERROR: "baum_welch.c", line 324: male1/01-audio-01-2 ignored
utt> 1 01-audio-01-2 1213 0 412 0
INFO: cmn.c(133): CMN: 627.87 -367.40 -776.83 -829.70 -417.78 763.99 -26.27 675.54 -540.51 -28.63 97.21 472.52 -733.19
ERROR: "forward.c", line 594: All 2 active states, 63 64, zero at time 1
ERROR: "baum_welch.c", line 324: male1/01-audio-01-3 ignored
utt> 2 01-audio-01-3 617 0 196 0
The audio samples above even trained successfully in the previous runs. I confirmed that all audio files (including the new data) are 16 kHz, mono, 16-bit.
What could be the problem?
Last edit: Dan H 2018-01-03
The CMN values suggest that the input data format is wrong for those files. Check their format carefully; they are certainly not 16 kHz.
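One way to check the format is to read each file's WAV header directly. The script below is only a minimal sketch using Python's standard `wave` module. The helper names (`wav_format`, `format_problems`) and the script name in the usage line are made up for illustration, and the expected values are simply the 16 kHz, mono, 16-bit format mentioned in the question. It assumes the training audio is stored as PCM `.wav` files.

```python
import sys
import wave

# Format the original post says every file should have (assumption taken
# from the question: 16 kHz, mono, 16-bit, i.e. 2 bytes per sample).
EXPECTED = {"rate": 16000, "channels": 1, "sample_width": 2}


def wav_format(path):
    """Return the format fields declared in the WAV header."""
    with wave.open(path, "rb") as w:
        return {
            "rate": w.getframerate(),
            "channels": w.getnchannels(),
            "sample_width": w.getsampwidth(),
        }


def format_problems(path, expected=EXPECTED):
    """Return a list of mismatches for one file (empty if it matches)."""
    try:
        actual = wav_format(path)
    except (wave.Error, EOFError, OSError) as exc:
        # Not a readable PCM WAV file (e.g. raw data or another container).
        return [f"cannot be read as a PCM WAV file: {exc}"]
    return [
        f"{key}: expected {expected[key]}, found {actual[key]}"
        for key in expected
        if actual[key] != expected[key]
    ]


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_wav.py wav/male1/*.wav
    for path in sys.argv[1:]:
        for problem in format_problems(path):
            print(f"{path}: {problem}")
```

Files that match produce no output. Note that this only reports what the header declares, so headerless raw audio shows up as unreadable rather than as a wrong sample rate.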