MODULE: 00 verify training files (2017-07-16 05:57)
Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
WARNING: The phonelist (C:/sphinx/*/etc/.phone) does not define the phone SIL (required!)
Found 7 words using 12 phones
WARNING: This phone (SIL) occurs in the dictionary (C:/sphinx/*/etc/*.dic), but not in the phonelist (C:/sphinx/*/etc/**.phone)
passed
Phase 2: Checking to make sure there are not duplicate entries in the dictionary
passed
Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
passed
Phase 4: Checking number of lines in the transcript file should match lines in fileids file
passed
Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
Estimated Total Hours Training: 0.0223138888888889
ERROR: Not enough data for the training, we can only train CI models (set CFG_CD_TRAIN to "no")
FAILED*
Phase 6: Checking that all the words in the transcript are in the dictionary
Words in dictionary: 4
Words in filler dictionary: 3
passed
Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
How can I solve this type of issue?
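Add the phone SIL to the phonelist file (the .phone file under etc/), on its own line. That clears both Phase 1 warnings. The Phase 5 failure is a separate problem: the log estimates only about 0.022 hours (roughly 80 seconds) of audio, so either add more training data or set CFG_CD_TRAIN to "no" as the error message says.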
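If you want to check this before rerunning the trainer, here is a minimal Python sketch of the same comparison Phase 1 makes. It is not part of SphinxTrain; the function names are made up for illustration, and it assumes dictionary lines look like `WORD PH1 PH2 ...` and the phonelist has one phone per line.

```python
def dictionary_phones(dict_lines):
    """Collect every phone used in lines of the form 'WORD PH1 PH2 ...'."""
    phones = set()
    for line in dict_lines:
        parts = line.split()
        if len(parts) > 1:  # skip blank lines and words with no pronunciation
            phones.update(parts[1:])
    return phones


def missing_phones(dict_lines, phonelist_lines):
    """Return phones that occur in the dictionaries but not in the phonelist."""
    listed = {line.strip() for line in phonelist_lines if line.strip()}
    return sorted(dictionary_phones(dict_lines) - listed)


def patched_phonelist(dict_lines, phonelist_lines):
    """Return the phonelist entries with any missing phones appended."""
    listed = [line.strip() for line in phonelist_lines if line.strip()]
    return listed + missing_phones(dict_lines, phonelist_lines)


if __name__ == "__main__":
    # Hypothetical sample data; in practice read the lines from your
    # .dic, .filler and .phone files instead.
    filler = ["<s> SIL", "</s> SIL", "<sil> SIL"]
    words = ["HELLO HH AH L OW"]
    phonelist = ["AH", "HH", "L", "OW"]
    print(missing_phones(filler + words, phonelist))
```

Pass the lines of both the main dictionary and the filler dictionary together, since the filler dictionary is usually where SIL comes from.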