transcription and dictionary mismatch

Help
pannam
2017-04-11

    pannam - 2017-04-11

    Hi, I have been trying for days, but I really can't understand why I get this error:

    pannam@pannam-VirtualBox:~$ cd earthquake
    pannam@pannam-VirtualBox:~/earthquake$ sphinxtrain run
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Extracting features from  segments starting at  (part 1 of 1) 
    Extracting features from  segments starting at  (part 1 of 1) 
    Feature extraction is done
    MODULE: 00 verify training files
        Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
            Found 938 words using 213 phones
        Phase 2: Checking to make sure there are not duplicate entries in the dictionary
        Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
        Phase 4: Checking number of lines in the transcript file should match lines in fileids file
        Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
            Estimated Total Hours Training: 1.67431111111111
            This is a small amount of data, no comment at this time
        Phase 6: Checking that all the words in the transcript are in the dictionary
            Words in dictionary: 935
            Words in filler dictionary: 3
    WARNING: This word: <s> was in the transcript file, but is not in the dictionary (<s> महाभूकम्पले हामीलाई देश निर्माणमा एकजुट भएर लाग्ने भावना प्रदान गरेको  </s> ). Do cases match?
        Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    

    I have thoroughly checked the transcript file (the sentence in the warning is the first sentence in the transcript file), but I can't find anything odd. This is my filler file:

    <s>             SIL
    </s>            SIL
    <sil>           SIL
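
Since `<s>` is listed in the filler file above and the warning points at the very first line of the transcript, one thing that may be worth ruling out is an invisible character (for example a UTF-8 byte order mark) glued to the first token. This is only a guess, not a confirmed cause. The sketch below is a rough stand-in for the Phase 6 check, not SphinxTrain's own code; the function names and the sample lines are hypothetical, and it assumes the usual layout of one transcript per line ending in a parenthesised utterance id.

```python
def dictionary_words(lines):
    """Collect the head word (first whitespace-separated field) of each entry."""
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(fields[0])
    return words


def missing_words(transcript_lines, known):
    """Return (line_no, token, code_points) for each token not found in `known`."""
    problems = []
    for line_no, line in enumerate(transcript_lines, 1):
        for token in line.split():
            # Assumption: the utterance id is the token wrapped in parentheses.
            if token.startswith("(") and token.endswith(")"):
                continue
            if token not in known:
                code_points = [f"U+{ord(ch):04X}" for ch in token]
                problems.append((line_no, token, code_points))
    return problems


# Hypothetical sample data: the filler entries from the post plus a tiny
# made-up dictionary and a transcript line that starts with a BOM (U+FEFF).
filler = ["<s>             SIL", "</s>            SIL", "<sil>           SIL"]
main_dict = ["hello HH AH L OW", "world W ER L D"]
transcript = ["\ufeff<s> hello world </s> (utt_001)"]

known = dictionary_words(filler) | dictionary_words(main_dict)
missing = missing_words(transcript, known)
for line_no, token, code_points in missing:
    print(line_no, repr(token), code_points)
```

Printing the code points makes a hidden character visible even when the token looks like a plain `<s>` in an editor. When reading real files with Python, opening them with `encoding="utf-8-sig"` drops a leading BOM, which gives a quick way to compare the results with and without it.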
    
     

    Last edit: pannam 2017-04-11
