ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary

Forum: Help

Creator: ab1984

Created: 2015-08-11

Updated: 2015-08-16

ab1984 - 2015-08-11

I'm getting an error when decoding in SphinxTrain. As stated in the subject line, I can see the error in the log file, but I cannot find any inconsistency between the .dic and .lm files; as far as I can see, they match exactly. Here are the .dic and .lm files and the error log. Please let me know if there are any differences between these two files.

If there is an encoding mismatch between the two files, how can I find it?

Thank you

Last edit: ab1984 2015-08-11
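
On the encoding question: one rough way to compare two files is to look at their raw bytes and check for a UTF-8 byte order mark, invalid UTF-8 sequences, non-ASCII bytes, and Windows line endings. The sketch below is only an illustration; the function names `describe_encoding` and `report` are made up for this example and are not part of SphinxTrain or any Sphinx tool.

```python
import codecs


def describe_encoding(data: bytes) -> dict:
    """Report a few simple encoding facts about the raw bytes of a text file."""
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "utf8_bom": data.startswith(codecs.BOM_UTF8),  # leading EF BB BF
        "valid_utf8": valid_utf8,
        "pure_ascii": all(b < 128 for b in data),
        "crlf": b"\r\n" in data,                       # Windows line endings
    }


def report(paths):
    """Print the encoding facts for each file (hypothetical helper)."""
    for path in paths:
        with open(path, "rb") as f:
            print(path, describe_encoding(f.read()))
```

Calling `report(["your.dic", "your.lm"])` with your own file names prints one line per file. If the two lines differ (for example, only one file has a BOM or CRLF endings), that difference is worth ruling out before looking further.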

- Nickolay V. Shmyrev - 2015-08-11
  
  You created the language model incorrectly. You added <s> before every word and </s> after every word, without spaces, so every word now contains <s>. You can open the language model file and see that.
  
  Actually, you do not need to add <s> to the training corpus at all. SRILM adds them automatically.
  
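
For anyone hitting the same error, a quick way to spot this kind of problem is to list the unigrams of the ARPA language model and check which of them have no entry in the dictionary. The following is a minimal sketch, assuming a plain-text ARPA file with a `\1-grams:` section and a dictionary with one `word phone phone ...` entry per line, where alternate pronunciations are written as `word(2)`. The function names and the `SPECIAL` set are invented for this example, and the comparison is on exact strings.

```python
import re

# Assumption: these tokens are handled outside the main dictionary,
# so they are not reported as missing.
SPECIAL = {"<s>", "</s>", "<unk>"}


def arpa_unigrams(lines):
    """Collect the words listed in the \\1-grams: section of an ARPA LM."""
    words = set()
    in_unigrams = False
    for line in lines:
        line = line.strip()
        if line.startswith("\\"):
            # Section markers such as \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = (line == "\\1-grams:")
            continue
        if in_unigrams and line:
            fields = line.split()  # logprob, word, optional backoff weight
            if len(fields) >= 2:
                words.add(fields[1])
    return words


def dictionary_words(lines):
    """Collect dictionary headwords, dropping alternate markers like (2)."""
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(re.sub(r"\(\d+\)$", "", fields[0]))
    return words


def missing_from_dictionary(lm_lines, dic_lines):
    """Return LM unigrams that have no dictionary entry, sorted."""
    return sorted(arpa_unigrams(lm_lines) - dictionary_words(dic_lines) - SPECIAL)
```

With the files opened as text, `missing_from_dictionary(open(lm_path), open(dic_path))` returns the offending words (`lm_path` and `dic_path` stand for your own files). In the situation described above, entries such as `<s>hello</s>` would show up in that list, which makes the cause visible at a glance.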

ab1984 - 2015-08-12

Thank you Nickolay. It worked :)

MODULE: DECODE Decoding using models previously trained
Decoding 4896 segments starting at 0 (part 1 of 1)
0%
Aligning results to find error rate
SENTENCE ERROR: 40.2% (1970/4896) WORD ERROR RATE: 21.8% (2634/12063)
