hello to all..
i finished training with the sphinx 3 release and sphinxtrain using rm1.. i recorded some utterances and sphinx 3 recognized them..
just want to ask, how do i get rid of this error?
what is the purpose of UNK?
, line 282: <UNK> is not a word in dictionary and it is not a class tag.
thank you very much..
-kris
Yes indeed, adding it to the dictionary is not the solution. It still seems the error is due to a word in the test data which is not in the dictionary (OOV error). So perhaps the language model was not trained to handle OOVs (closed-vocabulary), and so it asks for a dictionary entry.
Oops, ignore post, it is wrong.
got the same error, i think it depends on whether you use an open or closed LM. Did you train the LM yourself?
greetings
Chris
hi chris.
i just followed every instruction in the Robust Group Tutorial..
i did not edit any files except for those with errors and i got this error.. and i used the rm1 database..
what did you mean when you asked if i trained the LM for myself?
-kris
I am no expert but I think unknown words in the language model's training or test corpus get replaced by "UNK". It is then handled like any other word (gets assigned a prior probability, etc). It is present in the language model file for RM1, but not in the dictionary. You may need to add it there to remove the message.
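To make the idea above concrete, here is a minimal Python sketch of how unknown words could be mapped to an unknown-word token, and how one might list language model words that have no dictionary entry. This is only an illustration under stated assumptions: the function names `map_oov` and `missing_from_dictionary` and the toy word lists are hypothetical, and this is not how Sphinx or its LM tools implement it.

```python
UNK = "<UNK>"  # token name as it appears in the error message in this thread


def map_oov(tokens, vocabulary, unk=UNK):
    """Replace every token that is outside the vocabulary with the unknown-word token."""
    return [tok if tok in vocabulary else unk for tok in tokens]


def missing_from_dictionary(lm_words, dictionary_words):
    """Return the LM words that have no dictionary entry, sorted alphabetically."""
    return sorted(set(lm_words) - set(dictionary_words))


# Hypothetical toy data, not taken from RM1.
vocabulary = {"SHOW", "THE", "SHIPS"}

# An out-of-vocabulary word in the text is replaced by <UNK>.
print(map_oov(["SHOW", "THE", "FRIGATES"], vocabulary))

# The LM then contains <UNK> as a word, but the dictionary does not.
lm_words = vocabulary | {UNK}
print(missing_from_dictionary(lm_words, vocabulary))
```

With the toy data, the second call reports `<UNK>` as the only LM word without a dictionary entry, which matches the situation described for RM1.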
It might be a silly question - but if you add <UNK> to the dictionary, what "spelling" should you provide????
(I would certainly like to handle (read as "survive") unknown words... If I survive them I can find a way to deal with them...)
Can anyone help?
Alan
(Sorry, when I said "spelling" - I meant phonetic spelling...)