Hi,
I built a custom JSGF grammar of about 1,000 words (common words such as "egg" and "hat", plus many of the numbers; this is for a kids' app). I also used a g2p model to generate pronunciations for all of these words and am using that as the dictionary. I'm using the default English acoustic model that ships with the pocketsphinx-android demo, which I believe is en-us-ptm-5.2. The application is built on top of the pocketsphinx demo, with the wake-up and menu-selection features removed, so it only runs the grammar described above.
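For reference, the grammar is structured roughly like the sketch below (the grammar name and word list here are illustrative, not the real files). A flat rule with ~1,000 alternatives gives the decoder no constraints between words, which makes this kind of grammar expensive to search:

```
#JSGF V1.0;

grammar words;

public <word> = egg | hat | one | two | three ;
```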
When I run this on a mobile device, I see a lot of incorrect detections (it is currently being tested by an adult, not a child). Sometimes the partial hypothesis seems to be predicting the target word but the final result is incorrect; at other times the hypothesis is not close to the word at all.
Could anyone let me know what I'm doing wrong and how I can fix it?
Thanks,
Hitesh
A mobile device is too slow to recognize such a large grammar; you can see the xRT details in the log. Speedups are discussed on our wiki.
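(For readers unfamiliar with the term: xRT, "times real time", is the ratio of decoding time to audio duration, and pocketsphinx reports it in its logs. A value above 1.0 means the decoder falls behind the incoming audio. The numbers below are hypothetical, just to illustrate the metric.)

```java
public class XrtExample {
    // xRT = CPU time spent decoding / duration of the audio decoded.
    // Above 1.0, the recognizer cannot keep up with live input, which
    // degrades results on top of simply being slow.
    static double xrt(double processingSeconds, double audioSeconds) {
        return processingSeconds / audioSeconds;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 7.5 s of CPU time to decode 3.0 s of audio.
        System.out.println(xrt(7.5, 3.0) + " xRT"); // 2.5 xRT: far too slow for live use
    }
}
```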
Hi Nickolay,
Thanks for that. Could you post a link to where the speedups are discussed? Also, is my approach right for this, or should I be using some kind of language model (n-gram or RNNLM) instead?
Also, does it make a difference if I use something like cmudict with the extra words added as the lexicon, versus a lexicon containing only the words in the grammar?
http://cmusphinx.sourceforge.net/wiki/pocketsphinxhandhelds
You didn't provide enough information about the application you want to implement to answer the question about language models, nor enough about the problem itself - logcat output, data files, and so on.
As for the lexicon question: it does not matter.