Pocketsphinx is producing a bizarre issue where accuracy appears to degrade
after only a couple of queries to the engine. The first query has near-perfect
accuracy: it can recognize relatively complicated and convoluted phrases
without difficulty. However, the second and third recognitions can barely pick
up a two-syllable word, and by the fourth query the engine simply fails to
generate a hypothesis.
For context, I'm creating a pocketsphinx application for Android, using JSGF
grammars (though the problem persists with FSG grammars as well). My code is
based on the pocketsphinx demo for Android:
http://cmusphinx.sourceforge.net/2011/05/building-pocketsphinx-on-android/
I'm not quite sure what's causing the problem. The original demo appeared to
work fine, and I have hardly altered the configuration of the speech engine,
other than using JSGF grammars.
Anyway, has anyone else experienced something like this, or does anyone have a
suggestion as to what I could try to remedy it?
Thanks
Provide the pocketsphinx log which is created on the device.
Add
option to pocketsphinx initialization and collect the audio you are trying to
recognize. Share the audio, maybe it's corrupted somehow.
Here are the raw data files: http://speechweb2.cs.uwindsor.ca/rawdata.zip
Also, here's my log.
Thanks
The audio has zero-energy regions. You need to add "-dither yes" to the engine
configuration.
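For anyone who wants to check their own recordings for the same symptom, here is a rough sketch (not pocketsphinx code) that scans raw audio for all-zero frames and shows how a tiny amount of added noise removes them. It assumes the raw files are 16-bit little-endian mono PCM. The frame size, the function names, and the +/-1 noise are illustrative assumptions and do not reproduce what the "-dither" option does internally.

```python
import random
import struct

FRAME_SIZE = 256  # samples per frame; illustrative value, not pocketsphinx's


def read_pcm16(raw_bytes):
    """Decode little-endian 16-bit mono PCM bytes into a list of ints."""
    count = len(raw_bytes) // 2
    return list(struct.unpack("<%dh" % count, raw_bytes[:count * 2]))


def zero_energy_frames(samples, frame_size=FRAME_SIZE):
    """Return the indices of full frames whose samples are all exactly zero."""
    zero = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        if sum(s * s for s in frame) == 0:
            zero.append(start // frame_size)
    return zero


def add_dither(samples, seed=0):
    """Add +/-1 of random noise to every sample, clamped to the int16 range.

    This is a simplified stand-in for dithering: it only guarantees that
    no sample which was exactly zero stays at zero.
    """
    rng = random.Random(seed)
    return [max(-32768, min(32767, s + rng.choice((-1, 1)))) for s in samples]
```

To try it on a recording, read the file in binary mode, pass the bytes to read_pcm16, and call zero_energy_frames on the result. A non-empty list means the audio contains the kind of region described above.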
That got it! Thank you, this problem has been dogging me for a while, and I
couldn't for the life of me figure it out.
Is there some documentation you could point me to that elaborates on the
-dither option (and the other command-line options, for that matter)?
Once again, thank you.
man pocketsphinx_batch