Hello! I'm trying to use pocketsphinx in an Android application for keyword spotting. I use the "zero_ru.cd_ptm_4000" Russian-language model. I initialize the recognizer with -allphone_ci=true, as in the demo, and then call addKeywordSearch(). One of my keyphrases is "прослушать сообщение" ("listen to the message"). When I test it with my own voice and other male voices, it seems to work well. But when I play the phrase in a female voice using different speech synthesizers (e.g. http://www.ispeech.org/text.to.speech), Sphinx cannot recognize the phrase even with a threshold of "1e-90". I haven't tested it with a real female voice yet, only synthesized ones. I also experimented with changing my own voice: for example, I tried speaking in falsetto, and the result is the same, no luck with any threshold.
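For reference, the keyword list file passed to addKeywordSearch() holds one keyphrase per line, followed by its detection threshold between slashes. Below is a minimal Python sketch that generates such a file so that several thresholds can be tried quickly. The helper name `write_keyword_list`, the file name, and the threshold values are illustrative assumptions, not part of pocketsphinx.

```python
from pathlib import Path


def write_keyword_list(path, keyphrases):
    """Write a keyword list file: one 'phrase /threshold/' entry per line.

    `keyphrases` maps each phrase to its detection threshold.
    This helper is hypothetical; only the file layout follows the
    keyword list format used by pocketsphinx.
    """
    lines = [
        f"{phrase} /{threshold:.0e}/"
        for phrase, threshold in keyphrases.items()
    ]
    # UTF-8 is assumed here so that Cyrillic keyphrases survive intact.
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


if __name__ == "__main__":
    # Threshold taken from the post above; the file name is an assumption.
    for entry in write_keyword_list(
        "keywords.list", {"прослушать сообщение": 1e-90}
    ):
        print(entry)
```

Each keyphrase can carry its own threshold, so longer phrases can be tuned separately from shorter ones.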
You can provide the recordings you are trying.
It is better to try with a real voice; TTS voices might be corrupted in strange ways.
The Russian model is not the most accurate one either.
OK, I've recorded 2 variations of my voice that are not recognized with any threshold.
Last edit: Eugene Kudelevsky 2016-01-13
The files are 44.1 kHz stereo. The decoder expects 16 kHz mono.
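A quick way to catch this kind of mismatch before feeding a recording to the decoder is to inspect the WAV header. The sketch below uses only Python's standard `wave` module; the function name `wav_format_problems` and the file name are hypothetical, and the 16-bit sample width is an assumption about the expected PCM format rather than something stated in this thread.

```python
import wave

EXPECTED_RATE = 16000      # Hz, as stated above
EXPECTED_CHANNELS = 1      # mono, as stated above
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample; 16-bit PCM is an assumption


def wav_format_problems(path):
    """Return a list of mismatches between a WAV file and the expected format.

    An empty list means the header matches 16 kHz, mono, 16-bit.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()

    problems = []
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected {EXPECTED_CHANNELS}")
    if width != EXPECTED_SAMPLE_WIDTH:
        problems.append(f"{width * 8}-bit samples, expected {EXPECTED_SAMPLE_WIDTH * 8}-bit")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_wav.py recording.wav
    for problem in wav_format_problems(sys.argv[1]):
        print(problem)
```

This only checks the header; it does not resample or downmix the audio. Conversion is better left to a dedicated audio tool that applies proper low-pass filtering when reducing the sample rate.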
OK, I've converted them to the required format.
Last edit: Eugene Kudelevsky 2016-01-13
Nickolay, any ideas?
You need to provide the command line you are using and the required data files.