Hello everyone,
I am working on a project where I have to integrate the speech functionality of Pocketsphinx into an Android application.
I have to detect 3 action words, for which I've made a small grammar and a small dictionary generated with LMTOOL (over a dozen words).
My grammar:
#JSGF V1.0;
grammar menu;
public <action> = VALIDATE | BACK | RETURN;
I'm using the following assets:
* Dictionary: custom
* Language model: custom
* Acoustic model: latest en-us-ptm
I'm currently getting a lot of false detections. For example, a small noise is recognized as "back", or "validate" is recognized as "return". I don't know whether the problem comes from the Android device's microphone or from a bad configuration (more likely the latter), but I've seen that Pocketsphinx has a lot of configuration parameters and I don't know which ones are useful.
I've only configured these:
I've recorded two raw samples: one with a noise that gets detected, and one with the "validate" command.
So, could you please help me work out what the problem might be, and suggest how to solve it?
Thank you very much!
Paul
Last edit: paul 2016-04-15
To recognize words in continuous speech you need to use keyword spotting mode, not grammar mode, as covered in the tutorial:
http://cmusphinx.sourceforge.net/wiki/tutoriallm
Hello Nickolay,
Thank you very much for your fast response.
I've already tried that, but with addKeywordSearch the recognition rate is near 0, as if the decoder were not sensitive enough (the inverse of the problem in my previous post).
Best,
Paul
Dear Paul
Please read the tutorial linked above about tuning the detection threshold.
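For anyone following along, here is a minimal sketch of how a keyword list with one threshold per keyword could be generated. It assumes the usual keyword list layout of one phrase per line followed by its threshold between slashes. The class name, the file name and the threshold values are made up for illustration; the thresholds in particular are only starting points that would have to be tuned on your own recordings.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Pocketsphinx: builds the text of a keyword list file.
class KeywordListBuilder {

    // Renders one "PHRASE /threshold/" line per entry, in insertion order.
    // Phrases must be spelled exactly as they appear in the dictionary.
    static String render(Map<String, String> thresholds) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : thresholds.entrySet()) {
            sb.append(entry.getKey())
              .append(" /")
              .append(entry.getValue())
              .append("/\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed starting thresholds, one per keyword; tune each one separately.
        Map<String, String> thresholds = new LinkedHashMap<>();
        thresholds.put("VALIDATE", "1e-20");
        thresholds.put("BACK", "1e-5");
        thresholds.put("RETURN", "1e-15");

        String content = render(thresholds);
        // "keywords.list" is an arbitrary file name chosen for this sketch.
        Files.write(Paths.get("keywords.list"), content.getBytes(StandardCharsets.UTF_8));
        System.out.print(content);
    }
}
```

The resulting file is what you would hand to addKeywordSearch. Keeping the thresholds in one map makes it easy to regenerate the file while you experiment with different values.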
Hello again Nickolay,
I've followed your advice and completed the tutorial (see the attachment), but is there a script like "word_align.pl" for the tuning tutorial that presents the results in a more understandable form?
Best,
Paul
Last edit: paul 2016-04-15
No, there is no script yet
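Until such a script exists, the counting can be done by hand with a few lines of code. The sketch below is only one possible way to do it and is not an official tool: it assumes one expected keyword per test recording (null for a noise-only recording) and one detected keyword per recording (null when nothing was spotted). The class name and the sample data are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical scorer, not part of Pocketsphinx: tallies keyword spotting results.
class SpottingScore {
    int hits;        // expected keyword was detected
    int misses;      // expected keyword was not detected
    int falseAlarms; // a keyword was detected that was not spoken

    // Compares expected and detected keywords recording by recording.
    // A null entry means "no keyword" (noise only, or nothing detected).
    static SpottingScore score(List<String> expected, List<String> detected) {
        if (expected.size() != detected.size()) {
            throw new IllegalArgumentException("Both lists need one entry per recording");
        }
        SpottingScore s = new SpottingScore();
        for (int i = 0; i < expected.size(); i++) {
            String ref = expected.get(i);
            String hyp = detected.get(i);
            if (ref != null && ref.equals(hyp)) {
                s.hits++;
            } else {
                // A wrong keyword counts as both a miss and a false alarm.
                if (ref != null) s.misses++;
                if (hyp != null) s.falseAlarms++;
            }
        }
        return s;
    }

    public static void main(String[] args) {
        // Made-up results for four recordings; the third one contains only noise.
        List<String> expected = Arrays.asList("VALIDATE", "BACK", null, "RETURN");
        List<String> detected = Arrays.asList("RETURN", "BACK", "BACK", null);

        SpottingScore s = score(expected, detected);
        System.out.printf("hits=%d misses=%d falseAlarms=%d%n", s.hits, s.misses, s.falseAlarms);
    }
}
```

Running the same test recordings at several thresholds and comparing the three counts gives a rough picture of where false alarms and misses balance out.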
Hello Nickolay,
Thank you very much for your previous response. I'll find a workaround by counting the results myself. I've got another question: in this case, is the threshold the only parameter we can adjust to improve performance, or are there others?
Best,
Paul
You can use a longer spotting phrase.