Hi, I am using PocketSphinx in an Android app to detect a wake word (using key phrases) and perform an action when it is detected. I have trimmed the cmudict-en-us.dict file to contain only the words used in the key phrase. However, PocketSphinx very frequently detects non-keyphrase words as the key phrase. I tried adjusting the keyphrase threshold but can't find a value where it works reliably. Here is what the code looks like:
recognizer = SpeechRecognizerSetup.defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
        .setKeywordThreshold((float) 1e-30)
        .getRecognizer();
recognizer.addListener(this);
The app will be used by parents and little kids, who will say the wake word to put the app into listening mode (which will use Google Voice Recognizer), so I need this action to be reliable. I would greatly appreciate it if someone could assist me with this. Thanks,
Last edit: Nickolay V. Shmyrev 2017-07-24
Use a longer keyphrase, or try snowboy; it should be more reliable.
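Since the threshold usually needs re-tuning whenever the keyphrase changes, it may help to pick it from measurements rather than by ear. Below is a minimal sketch, not part of PocketSphinx: the class `ThresholdSweep`, its method `pickBest`, and the weighting parameter are hypothetical names, and the counts in `main` are placeholders to be replaced with numbers from your own test recordings (misses on recordings that contain the keyphrase, false alarms on recordings that do not).

```java
// Hypothetical helper for choosing a keyword threshold from test results.
// It assumes you have already run the recognizer once per candidate
// threshold and counted misses and false alarms yourself.
final class ThresholdSweep {

    /**
     * Returns the index of the candidate with the lowest weighted error.
     * cost = misses + falseAlarmWeight * falseAlarms
     * A weight above 1 penalizes false alarms more than misses.
     */
    static int pickBest(int[] misses, int[] falseAlarms, double falseAlarmWeight) {
        if (misses.length == 0 || misses.length != falseAlarms.length) {
            throw new IllegalArgumentException("need one miss and one false-alarm count per candidate");
        }
        int best = 0;
        double bestCost = Double.MAX_VALUE;
        for (int i = 0; i < misses.length; i++) {
            double cost = misses[i] + falseAlarmWeight * falseAlarms[i];
            if (cost < bestCost) { // ties keep the earlier candidate
                bestCost = cost;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Candidate thresholds expressed as exponents: 1e-10, 1e-20, 1e-30, 1e-40.
        int[] exponents = {-10, -20, -30, -40};
        // PLACEHOLDER counts, not real measurements: fill in from your own test set.
        int[] misses = {6, 3, 1, 0};
        int[] falseAlarms = {0, 1, 4, 12};

        int best = pickBest(misses, falseAlarms, 1.0);
        System.out.println("Selected threshold: 1e" + exponents[best]);
    }
}
```

The selected exponent would then go into `setKeywordThreshold` in the setup code from the question. Recording the test set with the actual users (here, both adults and kids) matters more than the exact cost function.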
Thanks Nickolay. A longer keyphrase indeed works better. One more issue I am seeing is that if my son (8 years old) speaks in a deep voice (he has to really make an effort), it works much better than with his normal voice. Interestingly, during my own testing, if the wake word is not triggered by my normal voice, I can reliably trigger it by saying the key phrase in a deep voice. Is there any way to change some configuration so that it works more reliably for kids' voices (sorry, not sure what the right term for it is)? Thank you again for your time!
You have to retrain the model for kids with pitch perturbation to get good results on their voices.
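To give a rough idea of what perturbing the training audio can look like, here is a minimal sketch. It is an assumption, not the procedure referred to above: it uses plain resampling with linear interpolation, which shifts pitch and tempo together, as a crude stand-in for proper pitch perturbation. The class `PitchPerturb` and the method `resample` are hypothetical names, and the input is assumed to be 16-bit mono PCM samples already loaded into an array.

```java
// Hypothetical augmentation helper: resampling-based pitch/tempo shift.
final class PitchPerturb {

    /**
     * Resamples 16-bit PCM by the given factor using linear interpolation.
     * If the result is treated as audio at the ORIGINAL sample rate,
     * factor > 1 raises the pitch and shortens the clip, and
     * factor < 1 lowers the pitch and lengthens it.
     */
    static short[] resample(short[] in, double factor) {
        if (factor <= 0) {
            throw new IllegalArgumentException("factor must be positive");
        }
        if (in.length == 0) {
            return new short[0];
        }
        int outLen = Math.max(1, (int) Math.round(in.length / factor));
        short[] out = new short[outLen];
        for (int i = 0; i < outLen; i++) {
            double pos = i * factor;                     // read position in the input
            int i0 = Math.min((int) pos, in.length - 1); // left neighbor, clamped
            int i1 = Math.min(i0 + 1, in.length - 1);    // right neighbor, clamped
            double frac = pos - i0;
            out[i] = (short) Math.round(in[i0] * (1 - frac) + in[i1] * frac);
        }
        return out;
    }
}
```

For example, calling `PitchPerturb.resample(samples, 1.2)` on adult recordings would produce higher-pitched copies that could be added to the training set. Which factors to use, and whether a dedicated pitch-shifting tool gives better results, would need to be checked against real recordings of kids.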
Thank you Nickolay!