Hello Sphinx Team,
I'm using the pocketsphinx Android demo and have problems with detection. The FAQ says that I probably haven't configured my decoder properly, but I'm using the demo configuration except for a changed .gram file.
I'm using Android Studio and tested the app on several devices ranging from Android 4.4 to 6.0 (I also listened to the raw audio files to check whether the microphone is the problem).
I'm using the keyword search and I get either a lot of false results or, if I lower the threshold, none at all, so detection seems more or less random.
Do I have to change any configuration? Or is there a resource that explains what the entries in feat.params mean?
Here's my digits.gram:
zero /1e-5/
one /1e-5/
two /1e-5/
three /1e-5/
four /1e-5/
five /1e-5/
six /1e-5/
seven /1e-5/
eight /1e-5/
nine /1e-5/
My RecognizerSetup:
The class that implements RecognitionListener:
That's how I start listening:
I've tried the same with the German VoxForge model, but that didn't work either.
Last edit: AlexM 2017-03-31
For reliable detection a keyphrase should have 3-4 syllables; your digits are very short. Source:
http://cmusphinx.sourceforge.net/wiki/tutoriallm#keyword_lists
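As an illustration of that advice, a keyword list with longer phrases might look like the sketch below. The phrases and thresholds are illustrative assumptions in the style of the linked tutorial, not values tuned for this app; each threshold usually has to be tuned per phrase on test recordings.

```text
oh mighty computer /1e-40/
hello world /1e-30/
other phrase /1e-20/
```

The file format is the same as in digits.gram above: one phrase per line, followed by its detection threshold between slashes.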
OK, so words like Binocular, Compiler, Highflying Illustrated etc. are working pretty well.
Now my question is: if I want to calculate something like 1 + 2, the individual words are too short, but listing every possible combination isn't an option either. Is it better to use, say, an n-gram search?
Sorry, I'm not sure what you mean by "calculate". I think a practical approach would be to use an activation keyphrase like "ok google", which can be detected reliably, plus a grammar search or language model search to recognize the actual action. You can combine short words in the activation keyphrase to get a longer, more reliable phrase.
Oh, I was trying to say that I want my phone to calculate something, so I say the words "One plus two is...", but the number of possible combinations I might say is infinite. So I was asking whether it's possible to use a keyphrase plus a grammar search with short (one-syllable) words, or whether that's not possible.
As I understand the demo, the keyphrase is only used to activate speech recognition, which I'm doing with a button right now.
If you want a voice calculator, you implement either a keyphrase or a button for activation. Once activated, you recognize with a language model and return the result.
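For the recognition step, a small JSGF grammar is one possible alternative to a statistical language model. The following is only a minimal sketch: the grammar name and rule names are hypothetical, numbers are spoken digit by digit, and every word is assumed to be present in the dictionary that is loaded.

```text
#JSGF V1.0;

grammar calc;

public <expression> = <number> <operator> <number>;

<operator> = plus | minus | times | divided by;

<number> = <digit>+ [ point <digit>+ ];

<digit> = zero | one | two | three | four | five | six | seven | eight | nine;
```

In the pocketsphinx Android demo, a grammar file like this is registered with addGrammarSearch and then selected with startListening once the button or keyphrase has activated recognition. Because the decoder scores whole expressions here rather than spotting isolated keywords, short one-syllable words are less of a problem than in keyword search.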
Thank you, that's what I was asking for.
Addendum: here's a use case:
I have my smartwatch with me and am carrying something, so I have no hands free. I want to calculate how much something costs, so I say to my watch: "ok google what's 5.99 times 4".
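To make the "return the result" part of this use case concrete, here is a minimal Java sketch that evaluates a recognized phrase. It assumes the recognizer returns a digit-by-digit hypothesis such as "five point nine nine times four"; the class name SpokenCalculator, the method evaluate, and the supported vocabulary are all hypothetical and not part of pocketsphinx.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: turns a recognized phrase into a numeric result. */
final class SpokenCalculator {

    // The index of each word in this list is the digit's value.
    private static final List<String> DIGITS = Arrays.asList(
            "zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine");

    private static final List<String> OPERATORS =
            Arrays.asList("plus", "minus", "times", "divided");

    private SpokenCalculator() {}

    /** Evaluates phrases of the form "number operator number". */
    static BigDecimal evaluate(String hypothesis) {
        String[] words = hypothesis.trim().toLowerCase(Locale.ROOT).split("\\s+");
        StringBuilder left = new StringBuilder();
        StringBuilder right = new StringBuilder();
        String operator = null;

        for (String word : words) {
            // Digits go to the left operand until an operator has been seen.
            StringBuilder current = (operator == null) ? left : right;
            int digit = DIGITS.indexOf(word);
            if (digit >= 0) {
                current.append(digit);
            } else if (word.equals("point")) {
                current.append('.');
            } else if (operator == null && OPERATORS.contains(word)) {
                operator = word;
            } else if (word.equals("by") && "divided".equals(operator)) {
                // Second half of "divided by"; the sketch does not check its position.
            } else {
                throw new IllegalArgumentException("Unexpected word: " + word);
            }
        }
        if (operator == null) {
            throw new IllegalArgumentException("No operator in: " + hypothesis);
        }

        // BigDecimal keeps prices such as 5.99 exact; malformed operands
        // (for example an empty one) raise NumberFormatException here.
        BigDecimal a = new BigDecimal(left.toString());
        BigDecimal b = new BigDecimal(right.toString());
        switch (operator) {
            case "plus":  return a.add(b);
            case "minus": return a.subtract(b);
            case "times": return a.multiply(b);
            default:      return a.divide(b, MathContext.DECIMAL64);
        }
    }

    public static void main(String[] args) {
        // prints 23.96
        System.out.println(evaluate("five point nine nine times four"));
    }
}
```

The activation phrase ("ok google") is assumed to have been consumed by the keyphrase search already, so only the expression itself reaches evaluate. Filler words such as "what's" would have to be stripped first or added to the accepted vocabulary.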
Thanks, I'll try that.