Hello,
After installing the Android project, it runs fine: I can say 'oh mighty computer' and the app recognizes it. Cool.
But when I change the keyphrase to simply 'hello', the onPartialResult function is called many times per second. Each time, the Hypothesis contains one more 'hello', even though nobody is saying anything.
So after 4 seconds, the Hypothesis contains "hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello".
Could you tell me what is wrong?
Regards
Alexandre
This is the code:
    /*
     * In partial result we get quick updates about current hypothesis. In
     * keyword spotting mode we can react here, in other modes we need to wait
     * for final result in onResult.
     */
    @Override
    public void onPartialResult(Hypothesis hypothesis) {
        if (hypothesis == null)
            return;
        String text = hypothesis.getHypstr();
        int prob = hypothesis.getProb();
        Log.d("TEST", "onPartialResult: " + text);
        Log.d("TEST", "onPartialResult - prob: " + prob);
        if (text.equals(KEYPHRASE)) {
            ((TextView) findViewById(R.id.result_text)).setText("partial:" + "bingo:" + text);
            recognizer.stop();
        }
    }

    /**
     * This callback is called when we stop the recognizer.
     */
    @Override
    public void onResult(Hypothesis hypothesis) {
        ((TextView) findViewById(R.id.result_text)).setText("!!");
        if (hypothesis != null) {
            String text = hypothesis.getHypstr();
            ((TextView) findViewById(R.id.result_text)).setText(text + "!!");
            Log.d("TEST", "onResult:" + text);
            makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
            switchSearch(KWS_SEARCH);
        }
    }

    @Override
    public void onBeginningOfSpeech() {
        Log.d("TEST", "onBeginningOfSpeech");
    }

    /**
     * We stop recognizer here to get a final result
     */
    @Override
    public void onEndOfSpeech() {
        Log.d("TEST", "onEndOfSpeech");
        Log.d("TEST", "onEndOfSpeech:" + recognizer.getSearchName());
        if (!recognizer.getSearchName().equals(KWS_SEARCH))
            switchSearch(KWS_SEARCH);
    }

    private void switchSearch(String searchName) {
        recognizer.stop();
        // If we are not spotting, start listening with timeout (10000 ms or 10 seconds).
        if (searchName.equals(KWS_SEARCH))
            recognizer.startListening(searchName);
        else
            recognizer.startListening(searchName, 10000);
        String caption = getResources().getString(captions.get(searchName));
        ((TextView) findViewById(R.id.caption_text)).setText(caption);
    }

    private void setupRecognizer(File assetsDir) throws IOException {
        // The recognizer can be configured to perform multiple searches
        // of different kind and switch between them
        recognizer = defaultSetup()
                .setAcousticModel(new File(assetsDir, "en-us-ptm"))
                .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
                // To disable logging of raw audio comment out this call (takes a lot of space on the device)
                .setRawLogDir(assetsDir)
                // Threshold to tune for keyphrase to balance between false alarms and misses
                .setKeywordThreshold(1e-17f) /*45*/
                // Use context-independent phonetic search, context-dependent is too slow for mobile
                .setBoolean("-allphone_ci", true) /*true*/
                .getRecognizer();
Last edit: Nickolay V. Shmyrev 2015-04-09
Code from Alexandre's post about setting up the SpeechRecognizer again after each threshold change:

    // setupTaskRecognizer
    public void setupTaskRecognizer() {
        enableControls(false);
        // Caption text means "Initializing recognition"
        ((TextView) view.findViewById(R.id.caption_text)).setText("Initialisation de la reconnaissance");
        new AsyncTask<Void, Void, Exception>() {
            @Override
            protected Exception doInBackground(Void... params) {
                try {
                    // A new recognizer is built for the current word on every call
                    setupRecognizer(assetDir, current_word);
                } catch (IOException e) {
                    return e;
                }
                return null;
            }

            @Override
            protected void onPostExecute(Exception result) {
                if (result != null) {
                    ((TextView) view.findViewById(R.id.caption_text)).setText("Failed to init recognizer " + result);
                } else {
                    // Caption text means "Recognition initialization finished"
                    ((TextView) view.findViewById(R.id.caption_text)).setText("Initialisation de la reconnaissance terminée");
                    switchSearch(KEY_SEARCH);
                    enableControls(true);
                }
            }
        }.execute();
    }
You can set the keyword threshold to recognize the word more reliably. A value like 1e-1 might be reasonable.
Overall, "hello" is too short; you will not get a reliable activation with it.
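While tuning the threshold, it may help to see how many detections have piled up in a partial result. The handler posted earlier compares the hypothesis with `text.equals(KEYPHRASE)`, which cannot match a string such as "hello hello hello". Below is a minimal sketch in plain Java, assuming the hypothesis string is simply the detected keyphrases separated by spaces, as described in the first post. `KeyphraseCounter` and `countOccurrences` are hypothetical names and not part of pocketsphinx.

```java
/** Hypothetical helper, not part of pocketsphinx: counts keyphrase hits in a hypothesis string. */
final class KeyphraseCounter {

    /** Returns how many non-overlapping times the keyphrase occurs, word by word, in the text. */
    static int countOccurrences(String hypothesisText, String keyphrase) {
        if (hypothesisText == null || keyphrase == null || keyphrase.trim().isEmpty()) {
            return 0;
        }
        String[] words = hypothesisText.trim().split("\\s+");
        String[] key = keyphrase.trim().split("\\s+");
        int count = 0;
        int i = 0;
        while (i + key.length <= words.length) {
            boolean match = true;
            for (int j = 0; j < key.length; j++) {
                if (!words[i + j].equals(key[j])) {
                    match = false;
                    break;
                }
            }
            if (match) {
                count++;
                i += key.length; // skip past the matched phrase
            } else {
                i++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("hello hello hello", "hello")); // prints 3
    }
}
```

In onPartialResult, a check like `countOccurrences(text, KEYPHRASE) > 0` could replace the equality test, and logging the count gives a rough feel for how often the current threshold fires.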
Thanks Nickolay
I need to build an Android app for French kids. They will speak single words, not phrases...
So do you think CMU Sphinx is not appropriate?
I've played with the threshold value and it avoids plenty of onPartialResult calls, that's great.
But with some words like 'rabbit', I can't manage to find a "good" threshold value.
Either it doesn't detect the word at all, or it detects it all the time, without anyone speaking...
One more question: in the app, an exercise will have 3 or 4 words.
I believe that I'll need to set a threshold value for each word.
So is it possible to set the threshold value when I call "recognizer.startListening"?
Thanks again
Alexandre
Maybe your French 'r' is not good enough. A good acoustic model should fix this issue.
You can configure the recognizer to look for several keyphrases and specify a threshold for each phrase separately. See
http://stackoverflow.com/questions/25748113/recognizing-multiple-keywords-using-pocketsphinx
Nickolay
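For readers who want to try the per-phrase thresholds, here is a rough sketch of building a keyword list: one keyphrase per line, followed by its threshold between slashes. The words and threshold values are placeholders, `KeywordListWriter` is a hypothetical helper, and the file format is given as I understand it from the pocketsphinx keyword spotting documentation, so please check it against the linked answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: renders keyphrases and thresholds as keyword-list text. */
final class KeywordListWriter {

    /** Builds one "phrase /threshold/" line per entry, in insertion order. */
    static String buildKeywordList(Map<String, String> thresholdsByPhrase) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : thresholdsByPhrase.entrySet()) {
            // Fail early (NumberFormatException) if the threshold is not a number
            Double.parseDouble(entry.getValue());
            sb.append(entry.getKey().trim())
              .append(" /")
              .append(entry.getValue())
              .append("/\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> words = new LinkedHashMap<>();
        words.put("rabbit", "1e-20"); // placeholder threshold, to be tuned
        words.put("hello", "1e-1");   // value suggested earlier in this thread
        System.out.print(buildKeywordList(words));
    }
}
```

The resulting text would be saved to a file and registered as a keyword search (in pocketsphinx-android, I believe this is `recognizer.addKeywordSearch(name, file)`), instead of adding a single keyphrase.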
If you have some time, I've got several questions (again...)
1 - In order to find a good threshold for each word, is it possible to update the threshold value without initializing the recognizer each time?
If yes, with which object/method can I achieve this?
2 - What do you mean by "good acoustic model"? Is it possible to make my own dictionary with a new voice, and only a hundred words?
3 - If the answer to 2 is yes, is it a big deal? Could you point me to a doc?
The goal of all of this is an application for French kids, to teach them some English words.
Thanks again
Regards
Alexandre
1 - It is not possible now.
2 - A good acoustic model recognizes the sounds you need accurately.
It is possible to train acoustic models, but you need a lot of data for training. For isolated words you need about 100 examples of each word you want to train.
For children you certainly need to train, because our models are for adults.
3 - See http://cmusphinx.sourceforge.net/wiki/tutorialam
Hi Nickolay,
thanks for your time.
Now I am trying to give the teachers the ability to set the threshold themselves for each word in a lesson.
They have a plus and a minus button which adjust the threshold.
When they validate it, I set up the SpeechRecognizer again.
The problem is that each time I set up the SpeechRecognizer object again, the recognizer gets slower and slower, and after 1 or 2 setups it recognizes nothing.
I believe that I'm not deleting the object in the right way.
This is a piece of code, and I hope you can help me with it.
Thanks a lot
Regards
Alexandre
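For anyone building similar plus and minus buttons, one possible sketch is to step the threshold by powers of ten and clamp it to a range. The bounds 1e-50 to 1e-1 are my assumption of a commonly suggested range, not something stated in this thread. It is also an assumption, based on the 1e-1 advice for "hello" above, that values closer to 1 give fewer false detections. `ThresholdStepper` is a hypothetical class.

```java
/** Hypothetical helper: steps a keyword threshold of the form 1eN by one decade per click. */
final class ThresholdStepper {
    // Assumed bounds; adjust to whatever range works for your words
    static final int MIN_EXPONENT = -50;
    static final int MAX_EXPONENT = -1;

    private int exponent;

    ThresholdStepper(int initialExponent) {
        this.exponent = clamp(initialExponent);
    }

    private static int clamp(int e) {
        return Math.max(MIN_EXPONENT, Math.min(MAX_EXPONENT, e));
    }

    /** Moves the threshold one decade toward 1 (assumed: fewer false detections). */
    void stricter() {
        exponent = clamp(exponent + 1);
    }

    /** Moves the threshold one decade toward 0 (assumed: fewer misses). */
    void looser() {
        exponent = clamp(exponent - 1);
    }

    /** Text form, for example "1e-17". */
    String asString() {
        return "1e" + exponent;
    }

    double asDouble() {
        return Double.parseDouble(asString());
    }

    public static void main(String[] args) {
        ThresholdStepper stepper = new ThresholdStepper(-17);
        stepper.stricter();
        System.out.println(stepper.asString()); // prints 1e-16
    }
}
```

Keeping the exponent as an int avoids rounding drift from repeated multiplication, and the text form can be stored per word for each lesson.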
It looks like an object leak; however, it is not easy to figure out why it is leaked. Logcat output might be helpful.
Actually, you do not need to recreate the recognizer every time to change the threshold. You can change the threshold with something like:
and then just re-add the search to the decoder; it will have the new threshold.
Last edit: Nickolay V. Shmyrev 2015-04-20
Hi Nickolay
Thanks, it works like a charm!
I was recreating the recognizer because last week you told me that it is not possible to update the threshold value without initializing the recognizer each time.
Anyway, thanks again for your help.
Now the teachers can set the threshold for each word they want the children to learn.
And if that is not enough, I believe I will build an acoustic model with data from children's voices.
Regards
Alexandre