I tried to get the Android demo application up and running. onBeginningOfSpeech and onEndOfSpeech are being called regularly, but never onResult or onPartialResult. As a result I am stuck on the screen where I have to say "oh mighty computer". What might be the reason for this?
The demo creates raw files on the SD card, in the folder pointed to by setRawLogDir in the sources. Share them.
here we go
The demo waits for "oh mighty computer" to activate decoding.
I don't hear you say "oh mighty computer" in the raw file. Start the application, say "oh mighty computer" multiple times to activate the decoding. Share the raw files again.
I said it quite a few times, again and again. Now I got it to recognize, at least sometimes. The recognition afterwards is as bad as the recognition of "oh mighty computer". Can the system be trained on Android? I only need it to recognize a few phrases, but reliably. I am fine with training it to recognize my accent.
PS: Can I analyze the raw file myself to see what is being said?
You need to share raw files in order to get help on this issue
Yes, things could be significantly improved
Yes, they are simply raw audio files. You can open them in Audacity or in Wavesurfer.
How can I do this?
You need to share raw files
https://anonfiles.com/file/74530c8ff2fa278aa5a80df9f5dd7a38
In these files you should find about 10 instances of "oh mighty computer"; once it recognized it and I was able to say "digits" and 1, 2, then it switched back to "oh mighty computer".
But how shall that help with terms I need like "item 1" or "paint" or whatever? Especially in an app you have different users, so each user should be able to say these words once at startup for better recognition.
Well, we can't guarantee you good accuracy for an arbitrary accent yet. To improve "oh mighty computer" recognition you need to make the following changes in the dictionary:

oh AO UW
computer K AO M P Y UW T AO R

Also make sure that kwsThreshold in the recognizer setup method in the sources is set to 1e-40, not 1e-20.
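For reference, a minimal sketch of where that threshold sits in the demo's setup chain (acoustic model and dictionary names are the ones from the code posted later in this thread and may differ in your copy):

import java.io.File;
import java.io.IOException;
import edu.cmu.pocketsphinx.SpeechRecognizer;
import edu.cmu.pocketsphinx.SpeechRecognizerSetup;

// assetsDir is the directory returned by Assets.syncAssets() in the demo
private SpeechRecognizer setupRecognizer(File assetsDir) throws IOException {
    return SpeechRecognizerSetup.defaultSetup()
            .setAcousticModel(new File(assetsDir, "en-us-ptm"))
            .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
            .setKeywordThreshold(1e-40f) // 1e-40 instead of 1e-20: fewer missed detections, more false alarms
            .getRecognizer();
}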
I have downloaded the demo application for Android from https://github.com/cmusphinx/pocketsphinx-android-demo.
I also encountered the same problems that are mentioned before:
https://sourceforge.net/p/cmusphinx/discussion/help/thread/3644b282/#34dc
https://sourceforge.net/p/cmusphinx/discussion/help/thread/3644b282/#2b49
And I also want to recognize my own words (continuous speech).
https://sourceforge.net/p/cmusphinx/discussion/help/thread/3644b282/#ec5f
Could anyone help me??
Sure, as soon as you describe the problem you have in more details.
1. I said "Oh mighty computer" several times, again and again.
But it couldn't recognize the words.
2. Raw files are not created in the destination folder.
Please check my log details.
06-01 20:23:55.571 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: dict.c(213): Dictionary size 133425, allocated 0 KiB for strings, 0 KiB for phones
06-01 20:23:55.571 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: dict.c(361): 5 words read
06-01 20:23:55.571 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: dict2pid.c(396): Building PID tables for dictionary
06-01 20:23:55.571 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
06-01 20:23:55.621 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: dict2pid.c(132): Allocated 21336 bytes (20 KiB) for word-final triphones
06-01 20:23:55.631 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: dict2pid.c(196): Allocated 21336 bytes (20 KiB) for single-phone word triphones
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: kws_search.c(406): KWS(beam: -1080, plp: -23, default threshold -450, delay 10)
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/SpeechRecognizer: Load JSGF /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/menu.gram
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: jsgf.c(706): Defined rule: PUBLIC <menu.item>
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(208): Computing transitive closure for null transitions
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(270): 0 null transitions added
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_search.c(227): FSG(beam: -1080, pbeam: -1080, wbeam: -634; wip: -26, pip: 0)
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(423): Adding silence transitions for <sil> to FSG
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(443): Added 2 silence word transitions
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(423): Adding silence transitions for <sil> to FSG
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(443): Added 2 silence word transitions
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(423): Adding silence transitions for [NOISE] to FSG
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(443): Added 2 silence word transitions
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_search.c(173): Added 1 alternate word transitions
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_lextree.c(110): Allocated 172 bytes (0 KiB) for left and right context phones
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_lextree.c(256): 25 HMM nodes in lextree (8 leaves)
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_lextree.c(259): Allocated 3000 bytes (2 KiB) for all lextree nodes
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_lextree.c(262): Allocated 960 bytes (0 KiB) for lextree leafnodes
06-01 20:23:55.681 12924-12975/edu.cmu.sphinx.pocketsphinx I/SpeechRecognizer: Load JSGF /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/digits.gram
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: jsgf.c(706): Defined rule: <digits.digit>
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: jsgf.c(706): Defined rule: <digits.g00001>
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: jsgf.c(706): Defined rule: PUBLIC <digits.digits>
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: jsgf.c(365): Right recursion <digits.g00001> 2 => 0
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(208): Computing transitive closure for null transitions
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(270): 0 null transitions added
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_search.c(227): FSG(beam: -1080, pbeam: -1080, wbeam: -634; wip: -26, pip: 0)
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(423): Adding silence transitions for <sil> to FSG
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(443): Added 3 silence word transitions
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(423): Adding silence transitions for <sil> to FSG
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(443): Added 3 silence word transitions
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(423): Adding silence transitions for [NOISE] to FSG
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_model.c(443): Added 3 silence word transitions
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_search.c(173): Added 4 alternate word transitions
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_lextree.c(110): Allocated 258 bytes (0 KiB) for left and right context phones
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_lextree.c(256): 191 HMM nodes in lextree (163 leaves)
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_lextree.c(259): Allocated 22920 bytes (22 KiB) for all lextree nodes
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: fsg_lextree.c(262): Allocated 19560 bytes (19 KiB) for lextree leafnodes
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/SpeechRecognizer: Load N-gram model /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/weather.dmp
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(365): Header doesn't match
06-01 20:23:55.691 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
06-01 20:23:55.701 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(70): No \data\ mark in LM file
06-01 20:23:55.701 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(445): Trying to read LM in dmp format
06-01 20:23:55.701 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(527): ngrams 1=779, 2=14416, 3=46976
06-01 20:23:56.022 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: lm_trie.c(474): Training quantizer
06-01 20:23:56.202 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: lm_trie.c(482): Building LM trie
06-01 20:24:01.618 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_search_fwdtree.c(74): Initializing search tree
06-01 20:24:01.628 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_search_fwdtree.c(101): 788 unique initial diphones
06-01 20:24:01.628 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_search_fwdtree.c(186): Creating search channels
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 2220
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_search_fwdtree.c(333): Created 256 root, 2092 non-root channels, 9 single-phone words
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(365): Header doesn't match
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(70): No \data\ mark in LM file
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(445): Trying to read LM in dmp format
06-01 20:24:01.668 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: ngram_model_trie.c(527): ngrams 1=43, 2=1509, 3=21837
06-01 20:24:01.708 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: lm_trie.c(474): Training quantizer
06-01 20:24:01.748 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: lm_trie.c(482): Building LM trie
06-01 20:24:01.778 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: allphone_search.c(236): Building PHMM net of 42 phones
06-01 20:24:01.778 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: allphone_search.c(309): 42 nodes, 1764 links
06-01 20:24:01.778 12924-12975/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: allphone_search.c(603): Allphone(beam: -1080, pbeam: -1080)
06-01 20:24:01.778 12924-12924/edu.cmu.sphinx.pocketsphinx I/SpeechRecognizer: Start recognition "wakeup"
06-01 20:24:01.888 12924-13478/edu.cmu.sphinx.pocketsphinx D/SpeechRecognizer: Starting decoding
06-01 20:24:01.888 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: pocketsphinx.c(986): Writing raw audio file: /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/000000000.raw
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: INFO: cmn_live.c(88): Update from <
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 40.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 10.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 10.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: 0.00
06-01 20:24:15.492 12924-13478/edu.cmu.sphinx.pocketsphinx I/cmusphinx: >
Raw files must be there, your log says
Writing raw audio file: /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/000000000.raw
You need to share them. Also, you need to try running the system for a longer time, not just for 5 seconds.
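If you want to check programmatically whether the dumps were written, here is a small hypothetical helper (the directory is whatever was passed to setRawLogDir; the demo records 16 kHz, 16-bit mono PCM by default, so 32 bytes correspond to roughly 1 ms of audio):

import java.io.File;

// Hypothetical helper: list the .raw dumps PocketSphinx wrote into the raw-log directory.
static void listRawDumps(File rawLogDir) {
    File[] files = rawLogDir.listFiles();
    if (files == null) {
        System.out.println("Directory does not exist: " + rawLogDir);
        return;
    }
    for (File f : files) {
        if (f.getName().endsWith(".raw")) {
            // 16 kHz * 16 bit * mono = 32 bytes per millisecond
            System.out.println(f.getName() + ": about " + (f.length() / 32) + " ms of audio");
        }
    }
}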
Sometimes it recognizes it, but most of the time it doesn't.
You simply say "computer"; it expects the keyphrase "oh mighty computer" without pauses. And the pronunciation should be closer to US English.
I wrote code to identify the "hello" keyword to initiate voice-to-text; after that it should convert voice to text continuously. But it only identifies "hello".
My purpose is to do the voice-to-text process continuously.
Could you please help me?
Please check my code below.
package edu.cmu.pocketsphinx.demo;
import android.Manifest;
import android.app.Activity;
import android.content.pm.PackageManager;
import android.os.AsyncTask;
import android.os.Bundle;
import android.support.v4.app.ActivityCompat;
import android.support.v4.content.ContextCompat;
import android.util.Log;
import android.widget.TextView;
import android.widget.Toast;
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import edu.cmu.pocketsphinx.Assets;
import edu.cmu.pocketsphinx.Hypothesis;
import edu.cmu.pocketsphinx.RecognitionListener;
import edu.cmu.pocketsphinx.SpeechRecognizer;
import edu.cmu.pocketsphinx.SpeechRecognizerSetup;
import static android.widget.Toast.makeText;
public class MainActivity extends Activity implements
        RecognitionListener {

    /* Named searches allow to quickly reconfigure the decoder */
    private static final String KWS_SEARCH = "wakeup";
    private static final String KEYPHRASE = "hello";
    private static final String FORECAST_SEARCH = "forecast";
    private static final String DIGITS_SEARCH = "digits";
    private static final String PHONE_SEARCH = "phones";
    private static final String MENU_SEARCH = "menu";

    /* Used to handle permission request */
    private static final int PERMISSIONS_REQUEST_RECORD_AUDIO = 1;

    private SpeechRecognizer recognizer;
    private HashMap<String, Integer> captions;
    private File appDir;

    @Override
    public void onCreate(Bundle state) {
        super.onCreate(state);
        setContentView(R.layout.main);
        try {
            Log.d("Tag", "before trying to sync assets");
            Assets assets = new Assets(MainActivity.this);
            appDir = assets.syncAssets();
        } catch (IOException e) {
            throw new RuntimeException("failed to synchronize assets", e);
        }
        try {
            Log.d("TAG", "before recognizer instantiation");
            recognizer = SpeechRecognizerSetup.defaultSetup()
                    .setAcousticModel(new File(appDir, "en-us-ptm"))
                    .setDictionary(new File(appDir, "cmudict-en-us.dict"))
                    // To disable logging of raw audio comment out this call (takes a lot of space on the device)
                    .setRawLogDir(appDir)
                    .setKeywordThreshold(1e-40f)
                    .getRecognizer();
            recognizer.addListener(this);
            recognizer.addKeyphraseSearch(KWS_SEARCH, KEYPHRASE);
            recognizer.startListening(KWS_SEARCH);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public void onPartialResult(Hypothesis hyp) {
        if (hyp == null)
            return;
        // Restart the recognition if keyword is found
        String text = hyp.getHypstr();
        Log.d("Spoken text", text);
        ((TextView) findViewById(R.id.result_text)).setText(text);
        recognizer.cancel();
        recognizer.startListening(KWS_SEARCH);
    }

    @Override
    public void onDestroy() {
        super.onDestroy();
        if (recognizer != null) {
            recognizer.cancel();
            recognizer.shutdown();
        }
    }

    @Override
    public void onResult(Hypothesis hypothesis) {
        ((TextView) findViewById(R.id.result_text)).setText("");
        if (hypothesis != null) {
            String text = hypothesis.getHypstr();
            makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
        }
    }

    @Override
    public void onBeginningOfSpeech() {
    }

    /** We stop recognizer here to get a final result */
    @Override
    public void onEndOfSpeech() {
        if (!recognizer.getSearchName().equals(KWS_SEARCH))
            switchSearch(KWS_SEARCH);
    }

    private void switchSearch(String searchName) {
        recognizer.stop();
        // If we are not spotting, start listening with timeout (10000 ms or 10 seconds).
        if (searchName.equals(KWS_SEARCH))
            recognizer.startListening(searchName);
        else
            recognizer.startListening(searchName, 10000);
        String caption = getResources().getString(captions.get(searchName));
        ((TextView) findViewById(R.id.caption_text)).setText(caption);
    }

    @Override
    public void onError(Exception error) {
        ((TextView) findViewById(R.id.caption_text)).setText(error.getMessage());
    }

    @Override
    public void onTimeout() {
        switchSearch(KWS_SEARCH);
    }
}
It is not possible
https://stackoverflow.com/questions/25949295/cmusphinx-pocketsphinx-recognize-all-or-large-amount-of-words
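Recognizing an unrestricted, large vocabulary on the device is indeed not realistic, but switching from the keyphrase search to a small grammar or domain language model once the wake word is heard does work. Below is a sketch of that pattern as a drop-in replacement for onPartialResult in the class above; the search name CONTINUOUS_SEARCH is hypothetical and must be registered during setup, for example with recognizer.addNgramSearch(CONTINUOUS_SEARCH, new File(appDir, "weather.dmp")).

// Hypothetical extra search, registered in onCreate next to the keyphrase search.
private static final String CONTINUOUS_SEARCH = "continuous";

@Override
public void onPartialResult(Hypothesis hyp) {
    if (hyp == null)
        return;
    String text = hyp.getHypstr();
    if (text.equals(KEYPHRASE)) {
        // Keyphrase spotted: stop spotting and start the restricted recognition search
        recognizer.stop();
        recognizer.startListening(CONTINUOUS_SEARCH, 10000);
    } else {
        // Show intermediate hypotheses from the restricted search
        ((TextView) findViewById(R.id.result_text)).setText(text);
    }
}

onEndOfSpeech in the posted class is already written to switch back to KWS_SEARCH once a non-keyword search finishes, so spotting resumes automatically (note that its switchSearch reads the captions map, which the posted code never initializes).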
Hello Sir, you have done a great job.
I have tried the demo and it's working fine, but I have some doubts; it would be helpful if you could resolve them.
1) When I tried my keyphrase "help", it detects the word "help" without me saying a single word. Even though I have made my cabin noise-free, it still detects "help" without anything being said. I don't understand how to resolve this.
2) Do I need to add something to the dictionary to recognize my word "help"?
3) What is the relationship between a keyphrase and a keyword?
This keyphrase is too short for reliable detection. The tutorial recommends using a keyword of 3-5 syllables. You need to read the tutorial first:
http://cmusphinx.github.io/wiki/tutoriallm
A keyphrase consists of several words, like "oh mighty computer".
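For completeness, a short sketch of registering a longer, multi-word keyphrase (every word of the phrase must be present in the dictionary):

import edu.cmu.pocketsphinx.SpeechRecognizer;

// Register a multi-word keyphrase for wake-up spotting and start listening for it.
private void addWakeupSearch(SpeechRecognizer recognizer) {
    final String kwsSearch = "wakeup";
    final String keyphrase = "oh mighty computer"; // several syllables, as the tutorial recommends
    recognizer.addKeyphraseSearch(kwsSearch, keyphrase);
    recognizer.startListening(kwsSearch);
}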
Hi,
thanks for the solution.
I am playing music in the background, and if I say "Oh Mighty Computer" at the same time it doesn't recognize it. What should I do in that case? My requirement is that music plays continuously in the background, and meanwhile if I say "Oh Mighty Computer" it should be detected. How do I remove the noise from the system? The music is played on the same device through which we are listening.
Do you have a solution for this?
Pocketsphinx does not support such a thing right now. That would be a problem worthy of a PhD dissertation.