CMU Sphinx / Forums / Help: Is it the right way to implement Pocketsphinx voice control over Android app?

Zbigniew - 2015-09-07

Hello,

I implemented a simple keyword search for Android using Pocketsphinx, but my results are not satisfactory. Please advise me whether I did it the right way and what I can improve.

The goal is to recognize keywords spoken by the user to the phone.

My list of words looks like this (file "phrases"):

forced error /1e-38/
second /1e-10/
double fault /1e-20/
error /1e+3/
winner /1e-12/
player one /1e-28/
etc.

This is the way I set up my recognizer:

    private void setupRecognizer(File assetsDir) throws IOException {
        recognizer = defaultSetup()
                .setAcousticModel(new File(assetsDir, "en-us-ptm"))
                .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
                .setRawLogDir(assetsDir)
                .setBoolean("-allphone_ci", true)
                .getRecognizer();
        recognizer.addListener(this);

        File phrases = new File(assetsDir, "phrases");
        recognizer.addKeywordSearch(PHRASES_SEARCH, phrases);
    }

Here I collect the result:

    @Override
    public void onPartialResult(Hypothesis hypothesis) {
        if (hypothesis != null) {
            String text = hypothesis.getHypstr();
            Log.d(TAG, "onPartialResult: " + text);
            makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
            highlightPhrase(text);
        }
    }

And I restart the recognizer at the end of speech:

    @Override
    public void onEndOfSpeech() {
        recognizer.stop();
        recognizer.startListening(PHRASES_SEARCH);
    }

I adjusted the words' thresholds and it is better than before, but I still have some problems:

Sometimes words are triggered in silence.

On the other hand, sometimes I cannot trigger a word (it may be my English, which is far from native, though Google Translate does not do any better).

Sometimes hypothesis.getHypstr() returns a string consisting of multiple keywords, while I would prefer it to always return a single one.

The recognizer is vulnerable to environmental distractions.

It feels like I have little control over what the recognizer hears: thresholds adjusted in a silent environment cause the app to go crazy when it is used on a bus, for example. Is it a good idea to use Pocketsphinx to control an Android app with voice? If so, how do I implement it properly?

Please share your experience, I will be very grateful.
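
One way to deal with a hypothesis string that contains several keywords is to reduce it to a single known phrase before acting on it. The following is a minimal sketch of that idea in plain Java. The class KeywordPicker and its method pickSingle are hypothetical names, not part of pocketsphinx-android, and preferring the longest matching phrase (so that "forced error" wins over "error") is an assumed policy that may not suit every app.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper, not part of pocketsphinx-android. */
final class KeywordPicker {
    private final List<String> phrases;

    KeywordPicker(List<String> knownPhrases) {
        // Longest phrases first, so "forced error" is preferred over "error".
        this.phrases = new ArrayList<>(knownPhrases);
        this.phrases.sort(Comparator.comparingInt(String::length).reversed());
    }

    /** Returns one known phrase contained in the hypothesis, or null if none matches. */
    String pickSingle(String hypothesis) {
        if (hypothesis == null) {
            return null;
        }
        // Normalize whitespace and pad with spaces so matches fall on word boundaries.
        String padded = " " + hypothesis.trim().replaceAll("\\s+", " ") + " ";
        for (String phrase : phrases) {
            if (padded.contains(" " + phrase + " ")) {
                return phrase;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        KeywordPicker picker = new KeywordPicker(Arrays.asList(
                "forced error", "second", "double fault", "error", "winner", "player one"));
        System.out.println(picker.pickSingle("winner forced error")); // prints "forced error"
    }
}
```

In onPartialResult, the value of hypothesis.getHypstr() could be passed through pickSingle before calling highlightPhrase, ignoring the result when it is null.
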
- Nickolay V. Shmyrev - 2015-09-07
  
  From what you describe, it looks like the thresholds are too high; you need to lower them.
  

Zbigniew - 2015-09-08

Nickolay, thanks for your answer.

However, if the thresholds are too low, too many words pop up in silence, and if the thresholds are high, I have trouble triggering the words, even if I speak them in different ways. Are there any .wav samples from the en-us-ptm acoustic model that I could play to my app for testing?
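
When many thresholds are being adjusted by hand, it may help to load the keyword list programmatically and flag entries that look out of line. The sketch below parses lines in the "phrase /threshold/" format quoted earlier in this thread. The class KeywordList and its methods are hypothetical, and the bounds passed to outsideRange in main are an assumption made for illustration, not values confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for inspecting a keyword list; not part of Pocketsphinx. */
final class KeywordList {
    // Matches "<phrase> /<threshold>/", the format of the "phrases" file in this thread.
    private static final Pattern LINE = Pattern.compile("^(.+?)\\s*/([^/]+)/\\s*$");

    /** Parses keyword lines into phrase -> threshold, keeping file order. */
    static Map<String, Double> parse(Iterable<String> lines) {
        Map<String, Double> thresholds = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            Matcher m = LINE.matcher(trimmed);
            if (!m.matches()) {
                throw new IllegalArgumentException("Malformed keyword line: " + line);
            }
            thresholds.put(m.group(1), Double.parseDouble(m.group(2)));
        }
        return thresholds;
    }

    /** Lists phrases whose threshold is below lo or above hi (bounds chosen by the caller). */
    static List<String> outsideRange(Map<String, Double> thresholds, double lo, double hi) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, Double> e : thresholds.entrySet()) {
            if (e.getValue() < lo || e.getValue() > hi) {
                flagged.add(e.getKey());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        Map<String, Double> thresholds = parse(Arrays.asList(
                "forced error /1e-38/", "second /1e-10/", "error /1e+3/"));
        // Assumed bounds for illustration only.
        System.out.println(outsideRange(thresholds, 1e-50, 1.0));
    }
}
```
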

Last edit: Zbigniew 2015-09-08

- Nickolay V. Shmyrev - 2015-09-08
  
  Hi Zbigniew
  
  Well, there could be multiple issues. For example, if there are too many words, the system could become too slow to process the data, and that will destroy the accuracy.
  
  I suggest you enable raw data collection (the -rawlogdir option) and share the raw files so I can look at what is going on there.
  
  If you want to test pocketsphinx keyword spotting, you can download an English book chapter from LibriVox and spot a few keywords there with pocketsphinx_continuous.
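
When testing against a recording like this, it can help to count hits, misses, and false alarms for each threshold setting, so that two settings can be compared by numbers rather than by impression. The following is a rough sketch under stated assumptions: the class SpottingTally is hypothetical, the list of expected keywords is assumed to be transcribed by hand, and the comparison is by per-keyword counts only, ignoring when each detection happened.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical count-based scoring of keyword detections; ignores timing. */
final class SpottingTally {
    final int hits;
    final int misses;
    final int falseAlarms;

    private SpottingTally(int hits, int misses, int falseAlarms) {
        this.hits = hits;
        this.misses = misses;
        this.falseAlarms = falseAlarms;
    }

    private static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    /** Compares how often each keyword was expected with how often it was detected. */
    static SpottingTally compare(List<String> expected, List<String> detected) {
        Map<String, Integer> exp = count(expected);
        Map<String, Integer> det = count(detected);
        Set<String> all = new HashSet<>(exp.keySet());
        all.addAll(det.keySet());

        int hits = 0, misses = 0, falseAlarms = 0;
        for (String keyword : all) {
            int e = exp.getOrDefault(keyword, 0);
            int d = det.getOrDefault(keyword, 0);
            int matched = Math.min(e, d);
            hits += matched;
            misses += e - matched;      // expected but not detected
            falseAlarms += d - matched; // detected but not expected
        }
        return new SpottingTally(hits, misses, falseAlarms);
    }

    public static void main(String[] args) {
        SpottingTally t = compare(
                Arrays.asList("winner", "second", "winner"),
                Arrays.asList("winner", "error", "second", "second"));
        System.out.println("hits=" + t.hits + " misses=" + t.misses
                + " falseAlarms=" + t.falseAlarms);
    }
}
```

Running the same recording with two different keyword lists and comparing the tallies gives a simple view of the trade-off described above: fewer false alarms usually come at the cost of more misses.
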
  
  - Nickolay V. Shmyrev - 2015-09-08
    
    And, obviously, our model is US English, so it might have trouble with accented speech. Maybe you can try to speak closer to US English.
    
