Android app with pocketsphinx

mesh
2016-06-01
2016-06-14
  • mesh

    mesh - 2016-06-01

    Hi,

    I'm starting a small Android app in which I would like to use speech recognition.
    My use case for pocketsphinx is detecting several simple voice commands that direct the flow of the app.
    Which model of recognition do you suggest I use?
    Is it possible to define more than one keyword?
    I would like to find a mode where pocketsphinx fires an event as soon as a keyword or phrase is detected. Is that possible?
    I managed to get a modified demo app running. It listens for the words I listed in the file instead of digits, but only after the key phrase has been detected and has triggered this kind of recognition. The problem is that the event does not fire as soon as a word is spoken; it fires only once silence is detected or the timeout is reached.

    Is it possible to continuously monitor for a list of words or phrases and fire an event as soon as one of them is recognized, without waiting for silence?

    Thanks in advance for any pointers I get.

     
    • Nickolay V. Shmyrev

      Which model of recognition do you suggest I use?

      You can use the default one.

      Is it possible to define more than one keyword? I would like to find a mode where pocketsphinx fires an event as soon as a keyword or phrase is detected. Is that possible?

      Yes, sure

      http://stackoverflow.com/questions/25748113/recognizing-multiple-keywords-using-pocketsphinx
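
A minimal sketch of the "fire an event as soon as a phrase is detected" idea, assuming the recognizer runs a keyword search and reports each detection through the listener's partial-result callback (`onPartialResult` in pocketsphinx-android's `RecognitionListener`) instead of waiting for the end of speech. `CommandDispatcher`, `register`, and `onHypothesis` are hypothetical names introduced for illustration; they are not part of pocketsphinx. The class is plain Java so the dispatch logic can be tried without a device.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: maps a detected keyphrase to an app action. */
final class CommandDispatcher {
    private final Map<String, Runnable> actions = new HashMap<>();

    /** Registers an action for a phrase; matching is case-insensitive. */
    void register(String phrase, Runnable action) {
        actions.put(normalize(phrase), action);
    }

    /**
     * Intended to be called with the hypothesis text from the partial-result
     * callback (assumption: hypothesis.getHypstr() in pocketsphinx-android).
     * Returns the normalized phrase that fired, or null if nothing matched.
     */
    String onHypothesis(String hypothesisText) {
        if (hypothesisText == null) {
            return null;
        }
        String phrase = normalize(hypothesisText);
        Runnable action = actions.get(phrase);
        if (action == null) {
            return null;
        }
        action.run();
        // Assumption: after a detection the caller stops and restarts the
        // search, as the demo app does when it switches searches, so the
        // same hypothesis is not reported again.
        return phrase;
    }

    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT);
    }
}
```

In an app, the listener's partial-result callback would pass the hypothesis string to `onHypothesis` and, when it returns a non-null phrase, restart listening on the same keyword search.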

       
  • mesh

    mesh - 2016-06-12

    Hi,

    So, I have tried to implement what was said in the Stack Overflow thread and managed to do it, but the application constantly triggers recognition of the word "up".

    This is what I did:
    - shortened the dict file to contain only 4 words (for testing): up, down, left, right
    - created a key-words.txt with this content: pastie_link
    - changed the initialization method for the recognizer object: pastie_link

    I tried playing with the threshold values, but I can't set them up so that all keywords get recognized independently.
    The app either always triggers "up" or does not recognize "down".

    Can you please give me some pointers on the correct values for those?

     

    Last edit: mesh 2016-06-12
    • Nickolay V. Shmyrev

      Thresholds for words are independent, so you can tune them independently too: "up" recognition does not affect "down" recognition. If you have too many false alarms, you can raise the threshold even higher than 1.0.

      Overall, your keyphrases are too short to be detected reliably. You can use a longer activation keyphrase to avoid false alarms.
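
A minimal sketch of a keyword list with one threshold per phrase, assuming the usual pocketsphinx keyword-list layout of one `phrase /threshold/` entry per line. The phrases ("move up", "move down") and the threshold values are placeholders chosen for illustration, not recommended settings; each threshold would still need tuning on real audio, and every word in a phrase must be present in the dictionary file. `KeywordListSketch` and `render` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: renders keyword-list text with per-phrase thresholds. */
final class KeywordListSketch {

    /**
     * Builds one "phrase /threshold/" line per entry, in insertion order.
     * Thresholds are kept as strings so they are written exactly as given.
     */
    static String render(Map<String, String> thresholds) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : thresholds.entrySet()) {
            String phrase = entry.getKey().trim();
            if (phrase.isEmpty()) {
                throw new IllegalArgumentException("empty keyphrase");
            }
            out.append(phrase)
               .append(" /")
               .append(entry.getValue().trim())
               .append("/\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> thresholds = new LinkedHashMap<>();
        // Placeholder values: each phrase gets its own threshold, so raising
        // one to cut false alarms leaves the others unchanged.
        thresholds.put("move up", "1e-10");
        thresholds.put("move down", "1e-15");
        System.out.print(render(thresholds));
    }
}
```

The rendered text would be saved to the keyword file that the recognizer loads (for example via `addKeywordSearch` in pocketsphinx-android). If one phrase fires too often, only its own entry needs a higher threshold.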

       
      • mesh

        mesh - 2016-06-14

        OK, thanks Nickolay!

        I'll try longer phrases. If I understand correctly, a higher number for a word requires a more precise match between the "heard" sound and the model that describes that word. Right?

        And if I wanted to train the recognition engine on the way I, or whoever else, pronounce these phrases, is that possible?

         
        • Nickolay V. Shmyrev

          I'll try longer phrases. If I understand correctly, a higher number for a word requires a more precise match between the "heard" sound and the model that describes that word. Right?

          Yes

          And if I wanted to train the recognition engine on the way I, or whoever else, pronounce these phrases, is that possible?

          No

           
