Pocketsphinx for Android Model Configuration

Help
Anonymous
2014-10-06
2014-10-06
  • Anonymous

    Anonymous - 2014-10-06

    The project
    I’m working on a mobile speech recognition system for my master's thesis. Its purpose is to compare the built-in Android voice recognition (VR) with open source recognizers such as pocketsphinx. The source code of the project is at https://github.com/marfnk/android-continuous-voice/

    What I did
    I tried the Voxforge language models (German/English) as well as the models recommended by CMU Sphinx from this source.
    As you can see under .../assets/sync/models/ and in PocketSphinxRecognitionService.java, I used the files as given.
    The service uses hmm/de-de-voxforge, dict/voxforge_de.dic and lm/voxforge_de.dmp.

    The Problem
    Even though I managed to get it to work, the results with all of those models are so poor that, as the FAQ suggests, the models must be misconfigured.

    I didn’t find any tutorial or explanation of how to set up the given models with Pocketsphinx for Android. The most important model for me is voxforge_german.

    Can anyone help me with this one?

     
    • Nickolay V. Shmyrev

      Hello Marius

      Overall, large vocabulary recognition on the phone is a tricky task. For practical applications you need to do the following:

      1) Select a subdomain of the texts you want to recognize, not a generic domain.

      2) Build a language model for this domain.

      3) Train a semi-continuous or PTM acoustic model for mobile speech recognition. The default German model from Voxforge is continuous and too slow for a mobile device. For that you need to take the latest material collected on the Voxforge forum. Ideally you also want to segment a few LibriVox audiobooks.

      4) Our Android application stores raw audio on the device. You need to collect that audio for your task and build a test set from it. Then measure the recognition accuracy with your language model and acoustic model and get word error rate estimates.
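
      For the word error rate in step 4, a minimal sketch in plain Java could look like the following. It assumes you already have a reference transcript and the recognizer's hypothesis as plain strings; the class name `WerSketch` and its methods are made up for illustration and are not part of pocketsphinx. WER is computed here as the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words.

      ```java
      import java.util.Locale;

      // Hypothetical helper, not part of pocketsphinx: word error rate between
      // a reference transcript and a recognizer hypothesis.
      public class WerSketch {

          // Split on whitespace and lowercase so that case differences do not count as errors.
          public static String[] tokenize(String text) {
              String t = text.trim().toLowerCase(Locale.ROOT);
              return t.isEmpty() ? new String[0] : t.split("\\s+");
          }

          // Word-level Levenshtein distance, keeping only two rows of the DP table.
          public static int editDistance(String[] ref, String[] hyp) {
              int[] prev = new int[hyp.length + 1];
              int[] curr = new int[hyp.length + 1];
              for (int j = 0; j <= hyp.length; j++) {
                  prev[j] = j; // empty reference: j insertions
              }
              for (int i = 1; i <= ref.length; i++) {
                  curr[0] = i; // empty hypothesis: i deletions
                  for (int j = 1; j <= hyp.length; j++) {
                      int sub = prev[j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                      int del = prev[j] + 1;
                      int ins = curr[j - 1] + 1;
                      curr[j] = Math.min(sub, Math.min(del, ins));
                  }
                  int[] tmp = prev;
                  prev = curr;
                  curr = tmp;
              }
              return prev[hyp.length];
          }

          // WER = edit distance / number of reference words. Can exceed 1.0
          // when the hypothesis contains many inserted words.
          public static double wordErrorRate(String reference, String hypothesis) {
              String[] ref = tokenize(reference);
              String[] hyp = tokenize(hypothesis);
              if (ref.length == 0) {
                  throw new IllegalArgumentException("reference transcript is empty");
              }
              return (double) editDistance(ref, hyp) / ref.length;
          }

          public static void main(String[] args) {
              // Example sentences are invented; replace them with your own test set.
              System.out.println(wordErrorRate("das ist ein test", "das ist test"));
          }
      }
      ```

      For a whole test set it is usually better to sum the edit distances and the reference word counts over all utterances and divide once, rather than averaging per-utterance rates, so that short utterances do not dominate the estimate.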

       
  • Anonymous

    Anonymous - 2014-10-06

    That means it is not possible at all to recognize voice without a domain with pocketsphinx, the way Google does?
    I don't get why. I have no requirements on performance or file size.

     
    • Nickolay V. Shmyrev

      That means it is not possible at all to recognize voice without a domain with pocketsphinx, the way Google does?

      Google doesn't recognize voice without a domain on the device either. Their language model is built from web queries and is pretty limited for offline use.

      You can read about the Google system here:

      http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41176.pdf

       
      • Anonymous

        Anonymous - 2014-10-06

        I won't give up ;) But thanks a lot for your advice.

         
        • Nickolay V. Shmyrev

          Ok, let us know if you need help.

           
