Anonymous - 2014-10-06
The project
I’m working on a mobile speech recognition system for my master's thesis. Its purpose is to compare the Android built-in VR with open-source VRs such as pocketsphinx. The source code of the project is available at https://github.com/marfnk/android-continuous-voice/
What I did
I tried to use the Voxforge language models (German/English) as well as the CMU Sphinx recommended models from this source.
As you can see in .../assets/sync/models/ and in PocketSphinxRecognitionService.java, I used the given files.
The service uses hmm/de-de-voxforge, dict/voxforge_de.dic and lm/voxforge_de.dmp.
The Problem
Even though I managed to get it to work, the results with all of those models are so bad that there must be a misconfiguration of the model, as stated in the FAQ.
I didn’t find any tutorial or explanation of how to set up the given models with Pocketsphinx for Android. The most important model is voxforge_german.
Can anyone help me with this one?
Hello Marius
Overall, large-vocabulary recognition on a phone is a tricky task. For practical applications you need to take the following steps:
1) Select a subdomain of the texts you want to recognize, not a generic domain.
2) Build a language model for this domain.
3) Train a semi-continuous or PTM model for mobile speech recognition. The default German model from Voxforge is continuous and too slow for a mobile device. For training you need to use the latest material collected on the Voxforge forum. Ideally you also want to segment a few LibriVox audiobooks.
4) Our Android application stores raw audio on the device. You need to collect that audio for your task and build a test set. Then measure the recognition accuracy with your language model and acoustic model and get word error rate estimates.
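For step 4, a minimal sketch of a word error rate calculation may help. It is not part of pocketsphinx. It assumes you have already transcribed the collected audio by hand (the reference) and saved the recognizer output for the same recordings (the hypothesis). The function names and the German example sentences are made up for illustration.

```python
def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def word_error_rate(pairs):
    """WER over a test set of (reference, hypothesis) transcript pairs."""
    errors = 0
    words = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        words += n
    if words == 0:
        raise ValueError("test set contains no reference words")
    return errors / words


# Hypothetical test set: hand-made reference vs. recognizer output.
test_set = [
    ("schalte das licht ein", "schalte das licht an"),
    ("wie wird das wetter morgen", "wie wird wetter morgen"),
]
print(f"WER: {word_error_rate(test_set):.1%}")
```

Errors are summed over the whole test set before dividing, so long utterances weigh more than short ones. This is the usual way WER is reported. Normalize case and punctuation in both transcripts before comparing, otherwise formatting differences count as errors.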
Does that mean it is not possible at all to recognize speech without a domain with pocketsphinx, the way Google does?
I don't see why. I have no requirements on performance or file sizes.
Google doesn't recognize speech without a domain on the device either. Their language model is built from web queries and is pretty limited for offline usage.
You can read about Google's system here:
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41176.pdf
I won't give up ;) But thanks a lot for your advice.
Ok, let us know if you need help.