I'm evaluating CMU Sphinx for a very critical system. The idea is to support
real-time operation with speech recognition, mainly single words (up to
20) followed by 4 numbers, in Spanish and English.
I'm trying the demos included in Ubuntu 12, but I'm getting a very low
accuracy rate. For example, with HelloWorld.jar I cannot get a single
successful match, and with ZipCity.jar I get about 50% for each digit.
I want to work hard with this platform to get a very good result, and I am
willing to develop custom grammars, dictionaries, and models if needed.
But first, I want to be sure about the maximum level of accuracy I can expect;
judging by the demos alone, the result is very disappointing.
Am I doing something wrong? Could it be an installation problem? I tried
AudioTool.jar and I can record and play back the microphone stream.
Has anybody tried the demos on Ubuntu 12?
This issue is covered in the tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorialbeforestart#existing_accuracy_figures
For a small vocabulary, the word error rate should be below 5% in a clean
environment and below 30% in a noisy environment.
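To make those percentages concrete: word error rate is commonly computed as the number of substituted, deleted, and inserted words divided by the number of words in the reference transcript. Below is a minimal Python sketch of that calculation using word-level edit distance. The function name `word_error_rate` is only illustrative and is not part of Sphinx.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("call five one two three", "call nine one two three")` gives 0.2, since one of the five reference words was misrecognized.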
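Are you doing something wrong? Maybe.
Is it an installation problem? Unlikely.
Has anybody tried the demos on Ubuntu 12? Sure.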
To get help, you need to share the audio recording you made. One
possible issue is a DC offset in your microphone signal.
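A DC offset means the recorded waveform is shifted away from zero by a roughly constant amount, so it can be estimated as the mean sample value. Below is a rough Python sketch for checking a recording before sharing it. It assumes the recording was saved as a 16-bit mono PCM WAV file, and the function names are hypothetical helpers, not part of Sphinx.

```python
import struct
import wave


def read_mono_16bit(path):
    """Read a 16-bit mono PCM WAV file into a list of integer samples."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono WAV file")
        frames = wav.readframes(wav.getnframes())
    # WAV PCM data is little-endian; each sample is a signed 16-bit integer.
    count = len(frames) // 2
    return list(struct.unpack("<{}h".format(count), frames))


def dc_offset(samples):
    """Estimate the DC offset as the mean sample value."""
    if not samples:
        raise ValueError("no samples to analyze")
    return sum(samples) / len(samples)


def remove_dc_offset(samples):
    """Subtract the mean so the waveform is centered on zero."""
    offset = dc_offset(samples)
    return [s - offset for s in samples]
```

If `dc_offset(read_mono_16bit("recording.wav"))` is far from zero relative to the 16-bit range (-32768 to 32767), the microphone or sound card is likely adding an offset. The file name here is just a placeholder.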
Please use the Sphinx4 Help forum to ask for help with Sphinx4. This forum
is for other packages.