Dear all,
I want to build a speech recognition system and measure its recognition performance on my sound files. (The sound files contain two speakers: one speaking six digits and the other speaking other words.) I am new to Sphinx and find that there is not much documentation for it, so I am having difficulty getting started.
Could you tell me how I can do this? Do I have to train it myself? I've tried sphinx2-demo and sphinx2-simple, but the accuracy was quite low. Instead of speaking in real time, can I give it a sound file as input and get the text as output? How would I do this?
I really need your help. Please give me some advice.
Thanks,
Nancy
You don't have to train it yourself unless you want a language other than American English.
In the demos you can only say the words contained in the dic file. Try phrases such as:
rotate left 45 degrees.
or
rotate right
...
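To see in advance whether a phrase stays inside the demo vocabulary, something like the following rough sketch may help. It assumes the dic file uses the usual CMU-style layout (one word per line followed by its phonemes, with alternate pronunciations marked as WORD(2)); the sample entries and the function names are made up for illustration and are not taken from the actual demo dictionary.

```python
import re


def load_vocabulary(dic_lines):
    """Collect the words from lines of the form 'WORD PH1 PH2 ...'."""
    vocab = set()
    for line in dic_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # Drop alternate-pronunciation suffixes such as "(2)".
        word = re.sub(r"\(\d+\)$", "", parts[0])
        vocab.add(word.upper())
    return vocab


def out_of_vocabulary(phrase, vocab):
    """Return the words of the phrase that the dictionary does not list."""
    return [w for w in phrase.split() if w.upper() not in vocab]


# Hypothetical entries; in practice, pass the lines of the real dic file,
# e.g. load_vocabulary(open(path_to_dic_file)).
sample_dic = [
    "ROTATE R OW T EY T",
    "LEFT L EH F T",
    "RIGHT R AY T",
    "RIGHT(2) R AY",
]
vocab = load_vocabulary(sample_dic)
print(out_of_vocabulary("rotate left", vocab))
print(out_of_vocabulary("rotate upward", vocab))
```

Any word this reports as missing is one the demo cannot be expected to recognize.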
By the way, first test your microphone with recording software. On playback, the sound should be clear. Set the microphone sensitivity carefully.
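Besides listening to the playback, you can get a rough numeric check of the recording level. The sketch below assumes the test recording was saved as a 16-bit PCM WAV file; the function names are made up for illustration, and it uses only Python's standard library, nothing from Sphinx.

```python
import array
import sys
import wave


def read_samples(path):
    """Read 16-bit PCM samples from a WAV file (channels interleaved)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples


def level_report(samples, full_scale=32767):
    """Return (peak as a fraction of full scale, fraction of clipped samples)."""
    if len(samples) == 0:
        return 0.0, 0.0
    peak = min(max(abs(s) for s in samples), full_scale)
    clipped = sum(1 for s in samples if abs(s) >= full_scale)
    return peak / full_scale, clipped / len(samples)


# Synthetic samples for demonstration; with a real recording use
# level_report(read_samples("test.wav")), where "test.wav" is a placeholder.
peak, clipped = level_report(array.array("h", [0, 16384, -16384, 0]))
print(f"peak: {peak:.2f} of full scale, clipped: {clipped:.0%}")
```

A peak close to zero suggests the sensitivity is too low, while a noticeable share of clipped samples suggests it is too high.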