Dear all,
I want to build a speech recognition system to measure recognition performance on my sound files. (The sound files contain 2 speakers: one speaking 6 digits and the other speaking other words.) I am new to Sphinx and find that there is not much documentation for it. I am having difficulty getting started.
Could you tell me how I can do this? Do I have to train it myself? I've tried sphinx2-demo and sphinx2-simple, and the accuracy was quite low. Instead of speaking in real time, can I input a sound file and get the text as output? How do I do this?
I really need your help. Please give me some advice.
Thanks,
Nancy
Anonymous
-
2002-04-08
What you are trying to do is no different from what the rest of us on this site are doing.
Question: What language are you speaking in? It only supports American English and French, but English English is coming soon....
Question: How many words do you need it to recognise, and what are they? Are they already included in the language models (LMs) that are available, or are they outside them? If outside, you may have to train it yourself, as I am doing.
Question: If you want text output, you will need to interface to the API toolkit.
All this is documented, if hard to find. Keep trying...
Anonymous
-
2002-04-09
I have lots of sound files (each contains 2 speakers: one is saying 6 digits and the other is saying other words in another language). I want to find an SR system to recognize them and record the accuracy rate.
The target words are just the 10 digits.
I don't really need the text output; I need the percentage success rate.
Thanks,
Nancy
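For the scoring step, one common way to get a percentage is word accuracy: align each recognized string against the correct digit string and count substitutions, insertions, and deletions. Below is a minimal sketch in Python, assuming you already have, for each sound file, the correct transcript and the recognizer's output as plain text. The function names `edit_distance` and `word_accuracy` are illustrative only and are not part of Sphinx.

```python
def edit_distance(ref, hyp):
    """Minimum number of word substitutions, insertions, and deletions
    needed to turn the list `ref` into the list `hyp`."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(pairs):
    """Percentage word accuracy over (reference, hypothesis) text pairs."""
    errors = 0
    total = 0
    for ref, hyp in pairs:
        ref_words = ref.lower().split()
        hyp_words = hyp.lower().split()
        errors += edit_distance(ref_words, hyp_words)
        total += len(ref_words)
    if total == 0:
        raise ValueError("no reference words to score")
    return 100.0 * (1.0 - errors / total)


# Hypothetical transcripts, for illustration only.
results = [
    ("one two three four five six", "one two three four five six"),
    ("one two three", "one too three"),
]
print(f"{word_accuracy(results):.1f}% word accuracy")
```

Because insertions count as errors, this figure can drop below zero if the recognizer outputs many extra words, which may happen when the second speaker's words are recognized as digits.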