Hi all,
Greetings.
I selected 10 speakers and asked each of them to record the same 30 words. I then used the wave files from 9 of the speakers to train the acoustic model and held out the remaining speaker's wave files for testing.
I got 100% WER and SER during decoding!
I am surprised: the engine was trained on the same words from all the other speakers, yet it cannot recognize those words when spoken by a different speaker.
How is this possible? Is Sphinx purely speaker-dependent?
Please share your training folder so we can help with this issue.
Of course. Here is a shareable link to my data:
https://drive.google.com/file/d/0B_74UylilDfCYmdyZmhhYlJSaEE/view?usp=sharing
Thank you for the response.
Your mistake is that the ARPA language model is not properly prepared: it is built from phones instead of words.
If you build the ARPA LM from words, decoding will be much more accurate; it will actually recognize most of the words.
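For a small, fixed vocabulary like your 30 words, even a flat unigram ARPA model over the words is enough to show the difference. Below is a minimal sketch of what such a word-level model looks like (the word list and the `words.lm` filename are placeholders, not taken from your data); in practice you would normally build the LM with the cmuclmtk tools or the online lmtool from a transcript file instead of writing it by hand:

```python
import math

# Placeholder word list -- replace with the 30 words your speakers recorded.
words = ["ONE", "TWO", "THREE"]  # ... and so on, up to 30 words

# Uniform unigram probability over all tokens, in log10 as ARPA requires.
tokens = ["<s>", "</s>"] + words
logprob = math.log10(1.0 / len(tokens))

with open("words.lm", "w") as f:
    f.write("\\data\\\n")
    f.write(f"ngram 1={len(tokens)}\n\n")
    f.write("\\1-grams:\n")
    for t in tokens:
        # Highest-order n-grams carry no backoff weight, so each line
        # is just "log10_probability word".
        f.write(f"{logprob:.4f} {t}\n")
    f.write("\n\\end\\\n")
```

Pass the resulting file to the decoder with -lm words.lm, together with a dictionary that maps each of the 30 words to its phone sequence. The phones belong in the dictionary and the acoustic model, not in the language model.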
Okay, that was my mistake. Thank you!
:)