I use PocketSphinx for name recognition. There are about 4000 different names,
and the target accuracy is 90%. Unfortunately, at the moment the accuracy is
only 65% using PocketSphinx with the acoustic model hub4wsj_sc_8k and a
language model in JSGF format.
I wonder whether PocketSphinx can achieve 90% accuracy with a vocabulary of
4000 commands. If it can, what can I do to improve the accuracy? Many
thanks!
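For readers unfamiliar with the setup, here is a minimal sketch of how a flat JSGF grammar for a name list could be generated. The function name `build_jsgf`, the grammar name `names`, the rule name `name`, and the sample names are all hypothetical, and lowercasing assumes the pronunciation dictionary uses lowercase entries. It is not necessarily how the grammar in this thread was built.

```python
def build_jsgf(names, grammar_name="names", rule_name="name"):
    """Return JSGF text with one public rule that matches any single name."""
    seen = set()
    unique = []
    for raw in names:
        # Assumption: dictionary entries are lowercase, so normalize case
        # and collapse repeated whitespace before deduplicating.
        name = " ".join(raw.lower().split())
        if name and name not in seen:
            seen.add(name)
            unique.append(name)
    if not unique:
        raise ValueError("at least one non-empty name is required")
    alternatives = "\n    | ".join(unique)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <{rule_name}> =\n    {alternatives}\n    ;\n"
    )


if __name__ == "__main__":
    # Made-up sample names, for illustration only.
    print(build_jsgf(["Alice", "Bob", "Mary Ann", "alice"]))
```

Every word used in the grammar still needs a pronunciation in the dictionary, otherwise the decoder cannot use that alternative.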
I hardly believe you'll get 90% accuracy on name recognition with such a
large vocabulary, taking into account accents and foreign names. You'll
probably be interested in these papers:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.120.4741&rep=rep1&type=pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.111.8077&rep=rep1&type=pdf
Thank you nshmyrev.
I'm new to speech recognition and don't know what to do now to improve the
accuracy.
I've tried to cluster the names into classes in which the names have similar
pronunciation, but this doesn't make sense.
By the way, do you mean there's no possibility of achieving 90% accuracy in
such a task? If it is possible, what can I do next?
Thank you~
There is no magic bullet here. Read the articles mentioned above and try to
implement what's described in them.
Hi,
I trained on an hour of US English speech for continuous speech recognition.
When I use pocketsphinx for decoding, it outputs only numbers, not words from
the vocabulary, so the accuracy is zero.
Using the same AM, LM and dictionary, I decoded with sphinx3 and it gives very
good accuracy. Please tell me what's wrong with pocketsphinx.
I actually sent you an email with the path to download the files.
Sorry, I don't remember getting anything like that. It probably went to spam.
Can you share them again? Preferably, don't contact me by email; post links
here.
Hello ximigou, how do you measure that kind of accuracy?
I use 3 small test datasets to measure the accuracy. Each dataset contains 100
names read by one person, so there are 3 speakers and 300 names. The accuracy
is around 60%-65% with different acoustic models, without much difference
between them.
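The measurement described above can be sketched as a per-speaker and overall exact-match accuracy. This is only an illustration: the function names, the speaker labels and the toy reference/hypothesis pairs are made up, and it assumes one recognized name per utterance, compared after lowercasing and whitespace normalization.

```python
def _norm(text):
    # Compare names case-insensitively and ignore extra whitespace.
    return " ".join(text.lower().split())


def name_accuracy(references, hypotheses):
    """Fraction of utterances whose recognized name equals the reference."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    if not references:
        raise ValueError("no utterances to score")
    correct = sum(_norm(r) == _norm(h) for r, h in zip(references, hypotheses))
    return correct / len(references)


def accuracy_by_speaker(results):
    """results maps speaker -> list of (reference, hypothesis) pairs.

    Returns (per_speaker_accuracy, overall_accuracy), where the overall
    figure is computed over all utterances pooled together.
    """
    per_speaker = {}
    all_refs, all_hyps = [], []
    for speaker, pairs in results.items():
        refs = [ref for ref, _ in pairs]
        hyps = [hyp for _, hyp in pairs]
        per_speaker[speaker] = name_accuracy(refs, hyps)
        all_refs.extend(refs)
        all_hyps.extend(hyps)
    return per_speaker, name_accuracy(all_refs, all_hyps)


if __name__ == "__main__":
    # Toy data for illustration only, not results from this thread.
    toy = {
        "speaker1": [("alice", "alice"), ("bob", "bob")],
        "speaker2": [("carol", "carl"), ("dave", "dave")],
    }
    print(accuracy_by_speaker(toy))
```

With 300 test utterances, each misrecognized name changes the overall figure by about 0.33 percentage points, so small differences between acoustic models may not be meaningful on a test set of this size.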