Is there a way to display the phones that Sphinx2 thinks it is hearing as it selects a word? For example, Sphinx recognizes "K AE T" and therefore returns "cat". I am getting very low recognition accuracy (50%-60%) and am wondering how I can see what Sphinx thinks it hears before it returns a blank response for 'not found'. Thanks.
There is a way: you run Sphinx in a special mode called allphone.
However, I am pretty sure you won't make progress this way, because the allphone mode will not output clean phonemes like
K AE T
but rather
K H H K AA AE H H AA AE OY AA H T TH DH H T TH
And you won't be able to use that easily.
If you have problems with recognition accuracy, in about 90% of cases they come from the sound input level or quality: either your speech is too weak, or it is saturated. Find a way to listen to what the recognizer actually receives. Use the -rawlogdir option to save your utterances to disk.
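If listening is not convenient, a rough numeric check of the saved utterances can also tell you whether the input is too weak or saturated. Here is a minimal sketch, not part of Sphinx2. It assumes the files written under -rawlogdir are headerless 16-bit signed PCM in the machine's native byte order; the names level_stats, level_update, level_from_raw_file and the CLIP_THRESHOLD value are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Samples at or above this magnitude are counted as clipped.
 * The exact value is an arbitrary choice for this sketch. */
#define CLIP_THRESHOLD 32700

/* Summary of the signal level of one utterance (hypothetical helper type). */
typedef struct {
    size_t  n_samples;  /* total samples examined */
    int32_t peak;       /* largest absolute sample value, 0..32768 */
    size_t  n_clipped;  /* samples at or near full scale */
} level_stats;

/* Reset the statistics before processing a new utterance. */
void level_init(level_stats *st)
{
    assert(st != NULL);
    st->n_samples = 0;
    st->peak = 0;
    st->n_clipped = 0;
}

/* Fold a buffer of 16-bit samples into the running statistics. */
void level_update(level_stats *st, const int16_t *buf, size_t n)
{
    assert(st != NULL);
    for (size_t i = 0; i < n; i++) {
        int32_t v = buf[i];
        if (v < 0)
            v = -v;  /* 32-bit value, so negating -32768 does not overflow */
        if (v > st->peak)
            st->peak = v;
        if (v >= CLIP_THRESHOLD)
            st->n_clipped++;
        st->n_samples++;
    }
}

/* Scan a whole raw file. Returns 0 on success, -1 if it cannot be opened.
 * ASSUMPTION: headerless 16-bit signed PCM, native byte order. */
int level_from_raw_file(const char *path, level_stats *st)
{
    int16_t buf[4096];
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    level_init(st);
    while ((n = fread(buf, sizeof buf[0], 4096, fp)) > 0)
        level_update(st, buf, n);
    fclose(fp);
    return 0;
}
```

Call level_from_raw_file() on each saved utterance and look at the result. A peak of only a few hundred out of 32768 suggests the speech is too weak, while a non-trivial n_clipped count suggests it is saturated. Treat the numbers as a hint only; listening to the file is still the more reliable check.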
Anonymous
-
2003-08-11
I have not tried this, but you might want to try uttproc_partial_result(int32 *frm, char **hyp); it looks like it might do something useful. You would call it in, for example, tty-continuous.c, passing the addresses of the variables the existing code already uses:
uttproc_end_utt ();
>>uttproc_partial_result (&fr, &hyp);<<
if (uttproc_result (&fr, &hyp, 1) < 0) etc.
For more information, check the Sphinx-2 API section of the docs or fbs.h.
If you try this, please let me know how you make out.
Thanks,
Steve