Hi everyone,
I found a paper presenting a very interesting idea: take Google's speech recognition result, then use Sphinx4 to do post-processing on it. The authors claim they get better results this way. On paper, the idea fits my current project perfectly, because Google's Speech Recognition enhanced mode only allows 500 words of input as "hints" to improve accuracy. If I could take Google's recognition result and then post-process it with Sphinx4 using my own language model, that would really overcome Google's 500-word limitation.
According to the paper, the first step is to convert the Google recognition result string into a phonemic representation.
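The paper doesn't spell out how that conversion is done, but here is a rough sketch of what I have in mind for this first step: look each recognized word up in the CMU pronouncing dictionary (the same cmudict format that ships with Sphinx models) and concatenate the phones. The dictionary path "cmudict-en-us.dict" and the "<unk>" placeholder for out-of-vocabulary words are just my assumptions, not anything from the paper:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Sketch: convert a Google recognition result string into a phoneme
    // sequence via a cmudict-style dictionary lookup. This is only a
    // pre-processing helper; it does not touch any Sphinx4 internals.
    public class PhonemeConverter {

        private final Map<String, String[]> dict = new HashMap<>();

        public PhonemeConverter(String cmudictPath) throws IOException {
            try (BufferedReader in = new BufferedReader(new FileReader(cmudictPath))) {
                String line;
                while ((line = in.readLine()) != null) {
                    if (line.isEmpty() || line.startsWith(";;;")) {
                        continue;                      // skip comments and blank lines
                    }
                    String[] parts = line.trim().split("\\s+");
                    String word = parts[0].toLowerCase();
                    if (word.contains("(")) {
                        continue;                      // keep only the first pronunciation variant
                    }
                    String[] phones = new String[parts.length - 1];
                    System.arraycopy(parts, 1, phones, 0, phones.length);
                    dict.put(word, phones);
                }
            }
        }

        // Convert a recognized sentence into a flat phoneme list.
        public List<String> toPhonemes(String recognizedText) {
            List<String> phonemes = new ArrayList<>();
            for (String word : recognizedText.toLowerCase().split("\\s+")) {
                String[] phones = dict.get(word);
                if (phones != null) {
                    for (String p : phones) {
                        phonemes.add(p);
                    }
                } else {
                    phonemes.add("<unk>");             // out-of-vocabulary placeholder
                }
            }
            return phonemes;
        }

        public static void main(String[] args) throws IOException {
            // "cmudict-en-us.dict" is an assumed local path to a CMU dictionary file.
            PhonemeConverter conv = new PhonemeConverter("cmudict-en-us.dict");
            System.out.println(conv.toPhonemes("turn on the kitchen light"));
        }
    }

The open question for me is what to do with that phoneme sequence afterwards, which is what I'm asking about below.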
I have just started to dig into the Sphinx4 code. I'm wondering if anyone could give me a hint about where to modify it so that a phonemic representation can be passed in as a parameter.
Thanks!
Hey, can you provide the paper?
Did it work?