Hi everyone,
I'm new to CMUSphinx and have so far installed Debian Jessie on a VM and configured and installed all the required libraries including getting my USB and internal microphone to work with pulseaudio etc.
I have tested arecord -f S16_LE -r 16000 /tmp/sample.wav and aplay /tmp/sample.wav and all is well :)
(/tmp/sample.wav is a 5 second recording with a single utterance of the word "Testing").
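Before running recognition, it can help to confirm that the recording really has the format the arecord flags above ask for. The following is a minimal sketch using Python's standard wave module. The function names describe_wav and matches_expected_format are hypothetical, and the assumption that the default acoustic model wants 16 kHz, 16-bit, mono audio should be checked against the model actually installed.

```python
import wave


def describe_wav(path):
    """Read the WAV header fields that matter for recognition."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
        }


def matches_expected_format(info, rate_hz=16000):
    """True for 16-bit mono audio at rate_hz (assumed model requirement)."""
    return (
        info["channels"] == 1
        and info["sample_width_bytes"] == 2
        and info["sample_rate_hz"] == rate_hz
    )
```

Calling describe_wav("/tmp/sample.wav") and passing the result to matches_expected_format would show whether the file is mono, 16-bit and 16 kHz, as -f S16_LE -r 16000 should produce.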
I'm running the following command:
pocketsphinx_continuous -infile /tmp/sample.wav
and getting the following output:
Honestly, I can't even tell where in the output it's reporting the speech recognition result, or whether it worked. Is it the word "dang" that's being reported (which would be incorrect)?
I have followed the official tutorials and a dozen more on Google but this is as far as I can get.
Many thanks
Yes
To get help on accuracy issues you need to provide the file sample.wav
I apologise, I've attached it. Thank you
Pocketsphinx is not very good at recognizing short phrases spoken at high volume. If you repeat this word several times, the other instances will be recognized correctly. For example, you can concatenate two or three copies of the same file and try again.
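The concatenation suggested above could be done with any audio tool. As one possible sketch, the snippet below uses Python's standard wave module to write several back-to-back copies of a recording into a new file. The function name repeat_wav and the output path in the usage note are illustrative assumptions, not part of Pocketsphinx.

```python
import wave


def repeat_wav(src_path, dst_path, copies=3):
    """Write `copies` back-to-back repetitions of src_path to dst_path."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()                   # channels, width, rate, ...
        frames = src.readframes(src.getnframes())  # all audio data

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)  # frame count in the header is fixed up on close
        for _ in range(copies):
            dst.writeframes(frames)
    return dst_path
```

For example, repeat_wav("/tmp/sample.wav", "/tmp/sample_x3.wav") would produce a file about three times as long, which can then be passed to pocketsphinx_continuous -infile in the same way as the original.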
I understand.
My actual use case is to listen continuously and to record and transcribe speech to text, for conversations lasting anywhere from thirty minutes to two hours.
Is Pocketsphinx perhaps not the best technology for this case?
Thank you Nickolay
Something like
https://github.com/srvk/eesen-transcriber
should work better
I'll take a look.
Many thanks for your help