I'm trying to transcribe a long audio file of a professor speaking. However, any time the professor pauses in his speech for several seconds, Sphinx seems to think that's the end of the file and stops decoding. Is there any way that I can handle this? Perhaps tell Sphinx to allow longer pauses in speech?
Thanks in advance
I get an array index out of bounds error with this technique. It completes the loop once, successfully, and then throws the index out of bounds error from recognizer.recognize(). I've put both my .java file and my config files online. If you could take a look at them and see if you can figure out what I'm doing wrong, I'd appreciate it.
http://students.cec.wustl.edu/~mal2/
Thanks
You are trying to recognize 22 kHz audio while reading 320 bytes at a time, so it's clear why the index goes out of bounds. Also remember that the 22 kHz value you specify is not a directive to resample the audio to the required format; it's just a hint for the reader.
Resample the audio to 16 kHz manually first, then decode it.
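For what it's worth, here is a rough sketch of doing that resampling step in plain Java with javax.sound.sampled before handing the file to the decoder. It assumes the input is a PCM WAV file and that your JVM's installed Java Sound converters can change the sample rate (the code checks and fails loudly if not). The class name ResampleTo16k, the method resample, and the choice of 16-bit mono little-endian output are my own assumptions for illustration, not something taken from the posted config.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

// Illustrative helper (hypothetical name): converts a WAV file to 16 kHz PCM.
public class ResampleTo16k {

    // Assumed target: 16 kHz, 16-bit, mono, signed, little-endian PCM.
    static final AudioFormat TARGET = new AudioFormat(16000f, 16, 1, true, false);

    static void resample(File in, File out)
            throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream source = AudioSystem.getAudioInputStream(in)) {
            // Not every Java Sound installation can convert sample rates,
            // so check before asking for the converted stream.
            if (!AudioSystem.isConversionSupported(TARGET, source.getFormat())) {
                throw new IllegalArgumentException(
                        "No converter for " + source.getFormat() + " -> " + TARGET);
            }
            try (AudioInputStream converted =
                         AudioSystem.getAudioInputStream(TARGET, source)) {
                // Writing to a File lets the WAV header be fixed up even
                // when the converted stream length is not known up front.
                AudioSystem.write(converted, AudioFileFormat.Type.WAVE, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java ResampleTo16k <in.wav> <out.wav>");
            return;
        }
        resample(new File(args[0]), new File(args[1]));
    }
}
```

Run it once on the lecture recording and point the decoder at the output file. A dedicated audio tool will likely give better resampling quality; this is only meant to show the order of operations (convert first, decode second).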
You are probably just using the wrong decoder. sphinx3_continuous, for example, should handle this correctly. sphinx4 can also be set up to do it properly.
I looked in the javadocs for a continuous decoder, but didn't see anything. Where should I look for such a thing?
Thanks
The Transcriber demo is the proper example of a continuous decoder. All you need is an endpointer in the frontend and a loop like this in the Java code:
    while ((result = recognizer.recognize()) != null) {
        ...........
    }
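To make the idea behind that loop concrete, here is a toy, self-contained sketch of why one call is not enough: an endpointer cuts the audio at pauses, each call hands back one stretch of speech, and null only comes back when the data runs out. None of the names below (UtteranceLoopSketch, ToySegmenter, next, segmentAll) are Sphinx classes or methods; they are hypothetical stand-ins, and the crude energy threshold is only an assumption for illustration, not how the real endpointer works.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: nothing here is part of the Sphinx API.
public class UtteranceLoopSketch {

    // Hypothetical stand-in for a recognizer fed by an endpointed frontend.
    static class ToySegmenter {
        private final short[] samples;
        private final int frameSize;   // samples per analysis frame
        private final int threshold;   // mean absolute amplitude treated as speech
        private int pos = 0;

        ToySegmenter(short[] samples, int frameSize, int threshold) {
            this.samples = samples;
            this.frameSize = frameSize;
            this.threshold = threshold;
        }

        // Returns the next speech stretch as {startSample, endSample},
        // or null once no complete frame of speech is left.
        int[] next() {
            // Skip over silence (the pause between utterances).
            while (pos + frameSize <= samples.length && !isSpeech(pos)) {
                pos += frameSize;
            }
            if (pos + frameSize > samples.length) {
                return null; // real end of data, not just a pause
            }
            int start = pos;
            // Consume frames until the next pause or the end of data.
            while (pos + frameSize <= samples.length && isSpeech(pos)) {
                pos += frameSize;
            }
            return new int[] {start, pos};
        }

        private boolean isSpeech(int offset) {
            long sum = 0;
            for (int i = offset; i < offset + frameSize; i++) {
                sum += Math.abs(samples[i]);
            }
            return sum / frameSize >= threshold;
        }
    }

    // Same shape as the loop above: keep pulling results until null.
    static List<int[]> segmentAll(short[] samples, int frameSize, int threshold) {
        ToySegmenter segmenter = new ToySegmenter(samples, frameSize, threshold);
        List<int[]> utterances = new ArrayList<>();
        int[] utterance;
        while ((utterance = segmenter.next()) != null) {
            utterances.add(utterance);
        }
        return utterances;
    }

    public static void main(String[] args) {
        int rate = 16000;
        // 1 s of tone, 3 s of silence, 1 s of tone.
        short[] audio = new short[5 * rate];
        for (int i = 0; i < rate; i++) {
            short v = (short) (8000 * Math.sin(2 * Math.PI * 440 * i / rate));
            audio[i] = v;
            audio[4 * rate + i] = v;
        }
        for (int[] u : segmentAll(audio, 160, 100)) {
            System.out.println("utterance from sample " + u[0] + " to " + u[1]);
        }
    }
}
```

A single call to next() stops at the first pause, which matches the symptom in the original question; the loop is what carries decoding past each pause to the end of the file.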