Hello,
I'm new to pocketsphinx (and to speech recognition in general). I'm trying to get started with some keyword recognition. I have a set of (test) phrases I want to recognize, and I have an audio file where I say some of those phrases, along with some random other words or phrases that are not key-phrases. I've been testing pocketsphinx_continuous with this, and it's recognizing some phrases but missing more than half. I'm trying to figure out whether this is because of something I'm doing wrong with pocketsphinx, such as some additional data I should be providing (a -dict argument?), or whether the data that I'm testing with is just problematic for some reason.
I'm attaching my keyphrase file, test audio file, and output file. The way I'm running the program is
"pocketsphinx_continuous -infile pocketsphinx-test-audio.wav -kws keyphrase.file &> sphinx_test_out.txt".
I have tried this with a variety of threshold values without much difference in the results, although I'm not really sure what the range of reasonable threshold values is.
I have tried this with a variety of threshold values without much difference in the results, although I'm not really sure what the range of reasonable threshold values is.
Thresholds range from 1e-50 to 1.0. For longer phrases they are closer to 1e-50; for shorter ones they are closer to 1. Short phrases are not recommended for keyword detection; a reasonable length is 4-5 syllables.
For your phrases, a reasonable keyphrase file would look like this:
start recording /1e-20/
stop recording /1e-20/
delete score /1e-20/
one two three four /1e-20/
create new score /1e-30/
delete score /1e-30/
recognize this phrase /1e-40/
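Since the right threshold usually has to be found by trial, it can help to generate several keyphrase files and run the same command against each. Below is a minimal sketch, not part of pocketsphinx: the helper names (format_keyphrase_line, build_keyphrase_file, write_sweep) and the output file naming are made up for illustration. It only assumes the "phrase /threshold/" line format shown above and the 1e-50 to 1.0 range.

```python
from pathlib import Path

# Threshold range quoted in the reply above.
MIN_THRESHOLD = 1e-50
MAX_THRESHOLD = 1.0


def format_keyphrase_line(phrase, threshold):
    """Return one keyphrase-file line in the form 'phrase /threshold/'."""
    if not (MIN_THRESHOLD <= threshold <= MAX_THRESHOLD):
        raise ValueError(f"threshold {threshold!r} is outside [1e-50, 1.0]")
    return f"{phrase.strip()} /{threshold:.0e}/"


def build_keyphrase_file(phrases, threshold):
    """Return the full file contents, using the same threshold for every phrase."""
    lines = [format_keyphrase_line(p, threshold) for p in phrases]
    return "\n".join(lines) + "\n"


def write_sweep(phrases, exponents, out_dir):
    """Write one keyphrase file per exponent (hypothetical naming scheme).

    For exponent 20 this writes 'keyphrase_1e-20.file' with threshold 1e-20.
    Returns the list of paths written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for exp in exponents:
        threshold = float(f"1e-{exp}")
        path = out_dir / f"keyphrase_1e-{exp}.file"
        path.write_text(build_keyphrase_file(phrases, threshold))
        paths.append(path)
    return paths
```

For example, write_sweep(["start recording", "stop recording"], [10, 20, 30, 40], "sweep") writes four files, and each one can then be passed to -kws in place of keyphrase.file to compare which phrases are detected.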
Here are the input files and output data for the above.
Hello
Welcome to CMUSphinx forums!