help with keyword mode

Forum: Help

Creator: Nick Joliat

Created: 2015-07-15

Updated: 2015-07-16

Nick Joliat - 2015-07-15

Hello,
I'm new to pocketsphinx (and to speech recognition in general). I'm trying to get started with keyword recognition. I have a set of test phrases I want to recognize, and an audio file in which I say some of those phrases along with other random words and phrases that are not key phrases. I've been testing pocketsphinx_continuous with this, and it recognizes some phrases but misses more than half. I'm trying to figure out whether this is because I'm doing something wrong with pocketsphinx, such as leaving out additional data I should be providing (a -dict argument?), or because the data I'm testing with is problematic for some reason.

I'm attaching my keyphrase file, test audio file, and output file. The way I'm running the program is
"pocketsphinx_continuous -infile pocketsphinx-test-audio.wav -kws keyphrase.file &> sphinx_test_out.txt".

I have tried this with a variety of threshold values without much difference in the results, although I'm not really sure what the range of reasonable threshold values is.

Nick Joliat - 2015-07-15

Here are the input files and output data for the above.

keyphrase.file

pocketsphinx-test-audio.wav

sphinx_test_out.txt

Nickolay V. Shmyrev - 2015-07-16

Hello

Welcome to CMUSphinx forums!

> I have tried this with a variety of threshold values without much difference in the results, although I'm not really sure what the range of reasonable threshold values is.

Thresholds range from 1e-50 to 1.0. For longer phrases they should be closer to 1e-50; for shorter phrases, closer to 1. Short phrases are not recommended for keyword detection; a reasonable length is 4-5 syllables.
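Since the useful range spans 50 orders of magnitude, it may help to try thresholds in logarithmic steps rather than linear ones. Here is a minimal Python sketch that lists candidate values across the range given above. The function name and the step of five decades are assumptions made for illustration, not part of pocketsphinx.

```python
def candidate_thresholds(low_exp=-50, high_exp=0, step=5):
    """Return threshold strings such as '1e-50', spaced `step` decades apart.

    The defaults cover the 1e-50 to 1.0 range mentioned above; the step size
    is an arbitrary choice for this sketch.
    """
    return [f"1e{exp}" for exp in range(low_exp, high_exp + 1, step)]


if __name__ == "__main__":
    # Print each candidate so it can be tried in a keyphrase file.
    for threshold in candidate_thresholds():
        print(threshold)
```

Each value could then be tried per phrase, starting nearer 1e-50 for long phrases and nearer 1 for short ones.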

For your phrases, a reasonable keyphrase file would look like this:

start recording /1e-20/
stop recording /1e-20/
delete score /1e-20/
one two three four /1e-20/
create new score /1e-30/
delete score /1e-30/
recognize this phrase /1e-40/
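If the phrase list changes often, a file in this format (one phrase per line, followed by its threshold between slashes) can be generated rather than edited by hand. The following Python sketch does that; the helper name `format_keyphrases` is hypothetical, and the range check simply reuses the 1e-50 to 1.0 bounds stated above.

```python
def format_keyphrases(entries):
    """Build keyphrase-file text from (phrase, threshold_string) pairs.

    Each output line has the form 'phrase /threshold/'. Thresholds are kept
    as strings so that '1e-20' is written exactly as given.
    """
    lines = []
    for phrase, threshold in entries:
        value = float(threshold)
        # Bounds taken from the range quoted in this thread.
        if not (1e-50 <= value <= 1.0):
            raise ValueError(f"threshold out of range for {phrase!r}: {threshold}")
        lines.append(f"{phrase.strip()} /{threshold}/")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    entries = [
        ("start recording", "1e-20"),
        ("stop recording", "1e-20"),
        ("create new score", "1e-30"),
        ("recognize this phrase", "1e-40"),
    ]
    # 'keyphrase.file' is the file name used in the original command.
    with open("keyphrase.file", "w") as handle:
        handle.write(format_keyphrases(entries))
```

The resulting file can be passed to the same command as before via the -kws argument.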