I've looked at both the pydocs and the Python tutorials. I have not found Python documentation for all the methods of decoder that show up when I do dir(decoder), which is making it hard to solve my problem myself. I've also looked at the official tutorial pages on SourceForge. The test examples on the GitHub page (https://github.com/cmusphinx/pocketsphinx/tree/master/swig/python/test) are useful but don't cover what I'm trying to do, which is to get the timing information of multiple keywords in an audio file. If there is any documentation of that which I missed, a link would be greatly appreciated.
for this problem
Specifically, what I want to do is run the Python version of PocketSphinx to identify the timestamps at which various keywords occur in an audio file. I want to find multiple keywords from a keyword list. I can get timestamps when doing transcription and single keyword search, but can't figure out how to get them when searching for multiple keywords. Ideally, I'd also get a probability score like in transcription.
This is my code so far:
~~~~
import sys, os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

modeldir = "../../../sphinx/pocketsphinx-5prealpha/model"
datadir = "sounds"

# keylist = "MICROBE /1e-15/
# MICROBES /1e-15/
# BIRTHDAY /1e-20/"

# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
decoder = Decoder(config)

# Add searches
decoder.set_kws('keyword', 'keylist.list')
decoder.set_search('keyword')

# Open file to read the data
stream = open(os.path.join(datadir, 'microbes_2.wav'), 'rb')

# Process audio chunk by chunk. On keyphrase detected perform action and restart search
decoder.start_utt()
in_speech_bf = False
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() != in_speech_bf:
            in_speech_bf = decoder.get_in_speech()
            if not in_speech_bf:
                decoder.end_utt()
                if decoder.get_search() == 'keyword':
                    print('got keyword')
                    print("keywords listening for is: ", decoder.get_kws('keyword'))
                    print("seg = ", [(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
                else:
                    print("didn't get keyword")
                decoder.start_utt()
    else:
        break
~~~~
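For reference, the keyword list that `decoder.set_kws` reads is a plain text file with one phrase per line followed by its threshold between slashes, as in the commented-out `keylist` above. A minimal sketch of generating that text from Python follows; the helper name `format_keyword_list` is made up for illustration and is not part of PocketSphinx.

```python
def format_keyword_list(thresholds):
    """Return keyword-list text with one 'PHRASE /threshold/' entry per line.

    `thresholds` maps each phrase to its detection threshold.
    This helper is hypothetical; it only builds the text.
    """
    lines = []
    for phrase, threshold in thresholds.items():
        # "%g" keeps compact scientific notation, e.g. 1e-15
        lines.append("%s /%g/" % (phrase, threshold))
    return "\n".join(lines) + "\n"


# Same entries as the commented-out keylist in the post
keywords = {"MICROBE": 1e-15, "MICROBES": 1e-15, "BIRTHDAY": 1e-20}
keylist_text = format_keyword_list(keywords)
print(keylist_text)
```

Writing `keylist_text` to `keylist.list` would give a file that can be passed to `decoder.set_kws('keyword', 'keylist.list')`.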
I can use alternative code (below) that looks for only one keyword and gets a result that includes a timestamp. I can't figure out how to get this style of result for multiple keywords.
### Example code for a single keyword that gets me a timestamp:
~~~~
import sys, os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

modeldir = "../../../sphinx/pocketsphinx-5prealpha/model"
datadir = "sounds"

# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
config.set_string('-keyphrase', 'microbes')
config.set_float('-kws_threshold', 1e+20)

# Open file to read the data
stream = open(os.path.join(datadir, 'microbes_2.wav'), 'rb')

# Process audio chunk by chunk. On keyphrase detected perform action and restart search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() != None:
        print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print("Detected keyphrase, restarting search")
        decoder.end_utt()
        decoder.start_utt()
~~~~
Basically, I'd like to get the same output as this line:
~~~~
print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
~~~~
which is:
~~~~
[('microbes', -898, 1551, 1607)]
~~~~
but for multiple keywords.
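The `start_frame` and `end_frame` values in that tuple are frame indices, not seconds. A small sketch of converting them follows, assuming the decoder's default frame rate of 100 frames per second (10 ms per frame); if the frame rate has been changed in the configuration, the constant has to change with it. The function name `segment_to_seconds` is illustrative only.

```python
FRAMES_PER_SECOND = 100  # assumption: default frame rate, 10 ms per frame


def segment_to_seconds(segment, frame_rate=FRAMES_PER_SECOND):
    """Convert a (word, prob, start_frame, end_frame) tuple to seconds."""
    word, prob, start_frame, end_frame = segment
    return {
        "word": word,
        "prob": prob,
        "start_s": start_frame / frame_rate,
        "end_s": end_frame / frame_rate,
    }


# The tuple printed by the single-keyword example
detection = segment_to_seconds(('microbes', -898, 1551, 1607))
print(detection)
```

Under that assumption, the detection above would sit roughly 15.5 to 16.1 seconds into the file.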
Last edit: Nickolay V. Shmyrev 2017-11-21
Thanks for the quick reply, Nickolay. I'm already using the keyword list and the decoder.set_kws method in my code above. (This is the code under the "This is my code so far:" heading, not the code under the "Example code for a single keyword that gets me a timestamp:" heading.) The code using the keyword list and the decoder.set_kws method detects that one of my keywords was said, but I'm only getting a TRUE returned. I'd like to get which word was said and the time in the file when it was said.
Also, when I tried to use the single-keyword decoder syntax for multiple keywords, the seg in decoder.seg() is empty.
This is an example of the decoder loop that works for a single keyword but not for the multiple-keyword keylist method. When I try to use it for multiple keyword search, it returns an empty seg [].
~~~~
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() != None:
        print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print("Detected keyphrase, restarting search")
        decoder.end_utt()
        decoder.start_utt()
~~~~
Last edit: Justin Gosses 2017-11-21
It is detecting keywords are being said, but it is only returning True.
I do not see what do you mean by "returning true" here. The code you quote just prints the message, it does not return anything.<
Sorry, my language was not specific enough. if decoder.get_search() == 'keyword': evaluates to True because decoder.get_search() == 'keyword'; therefore, it prints the message. I'd like to get the word and timestamp, not just that a keyword was seen and a message printed.<<
Can you give an example line of code using decoder.set_kws method that would return a timestamp?
It is more or less provided above<
The code above doesn't get me what I'm asking for. I just found a reference, https://cmusphinx.github.io/wiki/faq/#q-how-to-implement-hot-word-listening, which says "This feature is not yet implemented in sphinx4 decoder." This is at the end of the "Q: How to implement 'Hot word listening'" entry. I am using modeldir = "../../../sphinx/pocketsphinx-5prealpha/model". Can I not get this to work because it is not implemented in pocketsphinx-5prealpha? (I'm using Python code.) Maybe multiple keyword search that can return timestamps is still not implemented in 2017?
When I try to use it for multiple keyword search, it returns an empty seg [].
It means no keywords are detected. You can change threshold to get more detections.<
I've tried very large and very small values for the detection threshold in the keyword.list. Changing the threshold doesn't result in any change in code behavior. I've used 1e50, 1e-20, 1e-15, and 1. When I try single keyword search, detection works fine with similar thresholds.<<
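Since changing the threshold makes no difference, one thing that may be worth ruling out is a mismatch between the keyword list and the pronunciation dictionary: the list in the post uses upper-case entries (MICROBE), while the single-keyword example that works uses lower case ('microbes'). The sketch below checks every word of every list entry against the dictionary. It assumes the dictionary has one entry per line with the word first and its phones after it, and that lookups are case-sensitive; neither point is confirmed in this thread. The function names and the sample dictionary lines are illustrative only.

```python
def load_dictionary_words(dict_lines):
    """Collect the words defined in a pronunciation dictionary."""
    words = set()
    for line in dict_lines:
        parts = line.split()
        if not parts:
            continue
        # Drop an alternate-pronunciation suffix such as "(2)"
        words.add(parts[0].split("(")[0])
    return words


def missing_keywords(keylist_lines, dictionary_words):
    """Return the words from the keyword list that the dictionary lacks."""
    missing = []
    for line in keylist_lines:
        line = line.strip()
        if not line:
            continue
        # Everything before the first "/" is the phrase; the rest is the threshold
        phrase = line.split("/")[0].strip()
        for word in phrase.split():
            if word not in dictionary_words:
                missing.append(word)
    return missing


# Illustrative sample data, not copied from the real dictionary file
sample_dict = [
    "microbe M AY K R OW B",
    "microbes M AY K R OW B Z",
    "birthday B ER TH D EY",
]
sample_keylist = ["MICROBE /1e-15/", "microbes /1e-15/", "BIRTHDAY /1e-20/"]

not_found = missing_keywords(sample_keylist, load_dictionary_words(sample_dict))
print(not_found)
```

With real files, the two arguments would be `open('keylist.list')` and `open(os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))`; an empty result would rule this explanation out.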
Ok, start this with formatting your post properly. Double quoted text should be within double quotes, quoted text within single quotes and your response without quotes.
If anyone has gotten multiple keyword search via a keyword list to work using the Python wrapper of PocketSphinx, I'd very much appreciate a link to example code. For now, I'll either loop over the keywords and search each audio file once per keyword, which is slow, or call the command-line pocketsphinx multiple keyword search from Python, which is less than ideal. Both work, but I'd still like to do multiple keyword search from within Python pocketsphinx if I can, as I suspect it would be significantly faster.
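For the per-keyword loop workaround, the separate result lists can be combined into a single time-ordered list so the output resembles what a multiple keyword search would give. A minimal sketch follows; `merge_detections` is a hypothetical helper, and the 'birthday' tuple is made-up sample data (only the 'microbes' tuple comes from the post).

```python
def merge_detections(per_keyword_results):
    """Flatten {keyword: [(word, prob, start_frame, end_frame), ...]}
    into one list sorted by start frame."""
    merged = [seg for segs in per_keyword_results.values() for seg in segs]
    merged.sort(key=lambda seg: seg[2])  # index 2 is start_frame
    return merged


results = {
    "microbes": [('microbes', -898, 1551, 1607)],
    "birthday": [('birthday', -1200, 300, 362)],  # invented values for illustration
}
timeline = merge_detections(results)
print(timeline)
```

This does not remove the cost of decoding the file once per keyword; it only tidies the output.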
I'm using the Python PocketSphinx implementation.
Looking for better documentation
You need to use a keyword list file for that. You can configure it with or with the decoder.set_kws method.
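The reply above names two ways to attach a keyword list, but the first one is missing from the text as captured here. Assuming it refers to PocketSphinx's `-kws` configuration option (an assumption, not something this thread confirms), the two routes could look like the sketch below. `CallRecorder` is a made-up stand-in so the snippet runs without PocketSphinx installed; with the real library, `config` would come from `Decoder.default_config()` and `decoder` from `Decoder(config)`.

```python
def use_kws_config_option(config, keylist_path):
    """Route 1 (assumed): set the keyword list on the config
    before the Decoder is created from it."""
    config.set_string('-kws', keylist_path)
    return config


def use_set_kws(decoder, keylist_path, search_name='keyword'):
    """Route 2 (from the thread): register the list on an existing
    decoder under a name, then make that search active."""
    decoder.set_kws(search_name, keylist_path)
    decoder.set_search(search_name)
    return decoder


class CallRecorder:
    """Hypothetical stand-in that records calls instead of decoding."""

    def __init__(self):
        self.calls = []

    def set_string(self, key, value):
        self.calls.append(("set_string", key, value))

    def set_kws(self, name, path):
        self.calls.append(("set_kws", name, path))

    def set_search(self, name):
        self.calls.append(("set_search", name))


config = use_kws_config_option(CallRecorder(), "keylist.list")
decoder = use_set_kws(CallRecorder(), "keylist.list")
print(config.calls)
print(decoder.calls)
```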
The code you wrote should print timestamps for multiple keywords; just try with a longer example. Keywords that are very close together are not detected.
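To try this on a longer recording, the polling loop from the single-keyword example can be wrapped so that it collects every detection instead of printing it. This is only a sketch: it assumes a keyword-list search reports detections through `hyp()` and `seg()` in the same way the keyphrase search does, which this thread does not confirm. `FakeDecoder` and `Seg` are made-up stand-ins so the snippet runs without PocketSphinx or audio; a real `Decoder` would be passed in their place.

```python
import io
from collections import namedtuple

Seg = namedtuple("Seg", "word prob start_frame end_frame")


def collect_detections(decoder, stream, chunk_size=1024):
    """Feed the stream to the decoder chunk by chunk and gather
    (word, prob, start_frame, end_frame) for every detection."""
    detections = []
    decoder.start_utt()
    while True:
        buf = stream.read(chunk_size)
        if not buf:
            break
        decoder.process_raw(buf, False, False)
        if decoder.hyp() is not None:
            # Read the segments before ending the utterance
            detections.extend(
                (seg.word, seg.prob, seg.start_frame, seg.end_frame)
                for seg in decoder.seg()
            )
            decoder.end_utt()
            decoder.start_utt()
    decoder.end_utt()
    return detections


class FakeDecoder:
    """Hypothetical stand-in: returns one scripted result per chunk."""

    def __init__(self, scripted):
        self._scripted = list(scripted)
        self._current = None

    def start_utt(self):
        self._current = None

    def end_utt(self):
        pass

    def process_raw(self, buf, no_search, full_utt):
        self._current = self._scripted.pop(0) if self._scripted else None

    def hyp(self):
        return self._current

    def seg(self):
        return [self._current] if self._current else []


# Three chunks of silence; the fake decoder "detects" a keyword on the second
fake = FakeDecoder([None, Seg('microbes', -898, 1551, 1607), None])
found = collect_detections(fake, io.BytesIO(b"\0" * 3072))
print(found)
```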
It is detecting keywords are being said, but it is only returning True.
Specifically, these two lines combine to return true and print "got keyword":
~~~~
if decoder.get_search() == 'keyword':
    print('got keyword')
~~~~
Can you give an example line of code using decoder.set_kws method that would return a timestamp?