CMU Sphinx / Forums / Help: Python Pocketsphinx decoder documentation? What to get time stamps of multiple keywords?

I'm using the Python pocketsphinx implementation.

Looking for better documentation

I've looked at both the pydocs and the Python tutorials. I have not found Python documentation for all the methods of decoder that show up when I do dir(decoder), which is making it hard to solve my problem myself. I've also looked at the official tutorial pages on SourceForge. The test examples on the GitHub page are useful (https://github.com/cmusphinx/pocketsphinx/tree/master/swig/python/test) but don't cover what I'm trying to do, which is to get the timing information of multiple keywords in an audio file. If there is any documentation of that which I missed, a link would be greatly appreciated.

for this problem

Specifically, what I want to do is run the Python version of PocketSphinx to identify the timestamps where various keywords occur in an audio file. I want to find multiple keywords from a keyword list. I can get timestamps when doing transcription and single keyword search, but can't figure out how to get them when searching for multiple keywords. Ideally, I'd also get a probability score like in transcription.

This is my code so far:

import sys, os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

modeldir  = "../../../sphinx/pocketsphinx-5prealpha/model"
datadir  = "sounds"

##### keylist = "MICROBE /1e-15/
##### MICROBES /1e-15/
##### BIRTHDAY /1e-20/"

##### Create a decoder with certain model
config = Decoder.default_config()

config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))

decoder = Decoder(config)
##### Add searches
decoder.set_kws('keyword', 'keylist.list')
decoder.set_search('keyword')

##### Open file to read the data
stream = open(os.path.join(datadir, 'microbes_2.wav'), 'rb')

#####Process audio chunk by chunk. On keyphrase detected perform action and restart search
decoder.start_utt()
in_speech_bf = False
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() != in_speech_bf:
            in_speech_bf = decoder.get_in_speech()
            if not in_speech_bf:
                decoder.end_utt()              
                if decoder.get_search() == 'keyword':
                    print('got keyword')
                    print("keywords listening for is: ",decoder.get_kws('keyword'))
                    print ("seg = ",[(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
                else:
                    print("didn't get keyword")
                decoder.start_utt()    
    else:
         break

I can use alternative code (below) that looks for only one keyword and gets a result that includes a timestamp. I can't figure out how to get this style of result for multiple timestamps.

### Example code for a single keyword that gets me a timestamp: 
import sys, os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

modeldir  = "../../../sphinx/pocketsphinx-5prealpha/model"
datadir  = "sounds"

##### Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
config.set_string('-keyphrase', 'microbes')
config.set_float('-kws_threshold', 1e+20)
##### Open file to read the data
stream = open(os.path.join(datadir, 'microbes_2.wav'), 'rb')

##### Process audio chunk by chunk. On keyphrase detected perform action and restart search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
         decoder.process_raw(buf, False, False)
    else:
         break
    if decoder.hyp() != None:
        print ([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print ("Detected keyphrase, restarting search")
        decoder.end_utt()
        decoder.start_utt()

Basically, I'd like to get the same output as this line:

~~~~
print ([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
~~~~

which is:

~~~~
[('microbes', -898, 1551, 1607)]
~~~~

but for multiple keywords.
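The start_frame and end_frame values in that output are frame indices, not seconds. A minimal sketch of the conversion follows, assuming the decoder runs at 100 frames per second (which I believe is the PocketSphinx default for `-frate`; adjust the constant if your configuration differs). The helper names are hypothetical and not part of the pocketsphinx API.

```python
# Assumption: 100 frames per second, the usual PocketSphinx default (-frate).
FRAMES_PER_SECOND = 100


def frame_to_seconds(frame, frames_per_second=FRAMES_PER_SECOND):
    """Convert a decoder frame index to seconds."""
    return frame / frames_per_second


def segments_to_times(segments, frames_per_second=FRAMES_PER_SECOND):
    """Convert (word, prob, start_frame, end_frame) tuples to second-based tuples."""
    return [
        (word, prob,
         frame_to_seconds(start, frames_per_second),
         frame_to_seconds(end, frames_per_second))
        for word, prob, start, end in segments
    ]


# The tuple printed by the single-keyword example above.
print(segments_to_times([('microbes', -898, 1551, 1607)]))
```

Under that assumption, the detection above would span roughly 15.5 to 16.1 seconds, measured from the point where the current utterance was started.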

Last edit: Nickolay V. Shmyrev 2017-11-21

Nickolay V. Shmyrev - 2017-11-21

You need to use keyword list file for that. You can configure it with

config.set_string ('-kws', 'file_path')

or with decoder.set_kws method.
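For reference, here is a minimal sketch of generating such a keyword list file from Python. It follows the one-phrase-per-line layout with a `/threshold/` suffix shown in the commented keylist in the original post. The helper name `write_keyword_list` and the temporary path are hypothetical. The phrases are written in lowercase to match the `microbes` keyphrase used in the working single-keyword example; whether letter case matters for the problem in this thread is not established here.

```python
import os
import tempfile


def write_keyword_list(path, thresholds):
    """Write one 'phrase /threshold/' line per entry of the thresholds dict."""
    with open(path, "w") as handle:
        for phrase, threshold in thresholds.items():
            handle.write("{} /{:g}/\n".format(phrase, threshold))


# Hypothetical location; any writable path works.
keylist_path = os.path.join(tempfile.mkdtemp(), "keylist.list")
write_keyword_list(keylist_path, {"microbe": 1e-15,
                                  "microbes": 1e-15,
                                  "birthday": 1e-20})

# The file can then be handed to the decoder in either way named above:
#   config.set_string('-kws', keylist_path)
#   decoder.set_kws('keyword', keylist_path)
```

Each phrase gets its own threshold, so detection sensitivity can be tuned per keyword.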

Justin Gosses - 2017-11-21

Thanks for the quick reply, Nickolay. I'm already using the keyword list and the decoder.set_kws method in my code above. (This is the code under the "This is my code so far:" heading, not the code under the "Example code for a single keyword that gets me a timestamp:" heading.) The code using the keyword list and the decoder.set_kws method detects that one of my keywords was said, but I'm only getting a TRUE returned. I'd like to get which word was said and the time in the file when it was said.

- Nickolay V. Shmyrev - 2017-11-21
  
  The code you wrote should print timestamps for multiple keywords; just try it with a longer example. Keywords that are very close to each other are not detected.
  
  - Justin Gosses - 2017-11-21
    
    It is detecting keywords are being said, but it is only returning True.
    
    Specifically, these two lines combine to return true and print "got keyword"
    if decoder.get_search() == 'keyword':
        print('got keyword')
    
    Can you give an example line of code using decoder.set_kws method that would return a timestamp?
    
  - Justin Gosses - 2017-11-21
    
    Also, when I tried to use the single-keyword decoder syntax for multiple keywords, the seg in decoder.seg() is empty.
    
    This is an example of the decoder method that works for single keyword but not multiple keyword keylist method. When I try to use it for multiple keyword search, it returns an empty seg [].
    
    while True:
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
        else:
            break
        if decoder.hyp() != None:
            print ([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
            print ("Detected keyphrase, restarting search")
            decoder.end_utt()
            decoder.start_utt()
    
    Last edit: Justin Gosses 2017-11-21
    
    - Nickolay V. Shmyrev - 2017-11-22
      
      > It is detecting keywords are being said, but it is only returning True.
      
      I do not see what you mean by "returning true" here. The code you quote just prints the message; it does not return anything.
      
      > Can you give an example line of code using decoder.set_kws method that would return a timestamp?
      
      It is more or less provided above.
      
      > When I try to use it for multiple keyword search, it returns an empty seg [].
      
      It means no keywords are detected. You can change the threshold to get more detections.
      
      - Justin Gosses - 2017-11-22
        
        It is detecting keywords are being said, but it is only returning True.
        
        I do not see what do you mean by "returning true" here. The code you quote just prints the message, it does not return anything.<
        
        Sorry, my language was not specific enough. The condition if decoder.get_search() == 'keyword': evaluates to True because decoder.get_search() returns 'keyword', so the message is printed. I'd like to get the word and timestamp, not just an indication that a keyword was seen and a message printed.<<
        
        Can you give an example line of code using decoder.set_kws method that would return a timestamp?
        
        It is more or less provided above<
        
        The code above doesn't get me what I'm asking for. I just found a reference, https://cmusphinx.github.io/wiki/faq/#q-how-to-implement-hot-word-listening, that says "This feature is not yet implemented in sphinx4 decoder." This is at the end of the "Q: How to implement “Hot word listening”" entry. I am using modeldir = "../../../sphinx/pocketsphinx-5prealpha/model". Can I not get this to work because it is not implemented in pocketsphinx-5prealpha? (I'm using Python code.) Maybe multiple keyword search that can return timestamps is still not implemented in 2017?
        
        When I try to use it for multiple keyword search, it returns an empty seg [].
        
        It means no keywords are detected. You can change threshold to get more detections.<
        
        I've tried very large and very small values for the detection threshold in the keyword.list file. Changing the threshold doesn't result in any change in behavior. I've used 1e50, 1e-20, 1e-15, and 1. When I try single keyword search, detection works fine with similar thresholds. <<
        
        Nickolay V. Shmyrev - 2017-11-22
        
        Ok, start this with formatting your post properly. Double quoted text should be within double quotes, quoted text within single quotes and your response without quotes.
        
Justin Gosses - 2017-11-29

If anyone has gotten multiple keyword search via a keyword list to work using the Python wrapper of PocketSphinx, I'd very much appreciate a link to example code. For now, I'll either use a loop to search each audio file once per keyword, which is slow, or call the command-line pocketsphinx multiple keyword search from Python, which is less than ideal. Both work, but I'd still like to do multiple keyword search from within Python pocketsphinx if I can, as I suspect it would be significantly faster.
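The per-keyword loop described above could be sketched roughly as follows. This is only an illustration of the workaround, not a fix for the keyword-list search. The function names and the `make_decoder` factory are hypothetical: the factory is assumed to return a decoder configured for one keyphrase, for example via `-keyphrase` and `-kws_threshold` as in the single-keyword example earlier in the thread. The decoder is only required to expose the methods already used in that example.

```python
import io


def scan_stream(decoder, stream, chunk_size=1024):
    """Feed raw audio to a configured decoder and collect detected segments."""
    detections = []
    decoder.start_utt()
    while True:
        buf = stream.read(chunk_size)
        if not buf:
            break
        decoder.process_raw(buf, False, False)
        if decoder.hyp() is not None:
            detections.extend(
                (seg.word, seg.prob, seg.start_frame, seg.end_frame)
                for seg in decoder.seg()
            )
            # Restart the utterance so the search can fire again.
            # Assumption to verify: frame indices may count from this restart.
            decoder.end_utt()
            decoder.start_utt()
    decoder.end_utt()
    return detections


def scan_for_each_keyword(make_decoder, audio_bytes, keywords):
    """Run one full pass over the audio per keyword and group the results."""
    results = {}
    for keyword in keywords:
        # make_decoder is a caller-supplied (hypothetical) factory.
        decoder = make_decoder(keyword)
        results[keyword] = scan_stream(decoder, io.BytesIO(audio_bytes))
    return results
```

The cost grows linearly with the number of keywords because the audio is decoded once per keyword, which matches the "slow" caveat above.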
