CMU Sphinx / Forums / Help: Python - PocketSphinx [Errno Input overflowed - 9981]

Hello! First I want to thank you all for any inputs and tips! I attached my code in the end and I have a couple questions and issues here:

I am trying to write a Python program that turns on an LED in response to spoken commands on a Raspberry Pi 3. I want to add a specific activation phrase and then execute one of the TWO commands from my very small language model. After much research, I realized that I have to use "keyword" mode on the decoder to activate listening on my Pi (e.g. "hello pi"), and THEN switch the decoder to 'lm' mode to decode one of the two possible commands (turn on led / turn off led). Is this understanding correct? What is the difference between set_keyphrase and set_kws? Can set_kws work on a phrase ("ok pi")?
I have been able to successfully record and decode voice from terminal with my USB microphone with the following command:

arecord -f cd -c 1 -D plughw:0,0 -r 16k test.wav

I know my USB microphone records at a 44100 Hz sample rate by default, but with this command I manually changed it to 16k. However, when I configure pyaudio.open (rate = 16000), it gives me a sample rate error. When I switch to rate = 44100 (the USB default rate), no error occurs. Can I somehow make my USB microphone sample at lower than default rate? If not, is there any USB microphone you guys have used that has default sample rate at 16k Hz?

Even using the default sound card on Pi (rate = 16000), within 2 seconds of running program, it gives me an overflowing error:

buf = stream.read(1024)
ERROR: [Error Input overflowed] -9981

I am already using a 3-sentence language model generated online. Does anyone have any input on this? Running my USB microphone at a 44100 Hz sample rate also makes the overflow issue worse.

My code:

import sys,os, time
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

#initialize decoder configuration
config = Decoder.default_config()
config.set_string('-hmm', '/home/pi/pocketsphinx-5prealpha/model/en-us/en-us')
config.set_string('-dict', '/home/pi/pocketsphinx-5prealpha/8000.dic')

#set up decoder search mode with defined language model
decoder = Decoder(config)

#lm = NGramModel(config, decoder.get_logmath(), path.join(modeldir, '8000.lm'))
#decoder.set_lm('8000', lm)
decoder.set_lm_file('lm', '/home/pi/pocketsphinx-5prealpha/8000.lm')
decoder.set_kws('keyword', 'keyword.list')
decoder.set_search('keyword')

#PyAudio set up
import pyaudio
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
              channels=1,
              rate=16000, #rate = 16000
              input=True,
              frames_per_buffer=1024,
              input_device_index=3)
stream.start_stream()

#silence to speech/speech to silence indicator 
utterance_started = False

#decode input speech
decoder.start_utt()
while True:
    #print 'running'
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() !=utterance_started:
            utterance_started = decoder.get_in_speech()
            if not utterance_started:
                decoder.end_utt()
                print 'status decoder.get_search():', decoder.get_search()

                try:
                    if decoder.hyp().hypstr != None:
                        print 'hypothesis:', decoder.hyp().hypstr
                except AttributeError:
                    pass
                time.sleep(1)

                if decoder.get_search() == 'keyword':
                    decoder.set_search('lm')
                else:
                    decoder.set_search('keyword')

                    decoder.start_utt()
    else:
        break
decoder.end_utt()

Is this understanding correct?

No, you can use keyword spotting for the commands without an activation keyphrase, since your commands are simple.

What is difference between set_keyphrase and set_kws?

set_keyphrase configures a single activation keyphrase from a string; set_kws configures multiple keyphrases from a file.

Can set_kws work on a phrase("ok pi")?

Yes, if you put the phrase in a file.
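
For reference, a keyword list file of the kind passed to set_kws usually holds one phrase per line, optionally followed by a detection threshold between slashes, and every word in a phrase has to be present in the dictionary. A minimal sketch that writes such a file; the phrases, thresholds and helper name are illustrative assumptions, and thresholds normally need tuning per phrase.

```python
import os
import tempfile

def write_keyword_list(path, phrases):
    """Write one 'phrase /threshold/' entry per line."""
    with open(path, "w") as f:
        for phrase, threshold in phrases:
            f.write("%s /%g/\n" % (phrase, threshold))

# Hypothetical phrases and thresholds, for illustration only.
keyword_path = os.path.join(tempfile.gettempdir(), "keyword.list")
write_keyword_list(keyword_path, [("hello pi", 1e-20), ("ok pi", 1e-15)])

# Read the file back to show its contents.
with open(keyword_path) as f:
    keyword_lines = f.read().splitlines()
print(keyword_lines)
```

The resulting path is what would be given as the second argument of decoder.set_kws('keyword', ...).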

Can I somehow make my USB microphone sample at lower than default rate?

You can configure ALSA to do resampling, or you can use PulseAudio, which will do resampling automatically.

If not, is there any USB microphone you guys have used that has default sample rate at 16k Hz?

It depends on the sound card, not the microphone. It is fine to record at 44.1 kHz; you just need to do the resampling properly.

ERROR: [Error Input overflowed] -9981

This means the software is too slow. You can debug it by running the same recognition on a prerecorded audio file and collecting the application logs.
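
One way to reason about "too slow": each stream.read(1024) at 16000 Hz covers only 64 ms of audio, so any loop iteration that takes longer than that lets the capture buffer fall behind. A rough sketch of the arithmetic; the helper name is made up for illustration, and the real capacity of the capture buffer depends on the driver.

```python
def buffer_duration_ms(frames, sample_rate):
    """Milliseconds of audio contained in `frames` frames."""
    return 1000.0 * frames / sample_rate

small = buffer_duration_ms(1024, 16000)   # read size used in the script above
large = buffer_duration_ms(8192, 44100)   # a larger read at 44100 Hz

# A one-second pause in the loop, such as time.sleep(1), spans many reads.
reads_missed_per_second = 1000.0 / small

print(small, large, reads_missed_per_second)
```

Recent PyAudio releases also accept exception_on_overflow=False in stream.read(), which drops data instead of raising; that hides the symptom rather than fixing it, so it is worth checking first whether anything in the loop blocks for longer than one read.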

Jing Yu - 2017-03-29

Thank you so much for such a fast reply! I will look into running a prerecorded file. One question: I understand that since my phrases are simple I can just use keyword spotting directly, but I really need to have an activation first and THEN commands, so that I can prevent any false positive commands. Is there any way I can do that with PocketSphinx? Thank you again for all the time, help, and consideration!!

Last edit: Jing Yu 2017-03-29

- Nickolay V. Shmyrev - 2017-03-29
  
  Is there any way i can do that with pocket sphinx?
  
  It is implemented in the code above.
  
  - Jing Yu - 2017-03-29
    
    So I have 2 findings after running the example codes in swig:
    
    1. Running kws_test.py as shipped was successful, but as soon as I change the ".dic" from the US English dictionary to my own generated .dic (which still contains "go", "forward" and "meters"), the program does not recognize anything any more.
    2. I then restored the example code to its original state. After making sure it ran, I commented out the line stream = open(os.path.join(datadir, "goforward.raw"), "rb") and uncommented the PyAudio code that was originally in the example file, with ONE change: I added input_device_index=5 to the p.open line:
    
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, input_device_index=5, frames_per_buffer=1024)
    
    When I run the program, it errored with the following:
    Traceback: return pa.read_stream(self._stream, num_frames)
    IOError: [Errno Input overflowed] -9981
    
    Please advise why these two issues are happening. I feel like I have exhausted all the online resources, so I am kind of desperate right now. I really, really appreciate your selfless time and consideration!
    
    Last edit: Jing Yu 2017-03-29
    
    - Nickolay V. Shmyrev - 2017-03-29
      
      To get help on this issue you need to provide:
      
      1) All data files you are using
      2) Model of raspberry pi you have
      3) Complete pocketsphinx log output.
      4) Complete code you are running
      
      - Jing Yu - 2017-03-30
        
        1) I tested on this file:https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/kws_test.py, using standard files (en-us) also from github.
        
        2) I use a Raspberry Pi 3 on Raspbian Jessie.
        
        3) After some searching on the internet I am still not exactly sure how to get the log file, so I just ran "python kws_test.py" on the command line and saw the error output. I had a lot of trouble configuring my USB sound card before, so I uninstalled PulseAudio and made changes to ~/.asoundrc to have my USB card on 0 and the default PCM on 1.
        
        ALSA lib pcm_dsnoop.c:618:(snd_pcm_dsnoop_open) unable to open slave
        ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
        ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
        ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused
        ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused
        ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
        Cannot connect to server socket err = No such file or directory
        Cannot connect to server request channel
        jack server is not running or cannot be started
        Segmentation fault
        
        4) I was running this example file from github: file:https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/kws_test.py
        Ps: many thanks for all the wonderful examples you wrote!!!!
        
        Please let me know if there is anything I can provide. I really appreciate your help!!
        
        
        Nickolay V. Shmyrev - 2017-03-30
        
        To debug a segmentation fault you need to run Python under gdb and collect a backtrace when it crashes:
        
        gdb --args python kws_test.py
        
        then type run to run the command. When it crashes type bt and save the output.
        
        I recommend to test decoding from a file instead of microphone first.
        

I guess a more important question is: how do I resample after setting PyAudio to 44.1k? Thank you!

You do not need to resample. You can process 44.1 kHz audio; you just need to add these options:

config.set_float('-samprate', 44100.0)
config.set_int('-nfft', 2048)

This is awesome and super helpful!!

I ran the code with prerecorded audio and had no problem. It's PyAudio that is my pain point. Even a simple record-playback Python script using PyAudio gives me the "overflowed" error code. Below is my code:

import sys,os, time
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

#initialize decoder configuration
config = Decoder.default_config()
config.set_string('-hmm', '/home/pi/pocketsphinx-5prealpha/model/en-us/en-us')
config.set_string('-dict', '/home/pi/pocketsphinx-5prealpha/8000.dic')
config.set_float('-samprate', 44100.0)
config.set_int('-nfft', 2048)

#set up decoder search mode with defined language model
decoder = Decoder(config)

#lm = NGramModel(config, decoder.get_logmath(), path.join(modeldir, '8000.lm'))
#decoder.set_lm('8000', lm)
decoder.set_lm_file('lm', '/home/pi/pocketsphinx-5prealpha/8000.lm')
decoder.set_kws('keyword', 'keyword.list')
decoder.set_search('keyword')

#PyAudio set up
import pyaudio
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
              channels=1,
              rate=44100, #rate = 16000
              input=True,
              input_device_index=5, 
              frames_per_buffer=8192)
stream.start_stream()

#silence to speech/speech to silence indicator 
utterance_started = False

#decode input speech
decoder.start_utt()
while True:
    #print 'running'
    buf = stream.read(8198)
    if buf:
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() !=utterance_started:
            utterance_started = decoder.get_in_speech()
            if not utterance_started:
                decoder.end_utt()
                print 'status decoder.get_search():', decoder.get_search()

                try:
                    if decoder.hyp().hypstr != None:
                        print 'hypothesis:', decoder.hyp().hypstr
                except AttributeError:
                    pass

                if decoder.get_search() == 'keyword':
                    decoder.set_search('lm')
                    print '-----set to lm mode now'
                else:
                    decoder.set_search('keyword')
                    print '-----set to keyword mode now'
                    decoder.start_utt()

    else:
        break
decoder.end_utt()

The error output looks like this:

Starting program: /usr/bin/python kw_to_grammar.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/feat.params
Current configuration:
[NAME]          [DEFLT]     [VALUE]
-agc            none        none
-agcthresh      2.0     2.000000e+00
-allphone               
-allphone_ci        no      no
-alpha          0.97        9.700000e-01
-ascale         20.0        2.000000e+01
-aw         1       1
-backtrace      no      no
-beam           1e-48       1.000000e-48
-bestpath       yes     yes
-bestpathlw     9.5     9.500000e+00
-ceplen         13      13
-cmn            live        batch
-cmninit        40,3,-1     41.00,-5.29,-0.12,5.09,2.48,-4.07,-1.37,-1.78,-5.08,-2.05,-6.45,-1.42,1.17
-compallsen     no      no
-debug                  0
-dict                   /home/pi/pocketsphinx-5prealpha/8000.dic
-dictcase       no      no
-dither         no      no
-doublebw       no      no
-ds         1       1
-fdict                  
-feat           1s_c_d_dd   1s_c_d_dd
-featparams             
-fillprob       1e-8        1.000000e-08
-frate          100     100
-fsg                    
-fsgusealtpron      yes     yes
-fsgusefiller       yes     yes
-fwdflat        yes     yes
-fwdflatbeam        1e-64       1.000000e-64
-fwdflatefwid       4       4
-fwdflatlw      8.5     8.500000e+00
-fwdflatsfwin       25      25
-fwdflatwbeam       7e-29       7.000000e-29
-fwdtree        yes     yes
-hmm                    /home/pi/pocketsphinx-5prealpha/model/en-us/en-us
-input_endian       little      little
-jsgf                   
-keyphrase              
-kws                    
-kws_delay      10      10
-kws_plp        1e-1        1.000000e-01
-kws_threshold      1       1.000000e+00
-latsize        5000        5000
-lda                    
-ldadim         0       0
-lifter         0       22
-lm                 
-lmctl                  
-lmname                 
-logbase        1.0001      1.000100e+00
-logfn                  
-logspec        no      no
-lowerf         133.33334   1.300000e+02
-lpbeam         1e-40       1.000000e-40
-lponlybeam     7e-29       7.000000e-29
-lw         6.5     6.500000e+00
-maxhmmpf       30000       30000
-maxwpf         -1      -1
-mdef                   
-mean                   
-mfclogdir              
-min_endfr      0       0
-mixw                   
-mixwfloor      0.0000001   1.000000e-07
-mllr                   
-mmap           yes     yes
-ncep           13      13
-nfft           512     2048
-nfilt          40      25
-nwpen          1.0     1.000000e+00
-pbeam          1e-48       1.000000e-48
-pip            1.0     1.000000e+00
-pl_beam        1e-10       1.000000e-10
-pl_pbeam       1e-10       1.000000e-10
-pl_pip         1.0     1.000000e+00
-pl_weight      3.0     3.000000e+00
-pl_window      5       5
-rawlogdir              
-remove_dc      no      no
-remove_noise       yes     yes
-remove_silence     yes     yes
-round_filters      yes     yes
-samprate       16000       4.410000e+04
-seed           -1      -1
-sendump                
-senlogdir              
-senmgau                
-silprob        0.005       5.000000e-03
-smoothspec     no      no
-svspec                 0-12/13-25/26-38
-tmat                   
-tmatfloor      0.0001      1.000000e-04
-topn           4       4
-topn_beam      0       0
-toprule                
-transform      legacy      dct
-unit_area      yes     yes
-upperf         6855.4976   6.800000e+03
-uw         1.0     1.000000e+00
-vad_postspeech     50      50
-vad_prespeech      20      20
-vad_startspeech    10      10
-vad_threshold      2.0     2.000000e+00
-var                    
-varfloor       0.0001      1.000000e-04
-varnorm        no      no
-verbose        no      no
-warp_params                
-warp_type      inverse_linear  inverse_linear
-wbeam          7e-29       7.000000e-29
-wip            0.65        6.500000e-01
-wlen           0.025625    2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(162): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(149): Reading HMM transition probability matrices: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/means
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(244):  128x13
INFO: ms_gauden.c(244):  128x13
INFO: ms_gauden.c(244):  128x13
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/variances
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(244):  128x13
INFO: ms_gauden.c(244):  128x13
INFO: ms_gauden.c(244):  128x13
INFO: ms_gauden.c(304): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(838): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4107 * 20 bytes (80 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /home/pi/pocketsphinx-5prealpha/8000.dic
INFO: dict.c(213): Dictionary size 6, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 6 words read
INFO: dict.c(358): Reading filler dictionary: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/noisedict
INFO: dict.c(213): Dictionary size 11, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 21336 bytes (20 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 21336 bytes (20 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_model_trie.c(365): Header doesn't match
INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
INFO: ngram_model_trie.c(193): LM of order 3
INFO: ngram_model_trie.c(195): #1-grams: 7
INFO: ngram_model_trie.c(195): #2-grams: 8
INFO: ngram_model_trie.c(195): #3-grams: 6
INFO: lm_trie.c(474): Training quantizer
INFO: lm_trie.c(482): Building LM trie
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 5 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 143
INFO: ngram_search_fwdtree.c(333): Created 5 root, 15 non-root channels, 5 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: kws_search.c(406): KWS(beam: -1080, plp: -23, default threshold 0, delay 10)
ALSA lib pcm_dsnoop.c:618:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
[New Thread 0x7188d460 (LWP 1195)]
ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused

[Thread 0x7188d460 (LWP 1195) exited]
[New Thread 0x7188d460 (LWP 1196)]
ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused

[Thread 0x7188d460 (LWP 1196) exited]
ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
[New Thread 0x7588e460 (LWP 1197)]
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
[Thread 0x7588e460 (LWP 1197) exited]

Program received signal SIGSEGV, Segmentation fault.
0x76fae11c in ?? () from /usr/lib/python2.7/dist-packages/_portaudio.so

I read through the error and it seems that PyAudio cannot open my USB microphone. I have already set frames_per_buffer to be a lot bigger but it still overflows. Should I set it higher?
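
A small sketch for checking which input device index to pass instead of guessing. It assumes the dictionaries returned by PyAudio's get_device_info_by_index(), which carry 'index', 'name', 'maxInputChannels' and 'defaultSampleRate' keys; the sample entries and the helper name are invented so the snippet runs without audio hardware.

```python
def input_devices(device_infos):
    """Keep only devices that can capture audio, as (index, name, rate)."""
    return [(d["index"], d["name"], int(d["defaultSampleRate"]))
            for d in device_infos
            if d["maxInputChannels"] > 0]

# Invented sample data. With PyAudio installed, the list could be built via:
#   p = pyaudio.PyAudio()
#   infos = [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
sample_infos = [
    {"index": 0, "name": "onboard output", "maxInputChannels": 0,
     "defaultSampleRate": 44100.0},
    {"index": 1, "name": "USB microphone", "maxInputChannels": 1,
     "defaultSampleRate": 44100.0},
]

usable = input_devices(sample_infos)
print(usable)
```

Only indices that appear in this filtered list are worth passing as input_device_index.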

Jing Yu - 2017-03-30

If this is helpful, in python shell the error returns:

Traceback (most recent call last):
  File "/home/pi/kw_to_grammar.py", line 46, in <module>
    decoder.end_utt()
  File "/usr/local/lib/python2.7/dist-packages/pocketsphinx/pocketsphinx.py", line 321, in end_utt
    return _pocketsphinx.Decoder_end_utt(self)
RuntimeError: Decoder_end_utt returned -1
- Nickolay V. Shmyrev - 2017-03-30
  
  I asked you to provide a backtrace above. I will not ask a third time.
  
  - Jing Yu - 2017-03-30
    
    I am so sorry about the miscommunication. I tried to use gdb's backtrace but it returned "No stack.". I searched online to see if I could resolve this, but nobody seemed to be talking about "No stack". I then tried the Python debugger pdb and went through my program line by line, but the output seemed to be the same as the one I posted. I apologize that I could not offer more detailed info as you asked. I understand this is very difficult, so I am very thankful for your time! If you have any comments I am more than willing to learn and try. Thank you!
    
    - Nickolay V. Shmyrev - 2017-03-31
      
      To debug a segmentation fault you need to run Python under gdb and collect a backtrace when it crashes:
      
      gdb --args python kws_test.py
      
      then type run to run the command. When it crashes type bt and save the output.
      
      See also https://wiki.mageia.org/en/Debugging_software_crashes
      
      - Jing Yu - 2017-04-03
        
        Hello Nickolay!
        
        Thank you for all the help and patience. I have resolved all the PyAudio-related issues, and I now have no errors and can record smoothly. The only problem I have encountered is that when I try to configure the decoder per your earlier recommendation:
        
        config.set_float('-samprate', 44100.0)
        
        it gives me a RuntimeError saying that new_decoder returned -1.
        
        Once I commented out this line everything works perfectly. However, the recognition is really bad, which is why I want to match the sample rates. Any thoughts? Maybe this line does not work with the pocketsphinx-5prealpha version?
        
        BIG THANKS FOR HELPING ME GET THIS FAR!!!!
        
        Last edit: Jing Yu 2017-04-03
        
        
        Nickolay V. Shmyrev - 2017-04-03
        
        You can process 44.1 khz audio, you need to add TWO options TOGETHER:
        
        config.set_float('-samprate', 44100.0)
        config.set_int('-nfft', 2048)
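
The likely reason the two options go together: the FFT size has to be a power of two at least as large as the number of samples in one analysis window (-wlen times -samprate). With the 0.025625 s window shown in the configuration dump earlier in this thread, 16000 Hz needs about 410 samples and fits in the default 512, while 44100 Hz needs about 1130 samples and so requires 2048. A sketch of that calculation; the function name is hypothetical and this is not PocketSphinx's own code.

```python
def min_nfft(sample_rate, window_length=0.025625):
    """Smallest power of two that holds one analysis window of samples."""
    window_samples = int(sample_rate * window_length)
    nfft = 1
    while nfft < window_samples:
        nfft *= 2
    return nfft

print(min_nfft(16000))
print(min_nfft(44100))
```

Setting -samprate to 44100 while leaving -nfft at 512 would leave the window larger than the FFT, which is consistent with the decoder failing to initialize.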
        

Hey guys. I'm trying to replicate the above code with the keyword 'Hello Computer'. I changed the dictionary paths appropriately, but I only have one keyphrase, which is in my dictionary (1489.dic). Just to be sure: in the line decoder.set_kws('keyword', 'keyword.list'), keyword.list is a path to a file containing keywords that is found in my home directory, and 'keyword' is a predefined argument for set_kws and not where I should put 'hello computer', right?

Secondly, here is my error output:

ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
INFO: cmn_live.c(120): Update from < 41.00 -5.29 -0.12  5.09  2.48 -4.07 -1.37 -1.78 -5.08 -2.05 -6.45 -1.42  1.17 >
INFO: cmn_live.c(138): Update to   < 27.74 14.31 12.36 -1.54  1.00 -0.99 -2.98 -1.79  0.55 -0.95 -1.67  1.60  1.33 >
INFO: kws_search.c(656): kws 0.30 CPU 0.357 xRT
INFO: kws_search.c(658): kws 0.98 wall 1.170 xRT
Result:
Traceback (most recent call last):
  File "key-gram-switch.py", line 41, in <module>
    print 'Result:', decoder.hyp().hypstr
AttributeError: 'NoneType' object has no attribute 'hypstr'
INFO: kws_search.c(448): TOTAL kws 0.00 CPU nan xRT
INFO: kws_search.c(451): TOTAL kws 0.00 wall nan xRT
INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 0.00 CPU nan xRT
INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.00 wall nan xRT
INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.00 CPU nan xRT
INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.00 wall nan xRT
INFO: ngram_search.c(303): TOTAL bestpath 0.00 CPU nan xRT
INFO: ngram_search.c(306): TOTAL bestpath 0.00 wall nan xRT
INFO: kws_search.c(448): TOTAL kws 0.30 CPU 0.361 xRT
INFO: kws_search.c(451): TOTAL kws 0.98 wall 1.184 xRT

my microphone is fully functional as well.

Last edit: Bari Tala 2017-04-09

Hey sorry, here is my code.

#!/usr/bin/env python

import sys, os
import pyaudio
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

MODELDIR = "/home/pi/pocketsphinx-5prealpha/model"
datadir = "/home/pi/"

#Init decoder
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(MODELDIR, 'en-us/en-us'))
config.set_string('-dict', os.path.join(MODELDIR, '1489.dic'))
config.set_float('-kws_threshold', 1e-20)

decoder = Decoder(config)

# Add searches
#decoder.set_kws('keyword', '/home/pi/keyword.list')
decoder.set_keyphrase("keyword", "HELLO COMPUTER")
decoder.set_lm_file('lm', '/home/pi/pocketsphinx-5prealpha/model/1489.lm')
decoder.set_search('keyword')

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

in_speech_bf = False
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() != in_speech_bf:
            in_speech_bf = decoder.get_in_speech()
            if not in_speech_bf:
                decoder.end_utt()

                # Print hypothesis and switch search to another mode
                print 'Result:', decoder.hyp().hypstr

                if decoder.get_search() == 'keyword':
                     decoder.set_search('lm')
                else:
                     decoder.set_search('keyword')

                decoder.start_utt()
    else:
        break
decoder.end_utt()
stream.end_stream()

I'm getting the error as above. But SOMETIMES when I run it and quickly say 'Hello Computer' it will print 'Hello Computer' but then it will give the same error.

Last edit: Bari Tala 2017-04-09

Nickolay V. Shmyrev - 2017-04-09

You need to check hyp for None before accessing hypstr.


Bari Tala - 2017-04-09

Hey Nickolay, thanks for the quick response. I'm trying to add an if statement as below. But still getting the same error. You're saying I need to make sure decoder.hyp() is not empty, right?

while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() != in_speech_bf:
            in_speech_bf = decoder.get_in_speech()
            if not in_speech_bf:
                decoder.end_utt()
                if type(decoder.hyp()) is not None:  # make sure not NoneType
                    # Print hypothesis and switch search to another mode
                    print 'Result:', decoder.hyp().hypstr

                if decoder.get_search() == 'keyword':
                    decoder.set_search('lm')
                else:
                    decoder.set_search('keyword')

                decoder.start_utt()
    else:
        break
decoder.end_utt()
stream.end_stream()

Hey figured it out. i changed it now to:

if decoder.hyp() is not None:
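
The earlier check failed because type(x) returns a class object, which is never None itself, so a test of the form "type(x) is not None" is always true even when x is None. A short demonstration with a stand-in value, no decoder needed:

```python
hyp = None  # stand-in for decoder.hyp() when nothing was recognized

wrong_check = type(hyp) is not None   # compares the NoneType class with None
right_check = hyp is not None         # compares the value itself with None

print(wrong_check, right_check)
```

The first expression stays true for every possible value, so the guarded code always runs.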

After I say 'Hello Computer' then pause and say 'Down' ( or any other word in my dictionary) it again gives me a decoder.hyp() of None. It then won't throw the error anymore, but seems to switch to a continuous recognition instead of switching back to keyword requiring.

Last edit: Bari Tala 2017-04-09
- Nickolay V. Shmyrev - 2017-04-09
  
  http://stackoverflow.com/questions/23086383/how-to-test-nonetype-in-python
  

Bari Tala - 2017-04-09

SUCCESS!
First, an explanation for those wondering what's going on, and for other Python newbies like myself. While listening, decoder.hyp() returns None if it doesn't recognize the keyword, and the script terminates when you access hypstr on that None. So just make sure you do the keyword-to-continuous switching only when decoder.hyp() isn't None (i.e. when it has heard a keyword).
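
Roughly, that logic can be sketched as a pure function that picks the next search mode from the current mode and the hypothesis. The names 'keyword' and 'lm' follow the scripts in this thread; next_search and the FakeHyp class are made up for illustration and stand in for the real decoder objects.

```python
class FakeHyp(object):
    """Illustrative stand-in for the object returned by decoder.hyp()."""
    def __init__(self, hypstr):
        self.hypstr = hypstr

def next_search(current, hyp):
    """Return (next_mode, recognized_text_or_None)."""
    if hyp is None:
        # Nothing recognized: stay in the same mode and keep listening.
        return current, None
    # Something was recognized: toggle between keyword and lm modes.
    new_mode = "lm" if current == "keyword" else "keyword"
    return new_mode, hyp.hypstr

print(next_search("keyword", None))
print(next_search("keyword", FakeHyp("HELLO COMPUTER")))
print(next_search("lm", FakeHyp("DOWN")))
```

In the real loop the result would be fed to decoder.set_search(), and decoder.start_utt() still has to be called after every decoder.end_utt() whichever branch was taken. Whether an empty result in 'lm' mode should stay there or fall back to 'keyword' is a design choice; this sketch stays in the current mode.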

Thanks for the help. Great product and will keep you updated!

Bari

- ismail nur adli - 2020-04-26
  
  Hey Bari Tala, if you still have the code, can you share it with me? I'm kind of desperate.
  
  Thanks!
  

ismail nur adli - 2020-05-12

Last edit: ismail nur adli 2020-05-12
