Python - PocketSphinx [Errno Input overflowed - 9981]

Jing Yu
2017-03-29
2020-05-12
  • Jing Yu

    Jing Yu - 2017-03-29

    Hello! First, thank you all for any input and tips! My code is attached at the end. I have a couple of questions and issues:

    1. I am trying to write a Python program that turns on an LED in response to my voice on a Raspberry Pi 3. I want to add specific activation words and then execute one of the TWO commands from my super tiny language model. After much research, I concluded that I have to use "keyword" mode on the decoder to activate listening on my RPi (e.g. "hello pi"), and THEN switch the decoder to 'lm' mode to decode one of the two possible commands (turn on led / turn off led). Is this understanding correct? What is the difference between set_keyphrase and set_kws? Can set_kws work on a phrase("ok pi")?

    2. I have been able to successfully record and decode voice from terminal with my USB microphone with the following command:

    arecord -f cd -c 1 -D plughw:0,0 -r 16k test.wav

    I know my USB microphone records at a 44100 Hz sample rate by default, but this way I manually changed it to 16k. However, when I configure pyaudio.open (rate = 16000), it gives me a sample rate error. When I switch to rate = 44100 (the USB default rate), no error occurs. Can I somehow make my USB microphone sample at lower than default rate? If not, is there any USB microphone you guys have used that has default sample rate at 16k Hz?

    3. Even using the default sound card on the Pi (rate = 16000), within 2 seconds of running the program I get an overflow error:

    buf = stream.read(1024)
    ERROR: [Error Input overflowed] -9981

    I am already using a 3-sentence language model generated online. Does anyone have any input on this? Running my USB microphone at a 44100 Hz sample rate also makes the overflow issue worse...

    My code:

    import sys,os, time
    from pocketsphinx.pocketsphinx import *
    from sphinxbase.sphinxbase import *
    
    #initialize decoder configuration
    config = Decoder.default_config()
    config.set_string('-hmm', '/home/pi/pocketsphinx-5prealpha/model/en-us/en-us')
    config.set_string('-dict', '/home/pi/pocketsphinx-5prealpha/8000.dic')
    
    #set up decoder search mode with defined language model
    decoder = Decoder(config)
    
    #lm = NGramModel(config, decoder.get_logmath(), path.join(modeldir, '8000.lm'))
    #decoder.set_lm('8000', lm)
    decoder.set_lm_file('lm', '/home/pi/pocketsphinx-5prealpha/8000.lm')
    decoder.set_kws('keyword', 'keyword.list')
    decoder.set_search('keyword')
    
    #PyAudio set up
    import pyaudio
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16,
                  channels=1,
                  rate=16000, #rate = 16000
                  input=True,
                  frames_per_buffer=1024,
                  input_device_index=3)
    stream.start_stream()
    
    #silence to speech/speech to silence indicator 
    utterance_started = False
    
    #decode input speech
    decoder.start_utt()
    while True:
        #print 'running'
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
            if decoder.get_in_speech() !=utterance_started:
                utterance_started = decoder.get_in_speech()
                if not utterance_started:
                    decoder.end_utt()
                    print 'status decoder.get_search():', decoder.get_search()
    
                    hyp = decoder.hyp()  # None when nothing was recognized
                    if hyp is not None:
                        print 'hypothesis:', hyp.hypstr
                    time.sleep(1)  # note: the stream is not read during this pause
    
                    if decoder.get_search() == 'keyword':
                        decoder.set_search('lm')
                    else:
                        decoder.set_search('keyword')
    
                    # start_utt() must run after every end_utt(), in both modes
                    decoder.start_utt()
        else:
            break
    decoder.end_utt()
    
     
    • Nickolay V. Shmyrev

      Is this understanding correct?

      No, you can use keyword spotting for the commands directly, without an activation keyphrase, since your commands are simple.

      What is the difference between set_keyphrase and set_kws?

      set_keyphrase sets a single activation keyphrase from a string; set_kws configures multiple keyphrases from a file.

      Can set_kws work on a phrase("ok pi")?

      Yes if you put the phrase in a file.
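
For reference, a keyword list for set_kws is a plain text file with one phrase per line, each optionally followed by a detection threshold between slashes; every word in a phrase has to be present in the dictionary. The sketch below writes such a file. The helper name write_keyword_list, the second phrase's threshold, and the output location are assumptions made for illustration, and the threshold values are not tuned.

```python
import os
import tempfile


def write_keyword_list(path, phrases):
    """Write one 'phrase /threshold/' entry per line (pocketsphinx kws list format)."""
    with open(path, "w") as f:
        for phrase, threshold in phrases:
            f.write("%s /%s/\n" % (phrase, threshold))


# Illustrative phrases and thresholds; real thresholds need tuning per phrase.
keyword_path = os.path.join(tempfile.gettempdir(), "keyword.list")
write_keyword_list(keyword_path, [("hello pi", "1e-20"), ("ok pi", "1e-15")])

# Read the file back to show what the decoder would be given.
with open(keyword_path) as f:
    keyword_lines = f.read().splitlines()
print(keyword_lines)
```

The resulting path would then be passed as the second argument of decoder.set_kws('keyword', ...), as in the code at the top of the thread.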

      Can I somehow make my USB microphone sample at lower than default rate?

      You can configure ALSA to do the resampling, or you can use PulseAudio, which will resample automatically.

      If not, is there any USB microphone you guys have used that has default sample rate at 16k Hz?

      It depends on the sound card, not the microphone. It is fine to record at 44.1 kHz; you just need to do the resampling properly.
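
To make the idea of resampling concrete, here is a deliberately crude sketch that converts 16-bit mono PCM from 44100 Hz to 16000 Hz by linear interpolation. It is only an illustration: it has no anti-aliasing filter, so it is not "proper" resampling in the sense meant above, and in practice the ALSA plug layer, PulseAudio, or a DSP library would do this job. The function name resample_linear is made up for this example, and the code targets Python 3.

```python
import array


def resample_linear(pcm_bytes, src_rate, dst_rate):
    """Resample 16-bit mono PCM by linear interpolation (no anti-alias filter)."""
    src = array.array("h")
    src.frombytes(pcm_bytes)
    if len(src) == 0:
        return b""
    n_out = int(len(src) * dst_rate / src_rate)
    out = array.array("h")
    for i in range(n_out):
        # Position of output sample i on the input time axis.
        pos = i * src_rate / float(dst_rate)
        left = int(pos)
        right = min(left + 1, len(src) - 1)
        frac = pos - left
        out.append(int(round(src[left] * (1.0 - frac) + src[right] * frac)))
    return out.tobytes()


# One second of a constant signal at 44.1 kHz becomes 16000 samples at 16 kHz.
one_second = array.array("h", [1000] * 44100).tobytes()
resampled = resample_linear(one_second, 44100, 16000)
print(len(resampled) // 2)
```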

      ERROR: [Error Input overflowed] -9981

      This means the software is too slow. You can debug it by running the same recognition on a prerecorded audio file and collecting the application logs.
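
To put "too slow" into numbers: each stream.read(n) call hands over n frames, so on average everything the loop does between two reads has to finish within n / rate seconds, otherwise the input buffer fills up and the overflow error is raised. A small sketch of that arithmetic follows; the helper name chunk_duration_ms is an assumption for illustration.

```python
def chunk_duration_ms(frames, rate):
    """Milliseconds of audio contained in `frames` samples at `rate` Hz."""
    return 1000.0 * frames / rate


small = chunk_duration_ms(1024, 16000)   # 1024 frames at 16 kHz, as in the first listing
large = chunk_duration_ms(8192, 44100)   # a larger buffer at 44.1 kHz
print("%.1f ms, %.1f ms" % (small, large))
```

With 1024 frames at 16 kHz the budget is 64 ms per read, so the time.sleep(1) in the first listing alone spans about 15 buffer periods during which nothing is read from the stream.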

       
      • Jing Yu

        Jing Yu - 2017-03-29

        Thank you so much for such a fast reply! I will look into running a prerecorded file. One question: I understand that since my phrases are simple I can use keyword spotting directly, but I really need an activation phrase first and THEN the commands, so that I can prevent false-positive commands. Is there any way i can do that with pocket sphinx? Thank you again for all the time, help, and consideration!!

         

        Last edit: Jing Yu 2017-03-29
        • Nickolay V. Shmyrev

          Is there any way i can do that with pocket sphinx?

          It is implemented in the code above.
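
The keyword-then-command flow in the listing at the top of the thread boils down to a two-state switch. The sketch below isolates that logic from pocketsphinx so it can be read on its own. The class name TwoStageSwitch is hypothetical, and one behaviour is an assumption rather than what the original listing does: command mode is only armed when the keyword search actually produced a hypothesis, whereas the original switches modes after every utterance.

```python
class TwoStageSwitch(object):
    """Alternate between keyword spotting ('keyword') and command decoding ('lm')."""

    def __init__(self):
        self.mode = "keyword"

    def on_utterance_end(self, hypothesis):
        """Return a recognized command or None, and update the current mode."""
        if self.mode == "keyword":
            # Only arm command mode if the activation phrase was heard.
            if hypothesis is not None:
                self.mode = "lm"
            return None
        # In command mode: report the result and fall back to keyword mode.
        self.mode = "keyword"
        return hypothesis


switch = TwoStageSwitch()
utterances = [None, "hello pi", "turn on led", "turn off led"]
results = [switch.on_utterance_end(h) for h in utterances]
print(results, switch.mode)
```

In the real loop, switch.mode would be passed to decoder.set_search(...) between end_utt() and the next start_utt().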

           
          • Jing Yu

            Jing Yu - 2017-03-29

            So I have 2 findings after running the example code in swig:

            I was running kws_test.py successfully, but:
            1. As soon as I change the ".dic" from the US English dictionary to my own generated .dic (which still contains "go", "forward" and "meters"), the program does not recognize anything any more.
            2. I then restored the example to its original code. After making sure it runs, I commented out stream = open(os.path.join(datadir, "goforward.raw"), "rb") and uncommented what was originally in the example file with ONE change: I added input_device_index=5 to the p.open line:

                stream = p.open(format=pyaudio.paInt16,
                                channels=1,
                                rate=16000,
                                input=True,
                                input_device_index=5,
                                frames_per_buffer=1024)
            

            When I run the program, it errored with the following:
            Traceback: return pa.read_stream(self._stream, num_frames)
            IOError: [Errno Input overflowed] -9981

            Please advise why these two issues are happening... I feel like I have exhausted all the online resources, so I am kind of desperate right now... I really, really appreciate your selfless time and consideration!!!!

             

            Last edit: Jing Yu 2017-03-29
            • Nickolay V. Shmyrev

              To get help on this issue you need to provide:

              1) All data files you are using
              2) Model of raspberry pi you have
              3) Complete pocketsphinx log output.
              4) Complete code you are running

               
              • Jing Yu

                Jing Yu - 2017-03-30

                1) I tested on this file: https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/kws_test.py, using the standard files (en-us), also from GitHub.

                2) I use an RPi 3 running Raspbian Jessie.

                3) I am not exactly sure how to get the log file, even after some searching on the internet, so I just ran "python kws_test.py" on the command line and saw the error output below. I had a lot of trouble configuring my USB sound card before, so I uninstalled PulseAudio and changed ~/.asoundrc to put my USB card on 0 and the default PCM on 1.

                ALSA lib pcm_dsnoop.c:618:(snd_pcm_dsnoop_open) unable to open slave
                ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
                ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
                ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused
                
                ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused
                
                ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
                Cannot connect to server socket err = No such file or directory
                Cannot connect to server request channel
                jack server is not running or cannot be started
                Segmentation fault
                

                4) I was running this example file from GitHub: https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/kws_test.py
                Ps: many thanks for all the wonderful examples you wrote!!!!

                Please let me know if there is anything I can provide. I really appreciate your help!!

                 
                • Nickolay V. Shmyrev

                  To debug a segmentation fault, you need to run Python under gdb and collect a backtrace when it crashes:

                   gdb --args python kws_test.py
                  

                  Then type run to start the program. When it crashes, type bt and save the output.

                  I recommend testing decoding from a file instead of the microphone first.

                   
      • Jing Yu

        Jing Yu - 2017-03-30

        I guess a more important question is: how do I resample after setting PyAudio to 44.1k? Thank you!

         
        • Nickolay V. Shmyrev

          You do not need to resample. You can process 44.1 kHz audio directly; you just need to add these options:

          config.set_float('-samprate', 44100.0)
          config.set_int('-nfft', 2048)
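
A note on why the two options go together: the FFT has to cover one full analysis window. With the window length that appears in the configuration dump later in this thread (-wlen 0.025625), a window is 410 samples at 16 kHz but about 1130 samples at 44.1 kHz, so the default -nfft of 512 is no longer large enough. The sketch below works through that arithmetic, assuming the FFT size must be a power of two at least as large as the window; min_fft_size is a hypothetical helper, not part of pocketsphinx.

```python
import math


def min_fft_size(samprate, wlen=0.025625):
    """Smallest power of two that holds one analysis window of `wlen` seconds."""
    window_samples = int(math.ceil(samprate * wlen))
    size = 1
    while size < window_samples:
        size *= 2
    return size


fft_16k = min_fft_size(16000.0)
fft_44k = min_fft_size(44100.0)
print(fft_16k, fft_44k)
```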
          
           
          • Jing Yu

            Jing Yu - 2017-03-30

            This is awesome and super helpful!!

            I ran the code with pre-recorded audio and had no problem. It's PyAudio that is my pain point. Even a simple record-playback Python script using PyAudio gives me the "overflowed" error code. Below is my code:

            import sys,os, time
            from pocketsphinx.pocketsphinx import *
            from sphinxbase.sphinxbase import *
            
            #initialize decoder configuration
            config = Decoder.default_config()
            config.set_string('-hmm', '/home/pi/pocketsphinx-5prealpha/model/en-us/en-us')
            config.set_string('-dict', '/home/pi/pocketsphinx-5prealpha/8000.dic')
            config.set_float('-samprate', 44100.0)
            config.set_int('-nfft', 2048)
            
            #set up decoder search mode with defined language model
            decoder = Decoder(config)
            
            #lm = NGramModel(config, decoder.get_logmath(), path.join(modeldir, '8000.lm'))
            #decoder.set_lm('8000', lm)
            decoder.set_lm_file('lm', '/home/pi/pocketsphinx-5prealpha/8000.lm')
            decoder.set_kws('keyword', 'keyword.list')
            decoder.set_search('keyword')
            
            #PyAudio set up
            import pyaudio
            p = pyaudio.PyAudio()
            stream = p.open(format=pyaudio.paInt16,
                          channels=1,
                          rate=44100, #rate = 16000
                          input=True,
                          input_device_index=5, 
                          frames_per_buffer=8192)
            stream.start_stream()
            
            #silence to speech/speech to silence indicator 
            utterance_started = False
            
            #decode input speech
            decoder.start_utt()
            while True:
                #print 'running'
                buf = stream.read(8192)  # read one full buffer (frames_per_buffer)
                if buf:
                    decoder.process_raw(buf, False, False)
                    if decoder.get_in_speech() !=utterance_started:
                        utterance_started = decoder.get_in_speech()
                        if not utterance_started:
                            decoder.end_utt()
                            print 'status decoder.get_search():', decoder.get_search()
            
                            hyp = decoder.hyp()  # None when nothing was recognized
                            if hyp is not None:
                                print 'hypothesis:', hyp.hypstr
            
                            if decoder.get_search() == 'keyword':
                                decoder.set_search('lm')
                                print '-----set to lm mode now'
                            else:
                                decoder.set_search('keyword')
                                print '-----set to keyword mode now'
                            # start_utt() must run after every end_utt(), in both modes
                            decoder.start_utt()
            
                else:
                    break
            decoder.end_utt()
            

            The error output looks like this:

            Starting program: /usr/bin/python kw_to_grammar.py
            [Thread debugging using libthread_db enabled]
            Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
            INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/feat.params
            Current configuration:
            [NAME]          [DEFLT]     [VALUE]
            -agc            none        none
            -agcthresh      2.0     2.000000e+00
            -allphone               
            -allphone_ci        no      no
            -alpha          0.97        9.700000e-01
            -ascale         20.0        2.000000e+01
            -aw         1       1
            -backtrace      no      no
            -beam           1e-48       1.000000e-48
            -bestpath       yes     yes
            -bestpathlw     9.5     9.500000e+00
            -ceplen         13      13
            -cmn            live        batch
            -cmninit        40,3,-1     41.00,-5.29,-0.12,5.09,2.48,-4.07,-1.37,-1.78,-5.08,-2.05,-6.45,-1.42,1.17
            -compallsen     no      no
            -debug                  0
            -dict                   /home/pi/pocketsphinx-5prealpha/8000.dic
            -dictcase       no      no
            -dither         no      no
            -doublebw       no      no
            -ds         1       1
            -fdict                  
            -feat           1s_c_d_dd   1s_c_d_dd
            -featparams             
            -fillprob       1e-8        1.000000e-08
            -frate          100     100
            -fsg                    
            -fsgusealtpron      yes     yes
            -fsgusefiller       yes     yes
            -fwdflat        yes     yes
            -fwdflatbeam        1e-64       1.000000e-64
            -fwdflatefwid       4       4
            -fwdflatlw      8.5     8.500000e+00
            -fwdflatsfwin       25      25
            -fwdflatwbeam       7e-29       7.000000e-29
            -fwdtree        yes     yes
            -hmm                    /home/pi/pocketsphinx-5prealpha/model/en-us/en-us
            -input_endian       little      little
            -jsgf                   
            -keyphrase              
            -kws                    
            -kws_delay      10      10
            -kws_plp        1e-1        1.000000e-01
            -kws_threshold      1       1.000000e+00
            -latsize        5000        5000
            -lda                    
            -ldadim         0       0
            -lifter         0       22
            -lm                 
            -lmctl                  
            -lmname                 
            -logbase        1.0001      1.000100e+00
            -logfn                  
            -logspec        no      no
            -lowerf         133.33334   1.300000e+02
            -lpbeam         1e-40       1.000000e-40
            -lponlybeam     7e-29       7.000000e-29
            -lw         6.5     6.500000e+00
            -maxhmmpf       30000       30000
            -maxwpf         -1      -1
            -mdef                   
            -mean                   
            -mfclogdir              
            -min_endfr      0       0
            -mixw                   
            -mixwfloor      0.0000001   1.000000e-07
            -mllr                   
            -mmap           yes     yes
            -ncep           13      13
            -nfft           512     2048
            -nfilt          40      25
            -nwpen          1.0     1.000000e+00
            -pbeam          1e-48       1.000000e-48
            -pip            1.0     1.000000e+00
            -pl_beam        1e-10       1.000000e-10
            -pl_pbeam       1e-10       1.000000e-10
            -pl_pip         1.0     1.000000e+00
            -pl_weight      3.0     3.000000e+00
            -pl_window      5       5
            -rawlogdir              
            -remove_dc      no      no
            -remove_noise       yes     yes
            -remove_silence     yes     yes
            -round_filters      yes     yes
            -samprate       16000       4.410000e+04
            -seed           -1      -1
            -sendump                
            -senlogdir              
            -senmgau                
            -silprob        0.005       5.000000e-03
            -smoothspec     no      no
            -svspec                 0-12/13-25/26-38
            -tmat                   
            -tmatfloor      0.0001      1.000000e-04
            -topn           4       4
            -topn_beam      0       0
            -toprule                
            -transform      legacy      dct
            -unit_area      yes     yes
            -upperf         6855.4976   6.800000e+03
            -uw         1.0     1.000000e+00
            -vad_postspeech     50      50
            -vad_prespeech      20      20
            -vad_startspeech    10      10
            -vad_threshold      2.0     2.000000e+00
            -var                    
            -varfloor       0.0001      1.000000e-04
            -varnorm        no      no
            -verbose        no      no
            -warp_params                
            -warp_type      inverse_linear  inverse_linear
            -wbeam          7e-29       7.000000e-29
            -wip            0.65        6.500000e-01
            -wlen           0.025625    2.562500e-02
            
            INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
            INFO: acmod.c(162): Using subvector specification 0-12/13-25/26-38
            INFO: mdef.c(518): Reading model definition: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/mdef
            INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
            INFO: bin_mdef.c(336): Reading binary model definition: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/mdef
            INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
            INFO: tmat.c(149): Reading HMM transition probability matrices: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/transition_matrices
            INFO: acmod.c(113): Attempting to use PTM computation module
            INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/means
            INFO: ms_gauden.c(242): 42 codebook, 3 feature, size: 
            INFO: ms_gauden.c(244):  128x13
            INFO: ms_gauden.c(244):  128x13
            INFO: ms_gauden.c(244):  128x13
            INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/variances
            INFO: ms_gauden.c(242): 42 codebook, 3 feature, size: 
            INFO: ms_gauden.c(244):  128x13
            INFO: ms_gauden.c(244):  128x13
            INFO: ms_gauden.c(244):  128x13
            INFO: ms_gauden.c(304): 222 variance values floored
            INFO: ptm_mgau.c(476): Loading senones from dump file /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/sendump
            INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
            INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
            INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
            INFO: ptm_mgau.c(838): Maximum top-N: 4
            INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
            INFO: dict.c(320): Allocating 4107 * 20 bytes (80 KiB) for word entries
            INFO: dict.c(333): Reading main dictionary: /home/pi/pocketsphinx-5prealpha/8000.dic
            INFO: dict.c(213): Dictionary size 6, allocated 0 KiB for strings, 0 KiB for phones
            INFO: dict.c(336): 6 words read
            INFO: dict.c(358): Reading filler dictionary: /home/pi/pocketsphinx-5prealpha/model/en-us/en-us/noisedict
            INFO: dict.c(213): Dictionary size 11, allocated 0 KiB for strings, 0 KiB for phones
            INFO: dict.c(361): 5 words read
            INFO: dict2pid.c(396): Building PID tables for dictionary
            INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
            INFO: dict2pid.c(132): Allocated 21336 bytes (20 KiB) for word-final triphones
            INFO: dict2pid.c(196): Allocated 21336 bytes (20 KiB) for single-phone word triphones
            INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
            INFO: ngram_model_trie.c(365): Header doesn't match
            INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
            INFO: ngram_model_trie.c(193): LM of order 3
            INFO: ngram_model_trie.c(195): #1-grams: 7
            INFO: ngram_model_trie.c(195): #2-grams: 8
            INFO: ngram_model_trie.c(195): #3-grams: 6
            INFO: lm_trie.c(474): Training quantizer
            INFO: lm_trie.c(482): Building LM trie
            INFO: ngram_search_fwdtree.c(74): Initializing search tree
            INFO: ngram_search_fwdtree.c(101): 5 unique initial diphones
            INFO: ngram_search_fwdtree.c(186): Creating search channels
            INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 143
            INFO: ngram_search_fwdtree.c(333): Created 5 root, 15 non-root channels, 5 single-phone words
            INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
            INFO: kws_search.c(406): KWS(beam: -1080, plp: -23, default threshold 0, delay 10)
            ALSA lib pcm_dsnoop.c:618:(snd_pcm_dsnoop_open) unable to open slave
            ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
            ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
            [New Thread 0x7188d460 (LWP 1195)]
            ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused
            
            [Thread 0x7188d460 (LWP 1195) exited]
            [New Thread 0x7188d460 (LWP 1196)]
            ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused
            
            [Thread 0x7188d460 (LWP 1196) exited]
            ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
            [New Thread 0x7588e460 (LWP 1197)]
            Cannot connect to server socket err = No such file or directory
            Cannot connect to server request channel
            jack server is not running or cannot be started
            [Thread 0x7588e460 (LWP 1197) exited]
            
            Program received signal SIGSEGV, Segmentation fault.
            0x76fae11c in ?? () from /usr/lib/python2.7/dist-packages/_portaudio.so
            

            I read through the errors and it seems that PyAudio cannot open my USB microphone. I have already set frames_per_buffer to be a lot bigger, but it still overflows. Should I set it higher?

             
          • Jing Yu

            Jing Yu - 2017-03-30

            If this is helpful, in python shell the error returns:

            Traceback (most recent call last):
              File "/home/pi/kw_to_grammar.py", line 46, in <module>
                decoder.end_utt()
              File "/usr/local/lib/python2.7/dist-packages/pocketsphinx/pocketsphinx.py", line 321, in end_utt
                return _pocketsphinx.Decoder_end_utt(self)
            RuntimeError: Decoder_end_utt returned -1
            
             
            • Nickolay V. Shmyrev

              I asked you to provide a backtrace above. I will not ask a third time.

               
              • Jing Yu

                Jing Yu - 2017-03-30

                I am so sorry about the miscommunication. I tried to get a backtrace with gdb, but it returned "No stack.". I searched online to see if I could resolve this, but nobody seemed to be talking about "No stack.". I then tried the Python debugger pdb and stepped through my program line by line, but the output seemed to be the same as what I posted. I apologize that I could not offer the more detailed info you asked for. I understand this is very difficult, so I am very thankful for your time! If you have any comments, I am more than willing to learn and try. Thank you!

                 
                • Nickolay V. Shmyrev

                  To debug a segmentation fault, you need to run Python under gdb and collect a backtrace when it crashes:

                     gdb --args python kws_test.py
                  

                  Then type run to start the program. When it crashes, type bt and save the output.

                  See also https://wiki.mageia.org/en/Debugging_software_crashes

                   
                  • Jing Yu

                    Jing Yu - 2017-04-03

                    Hello Nickolay!

                    Thank you for all the help and patience. I have resolved all the PyAudio-related issues and now have no errors and can record smoothly. The only problem I have encountered is that when I try to configure the decoder per your earlier recommendation:

                    config.set_float('-samprate', 44100.0)

                    it gives me a RuntimeError saying new_decoder returned -1.

                    Once I comment out this line, everything works perfectly. However, the recognition is really bad, which is why I want to match the sample rates... Any thoughts? Maybe this line does not work with the pocketsphinx-5prealpha version?

                    BIG THANKS FOR HELPING ME GET THIS FAR!!!!

                     

                    Last edit: Jing Yu 2017-04-03
                    • Nickolay V. Shmyrev

                      You can process 44.1 kHz audio, but you need to add TWO options TOGETHER:

                      config.set_float('-samprate', 44100.0)
                      config.set_int('-nfft', 2048)
                      
                       
  • Bari Tala

    Bari Tala - 2017-04-09

    Hey guys. I'm trying to replicate the above code with the keyphrase 'Hello Computer'. I changed the dictionary paths appropriately, but I only have one keyphrase, which is in my dictionary (1489.dic). Just to be sure: in the line decoder.set_kws('keyword', 'keyword.list'), is keyword.list a path to a file containing keywords (found in my home directory), and is 'keyword' a predefined argument for set_kws rather than the place where I should put 'hello computer'?

    Secondly, here is my error output:

    ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
    ALSA lib pcm_dmix.c:1022:(snd_pcm_dmix_open) unable to open slave
    Cannot connect to server socket err = No such file or directory
    Cannot connect to server request channel
    jack server is not running or cannot be started
    INFO: cmn_live.c(120): Update from < 41.00 -5.29 -0.12  5.09  2.48 -4.07 -1.37 -1.78 -5.08 -2.05 -6.45 -1.42  1.17 >
    INFO: cmn_live.c(138): Update to   < 27.74 14.31 12.36 -1.54  1.00 -0.99 -2.98 -1.79  0.55 -0.95 -1.67  1.60  1.33 >
    INFO: kws_search.c(656): kws 0.30 CPU 0.357 xRT
    INFO: kws_search.c(658): kws 0.98 wall 1.170 xRT
    Result:
    Traceback (most recent call last):
      File "key-gram-switch.py", line 41, in <module>
        print 'Result:', decoder.hyp().hypstr
    AttributeError: 'NoneType' object has no attribute 'hypstr'
    INFO: kws_search.c(448): TOTAL kws 0.00 CPU nan xRT
    INFO: kws_search.c(451): TOTAL kws 0.00 wall nan xRT
    INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 0.00 CPU nan xRT
    INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.00 wall nan xRT
    INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.00 CPU nan xRT
    INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.00 wall nan xRT
    INFO: ngram_search.c(303): TOTAL bestpath 0.00 CPU nan xRT
    INFO: ngram_search.c(306): TOTAL bestpath 0.00 wall nan xRT
    INFO: kws_search.c(448): TOTAL kws 0.30 CPU 0.361 xRT
    INFO: kws_search.c(451): TOTAL kws 0.98 wall 1.184 xRT
    

    my microphone is fully functional as well.

     

    Last edit: Bari Tala 2017-04-09
  • Bari Tala

    Bari Tala - 2017-04-09

    Hey sorry, here is my code.

    #!/usr/bin/env python
    
    import sys, os
    import pyaudio
    from pocketsphinx.pocketsphinx import *
    from sphinxbase.sphinxbase import *
    
    MODELDIR = "/home/pi/pocketsphinx-5prealpha/model"
    datadir = "/home/pi/"
    
    #Init decoder
    config = Decoder.default_config()
    config.set_string('-hmm', os.path.join(MODELDIR, 'en-us/en-us'))
    config.set_string('-dict', os.path.join(MODELDIR, '1489.dic'))
    config.set_float('-kws_threshold', 1e-20)
    
    decoder = Decoder(config)
    
    # Add searches
    #decoder.set_kws('keyword', '/home/pi/keyword.list')
    decoder.set_keyphrase("keyword", "HELLO COMPUTER")
    decoder.set_lm_file('lm', '/home/pi/pocketsphinx-5prealpha/model/1489.lm')
    decoder.set_search('keyword')
    
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
    stream.start_stream()
    
    in_speech_bf = False
    decoder.start_utt()
    while True:
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
            if decoder.get_in_speech() != in_speech_bf:
                in_speech_bf = decoder.get_in_speech()
                if not in_speech_bf:
                    decoder.end_utt()
    
                    # Print hypothesis and switch search to another mode
                    print 'Result:', decoder.hyp().hypstr
    
                    if decoder.get_search() == 'keyword':
                         decoder.set_search('lm')
                    else:
                         decoder.set_search('keyword')
    
                    decoder.start_utt()
        else:
            break
    decoder.end_utt()
    stream.stop_stream()
    

    I'm getting the error shown above. But SOMETIMES, when I run it and quickly say 'Hello Computer', it prints 'Hello Computer' and then gives the same error.

     

    Last edit: Bari Tala 2017-04-09
    • Nickolay V. Shmyrev

      You need to check that hyp is not None before accessing hypstr.

       
  • Bari Tala

    Bari Tala - 2017-04-09

    Hey Nickolay, thanks for the quick response. I tried adding an if statement as shown below, but I'm still getting the same error. You're saying I need to make sure decoder.hyp() is not empty, right?

    while True:
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
            if decoder.get_in_speech() != in_speech_bf:
                in_speech_bf = decoder.get_in_speech()
                if not in_speech_bf:
                    decoder.end_utt()
    
                    if type(decoder.hyp()) is not None:       #make sure not NoneType
    
                        # Print hypothesis and switch search to another mode
                        print 'Result:', decoder.hyp().hypstr
    
                    if decoder.get_search() == 'keyword':
                         decoder.set_search('lm')
                    else:
                         decoder.set_search('keyword')
    
                    decoder.start_utt()
        else:
            break
    decoder.end_utt()
    stream.stop_stream()
    

    Hey, figured it out. I changed it to:

    if decoder.hyp() is not None:
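
    A short aside on why the first attempt, type(decoder.hyp()) is not None, changed nothing: type() always returns a class (NoneType when the value is None), and a class is never None itself, so that condition is always true. The sketch below shows the difference using a hypothetical Hypothesis stand-in rather than a real decoder, so the names in it are illustrative only.

```python
class Hypothesis(object):
    """Hypothetical stand-in for the object returned by decoder.hyp()."""
    def __init__(self, hypstr):
        self.hypstr = hypstr


def type_check(hyp):
    # type(hyp) is a class object (NoneType for None), never None itself,
    # so this comparison is True for every possible input.
    return type(hyp) is not None


def value_check(hyp):
    # Compares the value itself against None, which is the intended test.
    return hyp is not None


nothing_heard = None
keyword_heard = Hypothesis("HELLO COMPUTER")

print(type_check(nothing_heard), value_check(nothing_heard))
print(type_check(keyword_heard), value_check(keyword_heard))
```

    Only value_check tells the two cases apart, which matches the behaviour seen once the condition was changed.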

    After I say 'Hello Computer', pause, and then say 'Down' (or any other word in my dictionary), decoder.hyp() is None again. It no longer throws the error, but it seems to stay in continuous recognition instead of switching back to requiring the keyword.

     

    Last edit: Bari Tala 2017-04-09
  • Bari Tala

    Bari Tala - 2017-04-09

    SUCCESS!
    First, an explanation for anyone wondering what's going on, and for other Python newbies like myself. While listening, decoder.hyp() returns None if it doesn't recognize the keyword, and the script terminates with an AttributeError when you access hypstr on that None. So just make sure you do the keyword-to-continuous switching only when decoder.hyp() isn't None (i.e. when it has heard a keyword).
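
    A minimal sketch of that rule, not the exact script from this thread. finish_utterance is a hypothetical helper name, and StubDecoder / StubHypothesis are hypothetical stand-ins that expose only hyp(), get_search() and set_search() so the logic can be tried without a microphone or PocketSphinx installed.

```python
class StubHypothesis(object):
    """Hypothetical stand-in for the object returned by decoder.hyp()."""
    def __init__(self, hypstr):
        self.hypstr = hypstr


class StubDecoder(object):
    """Hypothetical stand-in exposing only the three calls used below."""
    def __init__(self, search, hypothesis=None):
        self._search = search
        self.hypothesis = hypothesis

    def hyp(self):
        return self.hypothesis

    def get_search(self):
        return self._search

    def set_search(self, name):
        self._search = name


def finish_utterance(decoder):
    """Meant to be called right after decoder.end_utt().

    Returns the recognized text, or None if nothing was recognized.
    The search is toggled only when there is a hypothesis.
    """
    hyp = decoder.hyp()
    if hyp is None:
        return None  # nothing heard: stay in the current search
    if decoder.get_search() == 'keyword':
        decoder.set_search('lm')
    else:
        decoder.set_search('keyword')
    return hyp.hypstr


decoder = StubDecoder('keyword')
print(finish_utterance(decoder), decoder.get_search())
decoder.hypothesis = StubHypothesis('HELLO COMPUTER')
print(finish_utterance(decoder), decoder.get_search())
```

    In the real loop, this would sit between decoder.end_utt() and decoder.start_utt(). One consequence of this design is that the decoder stays in 'lm' mode until a command is actually recognized, which may or may not be what you want.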

    Thanks for the help. Great product and will keep you updated!

    Bari

     
    • ismail nur adli

      ismail nur adli - 2020-04-26

      Hey Bari Tala, if you still have the code, could you share it with me? I'm kind of desperate.

      Thanks!

       
  • ismail nur adli

    ismail nur adli - 2020-05-12
     

    Last edit: ismail nur adli 2020-05-12
