Pocketsphinx does not work on ev3dev

MotoR
2017-04-22
2017-04-23
  • MotoR

    MotoR - 2017-04-22

    I tried to use Pocketsphinx on ev3dev on a Lego EV3 brick. I built and installed the sphinxbase-5prealpha and pocketsphinx-5prealpha libraries as described at http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx. But the test fails: when I run "pocketsphinx_continuous -inmic yes", the program freezes at "INFO: ngram_search_fwdtree.c(186): Creating search channels" and then, after about 30 minutes, "Killed" appears. What is wrong?

    I use the microphone of a USB webcam. The microphone works fine: I can record sound with it successfully.

    robot@ev3dev:~$ pocketsphinx_continuous -inmic yes
    INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/en-us/en-us/feat.params
    Current configuration:
    [NAME]                  [DEFLT]         [VALUE]
    -agc                    none            none
    -agcthresh              2.0             2.000000e+00
    -allphone
    -allphone_ci            no              no
    -alpha                  0.97            9.700000e-01
    -ascale                 20.0            2.000000e+01
    -aw                     1               1
    -backtrace              no              no
    -beam                   1e-48           1.000000e-48
    -bestpath               yes             yes
    -bestpathlw             9.5             9.500000e+00
    -ceplen                 13              13
    -cmn                    live            batch
    -cmninit                40,3,-1         41.00,-5.29,-0.12,5.09,2.48,-4.07,-1.37,-1.78,-5.08,-2.05,-6.45,-1.42,1.17
    -compallsen             no              no
    -debug                                  0
    -dict                                   /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
    -dictcase               no              no
    -dither                 no              no
    -doublebw               no              no
    -ds                     1               1
    -fdict
    -feat                   1s_c_d_dd       1s_c_d_dd
    -featparams
    -fillprob               1e-8            1.000000e-08
    -frate                  100             100
    -fsg
    -fsgusealtpron          yes             yes
    -fsgusefiller           yes             yes
    -fwdflat                yes             yes
    -fwdflatbeam            1e-64           1.000000e-64
    -fwdflatefwid           4               4
    -fwdflatlw              8.5             8.500000e+00
    -fwdflatsfwin           25              25
    -fwdflatwbeam           7e-29           7.000000e-29
    -fwdtree                yes             yes
    -hmm                                    /usr/local/share/pocketsphinx/model/en-us/en-us
    -input_endian           little          little
    -jsgf
    -keyphrase
    -kws
    -kws_delay              10              10
    -kws_plp                1e-1            1.000000e-01
    -kws_threshold          1               1.000000e+00
    -latsize                5000            5000
    -lda
    -ldadim                 0               0
    -lifter                 0               22
    -lm                                     /usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin
    -lmctl
    -lmname
    -logbase                1.0001          1.000100e+00
    -logfn
    -logspec                no              no
    -lowerf                 133.33334       1.300000e+02
    -lpbeam                 1e-40           1.000000e-40
    -lponlybeam             7e-29           7.000000e-29
    -lw                     6.5             6.500000e+00
    -maxhmmpf               30000           30000
    -maxwpf                 -1              -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr              0               0
    -mixw
    -mixwfloor              0.0000001       1.000000e-07
    -mllr
    -mmap                   yes             yes
    -ncep                   13              13
    -nfft                   512             512
    -nfilt                  40              25
    -nwpen                  1.0             1.000000e+00
    -pbeam                  1e-48           1.000000e-48
    -pip                    1.0             1.000000e+00
    -pl_beam                1e-10           1.000000e-10
    -pl_pbeam               1e-10           1.000000e-10
    -pl_pip                 1.0             1.000000e+00
    -pl_weight              3.0             3.000000e+00
    -pl_window              5               5
    -rawlogdir
    -remove_dc              no              no
    -remove_noise           yes             yes
    -remove_silence         yes             yes
    -round_filters          yes             yes
    -samprate               16000           1.600000e+04
    -seed                   -1              -1
    -sendump
    -senlogdir
    -senmgau
    -silprob                0.005           5.000000e-03
    -smoothspec             no              no
    -svspec                                 0-12/13-25/26-38
    -tmat
    -tmatfloor              0.0001          1.000000e-04
    -topn                   4               4
    -topn_beam              0               0
    -toprule
    -transform              legacy          dct
    -unit_area              yes             yes
    -upperf                 6855.4976       6.800000e+03
    -uw                     1.0             1.000000e+00
    -vad_postspeech         50              50
    -vad_prespeech          20              20
    -vad_startspeech        10              10
    -vad_threshold          2.0             2.000000e+00
    -var
    -varfloor               0.0001          1.000000e-04
    -varnorm                no              no
    -verbose                no              no
    -warp_params
    -warp_type              inverse_linear  inverse_linear
    -wbeam                  7e-29           7.000000e-29
    -wip                    0.65            6.500000e-01
    -wlen                   0.025625        2.562500e-02
    
    INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
    INFO: acmod.c(162): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(518): Reading model definition: /usr/local/share/pocketsphinx/model/en-us/en-us/mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/en-us/en-us/mdef
    INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
    INFO: tmat.c(149): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/en-us/en-us/transition_matrices
    INFO: acmod.c(113): Attempting to use PTM computation module
    INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us/means
    INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
    INFO: ms_gauden.c(244):  128x13
    INFO: ms_gauden.c(244):  128x13
    INFO: ms_gauden.c(244):  128x13
    INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us/variances
    INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
    INFO: ms_gauden.c(244):  128x13
    INFO: ms_gauden.c(244):  128x13
    INFO: ms_gauden.c(244):  128x13
    INFO: ms_gauden.c(304): 222 variance values floored
    INFO: ptm_mgau.c(476): Loading senones from dump file /usr/local/share/pocketsphinx/model/en-us/en-us/sendump
    INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
    INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
    INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
    INFO: ptm_mgau.c(838): Maximum top-N: 4
    INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
    INFO: dict.c(320): Allocating 138824 * 20 bytes (2711 KiB) for word entries
    INFO: dict.c(333): Reading main dictionary: /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
    INFO: dict.c(213): Dictionary size 134723, allocated 1016 KiB for strings, 1679 KiB for phones
    INFO: dict.c(336): 134723 words read
    INFO: dict.c(358): Reading filler dictionary: /usr/local/share/pocketsphinx/model/en-us/en-us/noisedict
    INFO: dict.c(213): Dictionary size 134728, allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(361): 5 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
    INFO: dict2pid.c(132): Allocated 21336 bytes (20 KiB) for word-final triphones
    INFO: dict2pid.c(196): Allocated 21336 bytes (20 KiB) for single-phone word triphones
    INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
    INFO: ngram_search_fwdtree.c(74): Initializing search tree
    INFO: ngram_search_fwdtree.c(101): 791 unique initial diphones
    INFO: ngram_search_fwdtree.c(186): Creating search channels
    Killed
    robot@ev3dev:~$
    
     
    • Nickolay V. Shmyrev

      64 MB of memory is not enough to run pocketsphinx continuous speech recognition. You can only use pocketsphinx for a few commands with a very simple model, not with a large vocabulary.
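
To see how much memory the brick actually has free before starting the recognizer, something like the following Python sketch may help. It only reads the standard Linux /proc/meminfo file; the function names are made up for this example, and the "Killed" line at the end of the log is merely consistent with the process running out of memory, not proof of it.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def total_and_available_mb(text):
    """Return (total, available) memory in MB from /proc/meminfo text."""
    info = parse_meminfo(text)
    total = info.get("MemTotal", 0) / 1024
    # MemAvailable is missing on older kernels; fall back to MemFree.
    available = info.get("MemAvailable", info.get("MemFree", 0)) / 1024
    return total, available


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        total, available = total_and_available_mb(f.read())
    print("total: %.1f MB, available: %.1f MB" % (total, available))
```

Running it once before and once while pocketsphinx_continuous is loading the model should show whether available memory drops toward zero.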

       
  • MotoR

    MotoR - 2017-04-23

    Thanks. Actually, I only want to use 5-7 words to control the EV3. How do I start working with a simple model if I am using Python?

     
    • Nickolay V. Shmyrev

      Collect data for your specific commands (the tutorial recommends 1-5 hours). Train an acoustic model following the tutorial and use it with a small grammar. Overall, it is not easy to fit a recognizer on such limited hardware; you would be better off with a more powerful robot.
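
As a rough illustration of the "small grammar" part only, the Python sketch below writes a JSGF grammar for a handful of command words and cuts the large dictionary down to just those words. The command words and the output file names (commands.gram, commands.dict) are hypothetical; the dictionary path and the -inmic, -jsgf and -dict options are the ones shown in the configuration dump in the first post. It does not train an acoustic model, and whether the result fits into the EV3's memory is untested.

```python
import os

# Hypothetical command words; replace with your own 5-7 words.
WORDS = ["forward", "back", "left", "right", "stop"]
# Dictionary path as printed in the log in the first post.
FULL_DICT = "/usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict"


def build_jsgf(words, grammar_name="commands"):
    """Return a JSGF grammar that accepts exactly one of the given words."""
    alternatives = " | ".join(words)
    return ("#JSGF V1.0;\n"
            "grammar %s;\n"
            "public <command> = %s;\n" % (grammar_name, alternatives))


def filter_dictionary(dict_lines, words):
    """Keep only the dictionary lines for the wanted words."""
    wanted = set(w.lower() for w in words)
    kept = []
    for line in dict_lines:
        fields = line.split()
        if not fields:
            continue
        # Alternate pronunciations look like "word(2)"; strip the suffix.
        base = fields[0].split("(")[0].lower()
        if base in wanted:
            kept.append(line.rstrip("\n"))
    return kept


if __name__ == "__main__" and os.path.exists(FULL_DICT):
    with open(FULL_DICT) as f:
        small_dict = filter_dictionary(f, WORDS)
    with open("commands.dict", "w") as f:
        f.write("\n".join(small_dict) + "\n")
    with open("commands.gram", "w") as f:
        f.write(build_jsgf(WORDS))
    print("pocketsphinx_continuous -inmic yes "
          "-jsgf commands.gram -dict commands.dict")
```

The printed command line can then be tried on the brick in place of the plain "pocketsphinx_continuous -inmic yes" call; because a grammar is given, the large en-us.lm.bin language model should not be needed for the search.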

       
  • MotoR

    MotoR - 2017-04-27

    Thanks for the theory, but how do I do it in practice? Which apps do I need to use? Which commands?

     
