CMU Sphinx / Forums / Help: keyword spotting way too slow

Hi there, I followed the tutorial and PocketSphinx works on my MIPS embedded system, whose CPU runs at 580 MHz. My goal is wake-word detection, so I trained my own acoustic model on three randomly picked words: "eight", "happy" and "dog". For each word there are about 1000 training recordings (the same word spoken by different people, taken from Google's open speech commands dataset). Keyword spotting also works on my embedded platform, but it is far too slow: on average, recognition takes 2 to 4 times as long as the recording itself. The log follows:

INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from my_db.ci_semi/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn live batch
-cmninit 40,3,-1 40,3,-1
-compallsen no no
-debug 0
-dict my_db.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd s2_4x
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm my_db.ci_semi
-input_endian little little
-jsgf
-keyphrase
-kws keywords.txt
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 20 20
-vad_startspeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: 's2_4x', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: mdef.c(518): Reading model definition: my_db.ci_semi/mdef
INFO: bin_mdef.c(181): Allocating 44 * 8 bytes (0 KiB) for CD tree
INFO: tmat.c(149): Reading HMM transition probability matrices: my_db.ci_semi/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: my_db.ci_semi/means
INFO: ms_gauden.c(242): 1 codebook, 4 feature, size:
INFO: ms_gauden.c(244): 256x12
INFO: ms_gauden.c(244): 256x24
INFO: ms_gauden.c(244): 256x3
INFO: ms_gauden.c(244): 256x12
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: my_db.ci_semi/variances
INFO: ms_gauden.c(242): 1 codebook, 4 feature, size:
INFO: ms_gauden.c(244): 256x12
INFO: ms_gauden.c(244): 256x24
INFO: ms_gauden.c(244): 256x3
INFO: ms_gauden.c(244): 256x12
INFO: ms_gauden.c(304): 0 variance values floored
INFO: ptm_mgau.c(808): Number of codebooks doesn't match number of ciphones, doesn't look like PTM: 1 != 10
INFO: acmod.c(115): Attempting to use semi-continuous computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: my_db.ci_semi/means
INFO: ms_gauden.c(242): 1 codebook, 4 feature, size:
INFO: ms_gauden.c(244): 256x12
INFO: ms_gauden.c(244): 256x24
INFO: ms_gauden.c(244): 256x3
INFO: ms_gauden.c(244): 256x12
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: my_db.ci_semi/variances
INFO: ms_gauden.c(242): 1 codebook, 4 feature, size:
INFO: ms_gauden.c(244): 256x12
INFO: ms_gauden.c(244): 256x24
INFO: ms_gauden.c(244): 256x3
INFO: ms_gauden.c(244): 256x12
INFO: ms_gauden.c(304): 0 variance values floored
INFO: s2_semi_mgau.c(1099): Reading mixture weights file 'my_db.ci_semi/mixture_weights'
INFO: s2_semi_mgau.c(1192): Read 30 x 4 x 256 mixture weights
INFO: s2_semi_mgau.c(1297): Maximum top-N: 4 Top-N beams: 0 0 0 0
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4102 * 20 bytes (80 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: my_db.dic
INFO: dict.c(213): Dictionary size 3, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 3 words read
INFO: dict.c(358): Reading filler dictionary: my_db.ci_semi/noisedict
INFO: dict.c(213): Dictionary size 6, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 3 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 10^3 * 2 bytes (1 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 1240 bytes (1 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 1240 bytes (1 KiB) for single-phone word triphones
INFO: kws_search.c(406): KWS(beam: -1080, plp: -23, default threshold 0, delay 10)
INFO: continuous.c(307): Bill ./pocketsphinx_continuous COMPILED ON: Oct 12 2017, AT: 00:43:00

INFO: continuous.c(252): Ready....
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 40.00 3.00 -1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_live.c(138): Update to < 27.33 0.52 1.37 -4.49 -0.44 -6.17 -3.14 -5.56 -3.70 -0.39 -2.37 -1.84 -0.34 >
INFO: kws_search.c(656): kws 9.99 CPU 2.035 xRT
INFO: kws_search.c(658): kws 10.89 wall 2.218 xRT
INFO: continuous.c(275): Ready....
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 27.33 0.52 1.37 -4.49 -0.44 -6.17 -3.14 -5.56 -3.70 -0.39 -2.37 -1.84 -0.34 >
INFO: cmn_live.c(138): Update to < 27.60 -0.02 0.50 -4.07 0.22 -6.29 -3.48 -5.81 -3.93 -0.76 -2.65 -1.42 -0.73 >
INFO: kws_search.c(656): kws 9.87 CPU 3.669 xRT
INFO: kws_search.c(658): kws 11.03 wall 4.100 xRT
INFO: continuous.c(275): Ready....
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(88): Update from < 27.60 -0.02 0.50 -4.07 0.22 -6.29 -3.48 -5.81 -3.93 -0.76 -2.65 -1.42 -0.73 >
INFO: cmn_live.c(105): Update to < 27.60 -0.00 0.72 -3.89 0.36 -6.40 -3.48 -5.74 -3.82 -0.71 -2.46 -1.39 -0.81 >
Input overrun, read calls are too rare (non-fatal)
INFO: cmn_live.c(120): Update from < 27.60 -0.00 0.72 -3.89 0.36 -6.40 -3.48 -5.74 -3.82 -0.71 -2.46 -1.39 -0.81 >
INFO: cmn_live.c(138): Update to < 27.13 0.43 0.70 -4.50 0.54 -6.79 -3.53 -5.84 -3.51 -0.53 -2.28 -1.02 -0.79 >
INFO: kws_search.c(656): kws 3.92 CPU 2.052 xRT
INFO: kws_search.c(658): kws 4.34 wall 2.273 xRT
INFO: continuous.c(275): Ready....
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(88): Update from < 27.13 0.43 0.70 -4.50 0.54 -6.79 -3.53 -5.84 -3.51 -0.53 -2.28 -1.02 -0.79 >
INFO: cmn_live.c(105): Update to < 30.11 -2.70 -1.85 -3.61 -0.89 -7.77 -1.84 -5.53 -5.68 1.30 -3.14 -1.91 -0.07 >
INFO: cmn_live.c(120): Update from < 30.11 -2.70 -1.85 -3.61 -0.89 -7.77 -1.84 -5.53 -5.68 1.30 -3.14 -1.91 -0.07 >
INFO: cmn_live.c(138): Update to < 33.45 -6.92 -3.16 -2.36 -2.19 -6.86 -1.05 -6.66 -6.49 2.41 -3.62 -3.17 1.30 >
INFO: kws_search.c(656): kws 13.11 CPU 3.553 xRT
INFO: kws_search.c(658): kws 14.50 wall 3.930 xRT
eight dog eight dog happy happy
INFO: continuous.c(275): Ready....
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(88): Update from < 33.45 -6.92 -3.16 -2.36 -2.19 -6.86 -1.05 -6.66 -6.49 2.41 -3.62 -3.17 1.30 >
INFO: cmn_live.c(105): Update to < 35.11 -8.28 -3.81 -2.73 -2.63 -6.96 -1.24 -5.75 -6.61 3.63 -4.51 -2.84 0.87 >
Input overrun, read calls are too rare (non-fatal)
INFO: cmn_live.c(120): Update from < 35.11 -8.28 -3.81 -2.73 -2.63 -6.96 -1.24 -5.75 -6.61 3.63 -4.51 -2.84 0.87 >
INFO: cmn_live.c(138): Update to < 39.26 -10.99 -8.18 -0.13 -4.18 -8.29 0.90 -4.27 -8.87 3.23 -4.23 -4.07 1.65 >
INFO: kws_search.c(656): kws 11.91 CPU 4.395 xRT
INFO: kws_search.c(658): kws 13.41 wall 4.947 xRT
happy happy
INFO: continuous.c(275): Ready....
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(88): Update from < 39.26 -10.99 -8.18 -0.13 -4.18 -8.29 0.90 -4.27 -8.87 3.23 -4.23 -4.07 1.65 >
INFO: cmn_live.c(105): Update to < 40.92 -12.52 -8.92 -0.11 -4.81 -8.77 1.83 -3.85 -9.60 3.78 -4.80 -3.97 1.83 >
INFO: cmn_live.c(120): Update from < 40.92 -12.52 -8.92 -0.11 -4.81 -8.77 1.83 -3.85 -9.60 3.78 -4.80 -3.97 1.83 >
INFO: cmn_live.c(138): Update to < 43.49 -14.38 -9.18 -0.10 -4.91 -8.16 2.19 -4.82 -9.69 4.61 -4.57 -4.79 2.26 >
INFO: kws_search.c(656): kws 11.78 CPU 3.192 xRT
INFO: kws_search.c(658): kws 12.95 wall 3.509 xRT
eight dog eight dog happy happy
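
To put a number on "2 to 4 times as long", the timing lines in the log above can be summarised with a short script. This is only a sketch: the function name `summarize_xrt` is made up for illustration, and it assumes that the xRT value is processing time divided by audio duration, so that dividing the reported seconds by xRT gives an estimate of how long each utterance was.

```python
import re

# Matches the timing lines printed by kws_search.c, e.g.
#   "INFO: kws_search.c(658): kws 10.89 wall 2.218 xRT"
TIMING = re.compile(r"kws\s+([\d.]+)\s+(CPU|wall)\s+([\d.]+)\s+xRT")


def summarize_xrt(log_text, kind="wall"):
    """Return (mean xRT, list of estimated audio seconds per utterance).

    `kind` selects the "wall" or "CPU" timing lines.
    """
    factors, durations = [], []
    for line in log_text.splitlines():
        m = TIMING.search(line)
        if m and m.group(2) == kind:
            seconds, xrt = float(m.group(1)), float(m.group(3))
            factors.append(xrt)
            # Assumption: xRT = processing seconds / audio seconds.
            durations.append(seconds / xrt)
    if not factors:
        raise ValueError("no '%s' timing lines found" % kind)
    return sum(factors) / len(factors), durations


# First two utterances copied from the log above.
SAMPLE = """\
INFO: kws_search.c(656): kws 9.99 CPU 2.035 xRT
INFO: kws_search.c(658): kws 10.89 wall 2.218 xRT
INFO: kws_search.c(656): kws 9.87 CPU 3.669 xRT
INFO: kws_search.c(658): kws 11.03 wall 4.100 xRT
"""

mean_xrt, audio_secs = summarize_xrt(SAMPLE)
print("mean wall xRT: %.3f" % mean_xrt)
print("estimated audio seconds:", ["%.2f" % s for s in audio_secs])
```

Anything above 1.0 xRT means the decoder cannot keep up with live audio, which would also fit the "Input overrun" warnings in the log.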

------------------------------------------------------------------------------------------------------------------

I tried changing the arguments to "-maxhmmpf 3000 -maxwpf 2 -pl_window 8 -ds 2 -topn 2", but it is still not fast enough. Is there a step I missed, or is my embedded platform just not powerful enough for keyword spotting? Also, is it normal that even when I say nothing, the log still shows lines like "INFO: cmn_live.c(120): Update from < ............... >
INFO: cmn_live.c(138): Update to < ........ >"?
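
For anyone who wants to reproduce the tuned run, the flags quoted above could be assembled into a command line roughly as follows. This is a sketch under assumptions: the helper `build_kws_command` is hypothetical, the model, dictionary and keyword file names are taken from the log, and `-inmic yes` is a guess at how the microphone input was enabled, since the exact invocation is not shown in the post.

```python
import shlex


def build_kws_command(hmm="my_db.ci_semi", dictionary="my_db.dic",
                      kws="keywords.txt", tuning=None):
    """Build the argv list for pocketsphinx_continuous in keyword-spotting mode."""
    if tuning is None:
        # Speed-related values tried in the post above.
        tuning = {"-maxhmmpf": 3000, "-maxwpf": 2, "-pl_window": 8,
                  "-ds": 2, "-topn": 2}
    # "-inmic yes" is an assumption; the original invocation is not shown.
    argv = ["pocketsphinx_continuous", "-inmic", "yes",
            "-hmm", hmm, "-dict", dictionary, "-kws", kws]
    for flag, value in tuning.items():
        argv += [flag, str(value)]
    return argv


cmd = build_kws_command()
# Print a shell-safe version of the command instead of running it.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the tuning values in one dictionary makes it easy to change a single flag at a time and compare the resulting xRT figures between runs.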

Last edit: ahQi 2017-10-13
