PocketSphinx is running on my Raspberry Pi 3 under Raspbian Jessie, using a .dic and .lm generated by the lmtool-new.html page.
While PocketSphinx recognizes 100% of the words/phrases from that processed sentence corpus output set, it spends far more time telling me "input overrun, read calls are too rare (non-fatal)", and far too little time "Listening...".
I seem to be able to give it ONLY a -samprate of 44100. The command line I use to start Sphinx is the first line of the output below.
If I try a different samprate, such as 16000, it gives me an error saying only "available rate of 44100 is too far from requested <whatever>".
So it seems to work fine - when it works. I just need to help it not get input overruns, and I'm not sure what I should be adjusting if I can't adjust the samprate.
I'm using a USB Sound Card. Output of PS is shown below if that is helpful in determining the issue here. You can see the properly-heard phrases I tested in this quick run (RUN PROGRAM, RUN MODULE, BRING YOURSELF BACK ONLINE) displayed in the code output below.
Thank you.
pi@raspberrypi:~ $ pocketsphinx_continuous -hmm /usr/local/share/pocketsphinx/model/en-us/en-us -lm 8465.lm -dict 8465.dic -nfft 2048 -samprate 44100 -inmic yes
INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/en-us/en-us/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn live batch
-cmninit 40,3,-1 41.00,-5.29,-0.12,5.09,2.48,-4.07,-1.37,-1.78,-5.08,-2.05,-6.45,-1.42,1.17
-compallsen no no
-debug 0
-dict 8465.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /usr/local/share/pocketsphinx/model/en-us/en-us
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm 8465.lm
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 2048
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 4.410000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec 0-12/13-25/26-38
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 20 20
-vad_startspeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(162): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /usr/local/share/pocketsphinx/model/en-us/en-us/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/en-us/en-us/mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(149): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/en-us/en-us/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us/means
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us/variances
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(304): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /usr/local/share/pocketsphinx/model/en-us/en-us/sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(838): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4153 * 20 bytes (81 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: 8465.dic
INFO: dict.c(213): Dictionary size 52, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 52 words read
INFO: dict.c(358): Reading filler dictionary: /usr/local/share/pocketsphinx/model/en-us/en-us/noisedict
INFO: dict.c(213): Dictionary size 57, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 21336 bytes (20 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 21336 bytes (20 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_model_trie.c(365): Header doesn't match
INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
INFO: ngram_model_trie.c(193): LM of order 3
INFO: ngram_model_trie.c(195): #1-grams: 42
INFO: ngram_model_trie.c(195): #2-grams: 64
INFO: ngram_model_trie.c(195): #3-grams: 51
INFO: lm_trie.c(474): Training quantizer
INFO: lm_trie.c(482): Building LM trie
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 43 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 259
INFO: ngram_search_fwdtree.c(333): Created 43 root, 131 non-root channels, 7 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(307): pocketsphinx_continuous COMPILED ON: Feb 5 2017, AT: 11:32:44
INFO: continuous.c(252): Ready....
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 41.00 -5.29 -0.12 5.09 2.48 -4.07 -1.37 -1.78 -5.08 -2.05 -6.45 -1.42 1.17 >
INFO: cmn_live.c(138): Update to < 47.92 -29.06 -1.69 -9.24 -6.44 0.48 9.26 19.00 6.96 -23.92 -38.23 7.85 40.98 >
INFO: ngram_search_fwdtree.c(1550): 316 words recognized (5/fr)
INFO: ngram_search_fwdtree.c(1552): 20259 senones evaluated (312/fr)
INFO: ngram_search_fwdtree.c(1556): 8775 channels searched (135/fr), 2623 1st, 2435 last
INFO: ngram_search_fwdtree.c(1559): 509 words for which last channels evaluated (7/fr)
INFO: ngram_search_fwdtree.c(1561): 532 candidate words for entering last phone (8/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 0.51 CPU 0.785 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 2.15 wall 3.310 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
INFO: ngram_search_fwdflat.c(948): 294 words recognized (5/fr)
INFO: ngram_search_fwdflat.c(950): 558 senones evaluated (9/fr)
INFO: ngram_search_fwdflat.c(952): 308 channels searched (4/fr)
INFO: ngram_search_fwdflat.c(954): 308 words searched (4/fr)
INFO: ngram_search_fwdflat.c(957): 76 word transitions (1/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.06 CPU 0.092 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.06 wall 0.090 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.33
INFO: ngram_search.c(1276): Eliminated 2 nodes before end node
INFO: ngram_search.c(1381): Lattice has 124 nodes, 152 links
INFO: ps_lattice.c(1380): Bestpath score: -415
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:33:63) = -27470
INFO: ps_lattice.c(1441): Joint P(O,S) = -37539 P(S|O) = -10069
INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(875): bestpath 0.00 wall 0.002 xRT
INFO: continuous.c(275): Ready....
Input overrun, read calls are too rare (non-fatal)
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 47.92 -29.06 -1.69 -9.24 -6.44 0.48 9.26 19.00 6.96 -23.92 -38.23 7.85 40.98 >
INFO: cmn_live.c(138): Update to < 49.71 -16.99 5.17 -4.44 -3.24 1.96 8.24 16.35 6.89 -21.22 -33.43 8.83 36.18 >
INFO: ngram_search_fwdtree.c(1550): 833 words recognized (7/fr)
INFO: ngram_search_fwdtree.c(1552): 44779 senones evaluated (358/fr)
INFO: ngram_search_fwdtree.c(1556): 26148 channels searched (209/fr), 4887 1st, 15002 last
INFO: ngram_search_fwdtree.c(1559): 1257 words for which last channels evaluated (10/fr)
INFO: ngram_search_fwdtree.c(1561): 851 candidate words for entering last phone (6/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 2.04 CPU 1.632 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 37.69 wall 30.153 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 21 words
INFO: ngram_search_fwdflat.c(948): 689 words recognized (6/fr)
INFO: ngram_search_fwdflat.c(950): 34551 senones evaluated (276/fr)
INFO: ngram_search_fwdflat.c(952): 32104 channels searched (256/fr)
INFO: ngram_search_fwdflat.c(954): 1962 words searched (15/fr)
INFO: ngram_search_fwdflat.c(957): 1110 word transitions (8/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.28 CPU 0.224 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.29 wall 0.229 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.83
INFO: ngram_search.c(1276): Eliminated 2 nodes before end node
INFO: ngram_search.c(1381): Lattice has 162 nodes, 288 links
INFO: ps_lattice.c(1380): Bestpath score: -2983
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:83:123) = -178226
INFO: ps_lattice.c(1441): Joint P(O,S) = -193365 P(S|O) = -15139
INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(875): bestpath 0.00 wall 0.001 xRT
RUN MODULE
INFO: continuous.c(275): Ready....
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 49.71 -16.99 5.17 -4.44 -3.24 1.96 8.24 16.35 6.89 -21.22 -33.43 8.83 36.18 >
INFO: cmn_live.c(138): Update to < 50.79 -13.03 6.84 -3.98 -1.21 3.62 7.37 15.08 6.26 -20.81 -32.64 8.39 34.95 >
INFO: ngram_search_fwdtree.c(1550): 1164 words recognized (9/fr)
INFO: ngram_search_fwdtree.c(1552): 61058 senones evaluated (452/fr)
INFO: ngram_search_fwdtree.c(1556): 37198 channels searched (275/fr), 5593 1st, 23146 last
INFO: ngram_search_fwdtree.c(1559): 1636 words for which last channels evaluated (12/fr)
INFO: ngram_search_fwdtree.c(1561): 1129 candidate words for entering last phone (8/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 2.05 CPU 1.519 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 36.00 wall 26.665 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 25 words
INFO: ngram_search_fwdflat.c(948): 839 words recognized (6/fr)
INFO: ngram_search_fwdflat.c(950): 55062 senones evaluated (408/fr)
INFO: ngram_search_fwdflat.c(952): 46883 channels searched (347/fr)
INFO: ngram_search_fwdflat.c(954): 2635 words searched (19/fr)
INFO: ngram_search_fwdflat.c(957): 1657 word transitions (12/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.59 CPU 0.437 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.59 wall 0.436 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.96
INFO: ngram_search.c(1276): Eliminated 2 nodes before end node
INFO: ngram_search.c(1381): Lattice has 205 nodes, 334 links
INFO: ps_lattice.c(1380): Bestpath score: -3905
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:96:133) = -218679
INFO: ps_lattice.c(1441): Joint P(O,S) = -238113 P(S|O) = -19434
INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(875): bestpath 0.00 wall 0.002 xRT
RUN PROGRAM
INFO: continuous.c(275): Ready....
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 50.79 -13.03 6.84 -3.98 -1.21 3.62 7.37 15.08 6.26 -20.81 -32.64 8.39 34.95 >
INFO: cmn_live.c(138): Update to < 51.45 -11.07 7.28 -3.61 -0.96 2.98 5.82 14.35 6.62 -19.00 -30.36 8.19 33.55 >
INFO: ngram_search_fwdtree.c(1550): 1084 words recognized (6/fr)
INFO: ngram_search_fwdtree.c(1552): 60419 senones evaluated (334/fr)
INFO: ngram_search_fwdtree.c(1556): 34551 channels searched (190/fr), 6883 1st, 19765 last
INFO: ngram_search_fwdtree.c(1559): 1596 words for which last channels evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1561): 779 candidate words for entering last phone (4/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 3.20 CPU 1.768 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 62.88 wall 34.738 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 30 words
INFO: ngram_search_fwdflat.c(948): 831 words recognized (5/fr)
INFO: ngram_search_fwdflat.c(950): 48813 senones evaluated (270/fr)
INFO: ngram_search_fwdflat.c(952): 42125 channels searched (232/fr)
INFO: ngram_search_fwdflat.c(954): 2768 words searched (15/fr)
INFO: ngram_search_fwdflat.c(957): 1725 word transitions (9/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.59 CPU 0.326 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.59 wall 0.327 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.142
INFO: ngram_search.c(1276): Eliminated 1 nodes before end node
INFO: ngram_search.c(1381): Lattice has 213 nodes, 328 links
INFO: ps_lattice.c(1380): Bestpath score: -4621
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:142:179) = -255884
INFO: ps_lattice.c(1441): Joint P(O,S) = -281615 P(S|O) = -25731
INFO: ngram_search.c(872): bestpath 0.01 CPU 0.006 xRT
INFO: ngram_search.c(875): bestpath 0.00 wall 0.002 xRT
BRING YOURSELF BACK ONLINE
INFO: continuous.c(275): Ready....
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 51.45 -11.07 7.28 -3.61 -0.96 2.98 5.82 14.35 6.62 -19.00 -30.36 8.19 33.55 >
INFO: cmn_live.c(138): Update to < 50.51 -12.29 6.92 -3.89 -1.34 2.71 6.23 15.25 6.76 -19.75 -31.44 8.36 34.79 >
INFO: ngram_search_fwdtree.c(1550): 366 words recognized (5/fr)
INFO: ngram_search_fwdtree.c(1552): 17627 senones evaluated (259/fr)
INFO: ngram_search_fwdtree.c(1556): 7721 channels searched (113/fr), 2752 1st, 2318 last
INFO: ngram_search_fwdtree.c(1559): 514 words for which last channels evaluated (7/fr)
INFO: ngram_search_fwdtree.c(1561): 204 candidate words for entering last phone (3/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 2.36 CPU 3.471 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 62.55 wall 91.983 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 7 words
INFO: ngram_search_fwdflat.c(948): 330 words recognized (5/fr)
INFO: ngram_search_fwdflat.c(950): 8804 senones evaluated (129/fr)
INFO: ngram_search_fwdflat.c(952): 5603 channels searched (82/fr)
INFO: ngram_search_fwdflat.c(954): 581 words searched (8/fr)
INFO: ngram_search_fwdflat.c(957): 258 word transitions (3/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.20 CPU 0.294 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.20 wall 0.300 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.49
INFO: ngram_search.c(1276): Eliminated 2 nodes before end node
INFO: ngram_search.c(1381): Lattice has 83 nodes, 152 links
INFO: ps_lattice.c(1380): Bestpath score: -1282
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:49:66) = -61562
INFO: ps_lattice.c(1441): Joint P(O,S) = -85411 P(S|O) = -23849
INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(875): bestpath 0.00 wall 0.002 xRT
INFO: continuous.c(275): Ready....
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 50.51 -12.29 6.92 -3.89 -1.34 2.71 6.23 15.25 6.76 -19.75 -31.44 8.36 34.79 >
INFO: cmn_live.c(138): Update to < 49.60 -13.63 6.24 -4.11 -1.53 2.73 6.93 16.09 6.77 -20.35 -32.17 8.36 35.44 >
INFO: ngram_search_fwdtree.c(1550): 322 words recognized (5/fr)
INFO: ngram_search_fwdtree.c(1552): 15883 senones evaluated (230/fr)
INFO: ngram_search_fwdtree.c(1556): 6192 channels searched (89/fr), 2795 1st, 716 last
INFO: ngram_search_fwdtree.c(1559): 466 words for which last channels evaluated (6/fr)
INFO: ngram_search_fwdtree.c(1561): 176 candidate words for entering last phone (2/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 3.13 CPU 4.536 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 86.75 wall 125.730 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
INFO: ngram_search_fwdflat.c(948): 317 words recognized (5/fr)
INFO: ngram_search_fwdflat.c(950): 594 senones evaluated (9/fr)
INFO: ngram_search_fwdflat.c(952): 328 channels searched (4/fr)
INFO: ngram_search_fwdflat.c(954): 328 words searched (4/fr)
INFO: ngram_search_fwdflat.c(957): 68 word transitions (0/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.05 CPU 0.072 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.11 wall 0.159 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.50
INFO: ngram_search.c(1276): Eliminated 2 nodes before end node
INFO: ngram_search.c(1381): Lattice has 86 nodes, 167 links
INFO: ps_lattice.c(1380): Bestpath score: -814
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:50:67) = -44150
INFO: ps_lattice.c(1441): Joint P(O,S) = -59453 P(S|O) = -15303
INFO: ngram_search.c(872): bestpath 0.01 CPU 0.015 xRT
INFO: ngram_search.c(875): bestpath 0.00 wall 0.002 xRT
INFO: continuous.c(275): Ready....
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
^[
^Z
[1]+ Stopped
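The timing lines in the output above can be read more closely. The following is a minimal sketch, not part of the original post, that parses the two fwdtree timing lines of the second utterance (RUN MODULE). It assumes that xRT is processing time divided by audio duration, which is how the decoder reports it; the helper name parse_timings is hypothetical.

```python
import re

# Two timing lines copied from the second utterance in the log above.
LOG = """\
INFO: ngram_search_fwdtree.c(1564): fwdtree 2.04 CPU 1.632 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 37.69 wall 30.153 xRT
"""

# Matches e.g. "fwdtree 2.04 CPU 1.632 xRT" -> (pass, seconds, kind, xRT)
TIMING = re.compile(r"(\w+) ([\d.]+) (CPU|wall) ([\d.]+) xRT")


def parse_timings(text):
    """Return {(pass_name, kind): (seconds, xrt)} for every timing line."""
    out = {}
    for m in TIMING.finditer(text):
        out[(m.group(1), m.group(3))] = (float(m.group(2)), float(m.group(4)))
    return out


timings = parse_timings(LOG)
cpu_s, cpu_xrt = timings[("fwdtree", "CPU")]
wall_s, wall_xrt = timings[("fwdtree", "wall")]

# xRT is processing time divided by audio duration, so duration = time / xRT.
audio_s = cpu_s / cpu_xrt
print(f"audio decoded: {audio_s:.2f} s")
print(f"CPU xRT: {cpu_xrt:.2f} (above 1.0 means slower than real time)")
print(f"wall/CPU ratio: {wall_s / cpu_s:.1f}")
```

Under that assumption, the utterance is about 1.25 s of audio that took 2.04 s of CPU time in the fwdtree pass alone, so the decoder is falling behind the microphone, which would be consistent with the overrun messages. The much larger wall-clock figure suggests most of the elapsed time was spent outside decoder code, but the log alone does not say where.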
You already asked this question at
https://sourceforge.net/p/cmusphinx/discussion/help/thread/0ed128a5/
You are correct; however, that was a month ago, and the single answer given referenced information that is out of date given the advances in the hardware I am using.
I believe it is reasonable, then, to restate the information (as I have above) so that the question is read not as "can Sphinx run on the Pi" but rather as: what changes must be made to allow the software to run better without triggering the Input Overrun message, or, alternatively, what parameter restricts the sample rate to 44100 so that it cannot be set to 16000, which I suspect is the root of the issue.
I believe that there are a great many smart folks on here who could answer one or both of those two questions easily, but who may not have seen that the answer given was sub-optimal. I have been tuning that command line myself and have gotten much better performance, so Sphinx is definitely running on my R Pi 3, and I believe it can run very well if I understand how to adjust the parameters creating the Overrun message.
Thanks for your understanding and patience.
UPDATE:
I have been adjusting the following parameters: -ds, -topn, -maxhmmpf and -pl_window. While some adjustments have reduced the Input Overrun errors, accuracy has suffered. PocketSphinx is technically running with a custom -dict and -lm, with high accuracy on the items in my -dict, but the time it spends actually listening seems very low. I know there are sharp folks out here, and I'm really hoping you can help. If I could justify the cost of a Jetson TX2, I would gladly make the switch, as speech recognition is a key component of my goals here, but I also figure that if I can tune PocketSphinx to run reasonably on the Raspberry Pi 3, I should be able to get it to run on almost anything that follows and has more power. An honest plea for your best ideas here, friends.
Please try to understand the information in answers before making conclusions.
You can profile your system with something like oprofile, though I am not sure whether it is supported. Most likely it spends a lot of time in the FFT, not in speech recognition. 44.1 kHz with a 2048-point FFT is a bit overkill for a Pi 3.
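To put rough numbers on the FFT remark, here is a small sketch, not from the thread, using the -wlen (0.025625) and -frate (100) values shown in the configuration dump. It assumes the FFT must be at least as long as one analysis window and uses the textbook N * log2(N) estimate for a radix-2 FFT. It is a cost model only, not a measurement of where the Pi actually spends its time.

```python
import math

WLEN_S = 0.025625   # -wlen from the configuration dump
FRATE = 100         # -frate: analysis frames per second


def min_fft_size(samprate):
    """Smallest power of two that holds one analysis window."""
    window_samples = math.ceil(WLEN_S * samprate)
    return 1 << (window_samples - 1).bit_length()


def fft_cost_per_second(nfft):
    """Rough radix-2 cost model: N * log2(N) per frame, FRATE frames/s."""
    return FRATE * nfft * math.log2(nfft)


for rate in (44100, 16000):
    nfft = min_fft_size(rate)
    print(rate, nfft, round(fft_cost_per_second(nfft)))

ratio = fft_cost_per_second(min_fft_size(44100)) / fft_cost_per_second(min_fft_size(16000))
print(f"relative FFT cost: {ratio:.1f}x")
```

At 44100 Hz one window is about 1130 samples, which is why -nfft 2048 is needed there, while at 16000 Hz the window is 410 samples and the default 512 is enough. Under this model the FFT work per second is a little under five times higher at 44.1 kHz.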
Just saw this, thanks. I agree that 44.1/2048 is MUCH more resolution than I believe I need. I am looking for a USB sound card that supports 16 kHz, unfortunately without much luck so far. I will look into oprofile and post anything useful, thank you. Here's a video of a guy running PocketSphinx on a R Pi 3 and comparing it to the slow performance of an R Pi 2. I will be contacting him after downloading all of his code from GitHub to see what I need to be doing differently. I have only a few time windows to get this stuff accomplished, unfortunately... :) Check this guy's video, it's impressive:
https://vimeo.com/169445418
Thanks again.
Last edit: David Xanatos 2017-05-03
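One thing that may be worth trying before buying different hardware: ALSA's "plug" layer can convert sample rates in software, so a card that only captures at 44100 Hz can still be opened at 16000 Hz through a plughw device. The sketch below is not from the thread. It only assembles a command line that passes such a device through the -adcdev option of pocketsphinx_continuous. The device name plughw:1,0 is an assumption (the real card and device numbers come from arecord -l), build_command is a hypothetical helper, and whether this removes the overruns on this particular setup is untested.

```python
import shlex

HMM_DIR = "/usr/local/share/pocketsphinx/model/en-us/en-us"  # path from the log


def build_command(device="plughw:1,0", samprate=16000):
    """Assemble a pocketsphinx_continuous command for a resampling ALSA device.

    device is an assumed ALSA name; check `arecord -l` for the real card/device.
    -nfft is omitted so the default applies at the lower sample rate.
    """
    return [
        "pocketsphinx_continuous",
        "-hmm", HMM_DIR,
        "-lm", "8465.lm",
        "-dict", "8465.dic",
        "-samprate", str(samprate),
        "-adcdev", device,
        "-inmic", "yes",
    ]


cmd = build_command()
print(shlex.join(cmd))
```

Printing the command rather than running it keeps the sketch safe to try on any machine. The printed line can then be pasted into the Pi's shell once the device name has been confirmed.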