I am running pocketsphinx_continuous (version 0.8). It recognises fine, but it is very slow and also throws a warning: "Could not find Capture element".
How do I get rid of this warning, and how do I increase the speed? I am running on a Raspberry Pi.
Please use the latest version, pocketsphinx-5prealpha.
It is the most accurate version available.
It also has the "could not find capture" problem fixed.
Also, if you are running on a Raspberry Pi, keep in mind that its CPU is not very powerful. You need to optimize the system to recognize in real time. I'm not sure what your requirements are; you should probably explain more about the system you are going to implement.
I don't want to use the new model, as cont_ad.h is no longer supported.
I have attached the modified continuous.c, in which I call recognise_from_mic from a wrapper file. It is supposed to recognise only a single word. It takes a lot of time every time I run it. I have created a dictionary and a language model; it is supposed to recognise only 33 words. The accuracy is fine, but the speed is very low.
This is my configuration:
Current configuration:
[NAME] [DEFLT] [VALUE]
-adcdev
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict arctic20.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0
-infile
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm arctic20.lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-time no no
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\ -lowerf 130 \ -upperf 6800 \ -nfilt 25 \ -transform dct \ -lifter 22 \ -feat 1s_c_d_dd \ -agc none \ -cmn current \ -varnorm no \ -cmninit 40,3,-1
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 40,3,-1
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/feature_transform
-ldadim 0 0
-lifter 0 22
-logspec no no
-lowerf 133.33334 1.300000e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.562500e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(157): Reading linear feature transformation from /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/feature_transform
INFO: mdef.c(517): Reading model definition: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/mdef
INFO: bin_mdef.c(179): Allocating 142124 * 8 bytes (1110 KiB) for CD tree
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/means
INFO: ms_gauden.c(292): 5138 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x36
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/variances
INFO: ms_gauden.c(292): 5138 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x36
INFO: ms_gauden.c(354): 813 variance values floored
INFO: acmod.c(123): Attempting to use PTHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/means
INFO: ms_gauden.c(292): 5138 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x36
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/variances
INFO: ms_gauden.c(292): 5138 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x36
INFO: ms_gauden.c(354): 813 variance values floored
INFO: ptm_mgau.c(792): Number of codebooks exceeds 256: 5138
INFO: acmod.c(125): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/means
INFO: ms_gauden.c(292): 5138 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x36
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/variances
INFO: ms_gauden.c(292): 5138 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x36
INFO: ms_gauden.c(354): 813 variance values floored
INFO: ms_senone.c(149): Reading senone mixture weights: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/mixture_weights
INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(207): Not transposing mixture weights in memory
INFO: ms_senone.c(266): Read mixture weights for 5138 senones: 1 features x 32 codewords
INFO: ms_senone.c(320): Mapping senones to individual codebooks
INFO: ms_mgau.c(141): The value of topn: 4
INFO: dict.c(317): Allocating 4128 * 20 bytes (80 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: arctic20.dic
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(335): 23 words read
INFO: dict.c(341): Reading filler dictionary: /usr/MY_DATA/wd/cmusphinx-5prealpha-en-us-2.0/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 9 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 25576 bytes (24 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 25576 bytes (24 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(477): ngrams 1=23, 2=42, 3=21
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(516): 23 = #unigrams created
INFO: ngram_model_arpa.c(195): Reading bigrams
INFO: ngram_model_arpa.c(533): 42 = #bigrams created
INFO: ngram_model_arpa.c(534): 3 = #prob2 entries
INFO: ngram_model_arpa.c(542): 3 = #bo_wt2 entries
INFO: ngram_model_arpa.c(292): Reading trigrams
INFO: ngram_model_arpa.c(555): 21 = #trigrams created
INFO: ngram_model_arpa.c(556): 2 = #prob3 entries
INFO: ngram_search_fwdtree.c(99): 20 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 10 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 10 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 201
INFO: ngram_search_fwdtree.c(338): after: 20 root, 73 non-root channels, 9 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(44): ./remcon COMPILED ON: Mar 13 2015, AT: 23:02:38
Pocket sphinx initialised
button pressed?0
Button has been pressed......:)
In Function VoiceReco() At State S_Idle For Event E_Onbuttonpress 1
In microphone
Warning: Could not find Capture element
READY....
Listening...
Recording is stopped, start recording with ad_start_rec
Stopped listening, please wait...
INFO: cmn_prior.c(121): cmn_prior_update: from < 40.00 3.00 -1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_prior.c(139): cmn_prior_update: to < 45.00 0.13 2.01 10.64 9.38 1.77 -0.27 2.89 3.38 -1.79 2.71 6.31 4.37 >
INFO: ngram_search_fwdtree.c(1549): 489 words recognized (8/fr)
INFO: ngram_search_fwdtree.c(1551): 15886 senones evaluated (269/fr)
INFO: ngram_search_fwdtree.c(1553): 8179 channels searched (138/fr), 1100 1st, 4921 last
INFO: ngram_search_fwdtree.c(1557): 702 words for which last channels evaluated (11/fr)
INFO: ngram_search_fwdtree.c(1560): 406 candidate words for entering last phone (6/fr)
INFO: ngram_search_fwdtree.c(1562): fwdtree 2.06 CPU 3.492 xRT
INFO: ngram_search_fwdtree.c(1565): fwdtree 2.24 wall 3.799 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 8 words
INFO: ngram_search_fwdflat.c(937): 395 words recognized (7/fr)
INFO: ngram_search_fwdflat.c(939): 14189 senones evaluated (240/fr)
INFO: ngram_search_fwdflat.c(941): 9351 channels searched (158/fr)
INFO: ngram_search_fwdflat.c(943): 877 words searched (14/fr)
INFO: ngram_search_fwdflat.c(945): 335 word transitions (5/fr)
INFO: ngram_search_fwdflat.c(948): fwdflat 1.61 CPU 2.729 xRT
INFO: ngram_search_fwdflat.c(951): fwdflat 1.70 wall 2.875 xRT
INFO: ngram_search.c(1214): </s> not found in last frame, using MUTE.57 instead
INFO: ngram_search.c(1266): lattice start node <s>.0 end node MUTE.18
INFO: ngram_search.c(1294): Eliminated 49 nodes before end node
INFO: ngram_search.c(1399): Lattice has 114 nodes, 150 links
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(MUTE:18:57) = -124187
INFO: ps_lattice.c(1403): Joint P(O,S) = -124870 P(S|O) = -683
INFO: ngram_search.c(1043): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(1046): bestpath 0.01 wall 0.022 xRT
conf value = 1.000000
INFO: ngram_search.c(888): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(891): bestpath 0.00 wall 0.000 xRT
recognised output : 000000000: MUTE
hypword for calling scripts: MUTE
mute
system call to IR ./scripts/MUTEIn Function RecoTimeOut() At State S_Listening For Event E_Listencomplete 1
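For anyone reading the log above: the xRT figures are processing time divided by audio duration, so anything above 1.0 is slower than real time. A rough check of the numbers, as a sketch only. The frame count is not printed directly, so it is inferred here from the per-frame averages (an assumption), and the function name is made up for illustration.

```python
FRATE = 100  # frames per second, from -frate in the configuration dump


def real_time_factor(cpu_seconds, n_frames, frate=FRATE):
    """Return processing time divided by audio duration (xRT)."""
    audio_seconds = n_frames / frate
    return cpu_seconds / audio_seconds


# Assumption: 15886 senones at about 269 per frame means roughly 59 frames.
n_frames = round(15886 / 269)

# CPU times copied from the fwdtree and fwdflat lines of the log.
fwdtree_xrt = real_time_factor(2.06, n_frames)
fwdflat_xrt = real_time_factor(1.61, n_frames)

print(n_frames, round(fwdtree_xrt, 2), round(fwdflat_xrt, 2))
```

Under that assumption the utterance is about 0.59 s of audio, and the two passes together take roughly six times that long, which matches the "very slow" impression.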
Ok, then you are on your own
Attaching the modified continuous.c file
The new model requires a lot of VM, hence I am OK with the 0.8 version.
What ranges of beam/lbeam, maxhmmpf, etc. can I use to optimize the speed, considering the configuration that I pasted above in my previous question?
To start with, I'll try the options mentioned in http://cmusphinx.sourceforge.net/wiki/pocketsphinxhandhelds . But where do I change these default config values?
Your first problem is that you are using the very slow continuous model cmusphinx-5prealpha-en-us-2.0, which is very heavy for handheld devices.
You need to use the PTM model cmusphinx-5prealpha-en-us-ptm-2.0, which should be about 5 times faster.
Configuration values are changed on the command line; you do not need to change the defaults.
Last edit: Nickolay V. Shmyrev 2015-03-18
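A minimal sketch of what "changed on the command line" could look like, assuming the recognizer is launched from a small Python wrapper. The option names are taken from the configuration dump earlier in the thread; the values and the relative model path are illustrative assumptions, not tuned recommendations, and build_command is a hypothetical helper.

```python
import shlex


def build_command(model_dir, lm, dic, overrides):
    """Build an argument list; only the overridden options are passed."""
    cmd = ["pocketsphinx_continuous", "-hmm", model_dir, "-lm", lm, "-dict", dic]
    for flag, value in overrides.items():
        cmd += ["-" + flag, str(value)]
    return cmd


# Assumed starting values for a slow device; adjust after measuring xRT.
overrides = {"ds": 2, "topn": 2, "maxwpf": 5, "maxhmmpf": 3000, "pl_window": 10}

cmd = build_command(
    "cmusphinx-5prealpha-en-us-ptm-2.0",  # assumed location of the PTM model
    "arctic20.lm",
    "arctic20.dic",
    overrides,
)
print(shlex.join(cmd))
```

Anything not listed in overrides keeps its built-in default, which is why the defaults themselves never need to be edited.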
OK, I'll download the PTM model meant for handhelds. Thanks for your valuable input.
I am doing the adaptation for the PTM model. In the step "Accumulating observation counts", where we run bw, I specified "-lda feature_transform" as one of the options. Here I get an error:
SYSTEM ERROR: "lda.c", line 76: Failed to open transform file 'feature_transform' for reading: No such file or directory.
Is this option required for a PTM model? How do I generate the feature_transform file?
The -lda option is not required for PTM.
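One way to avoid this error in a script, sketched under the assumption that bw is driven from Python: pass -lda only when the model directory actually ships a feature_transform file. The helper name bw_model_args is hypothetical, and the remaining bw options (dictionary, control file, transcription and so on) are left out here.

```python
import os


def bw_model_args(model_dir):
    """Return the model-related bw arguments for the given model directory."""
    args = ["-hmmdir", model_dir]
    transform = os.path.join(model_dir, "feature_transform")
    # The PTM model in this thread has no feature_transform, so -lda is skipped.
    if os.path.isfile(transform):
        args += ["-lda", transform]
    return args


print(bw_model_args("cmusphinx-5prealpha-en-us-ptm-2.0"))
```

With a model directory that does contain feature_transform, the same call would add the -lda option automatically.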
Would you please give us instructions for using the new version of pocketsphinx?
I'm using it in a voice-controlled home automation system, and I have a problem with the changes that I made to the continuous.c file.
Thank you.