Hello,
I have compiled the latest pocketsphinx trunk with MS Visual Studio.
When I try it with a CD/LDA model, I get only silence as the result, whereas
with an old pocketsphinx.dll compiled from trunk in September 2010 I get the
correct recognition result with the same model and audio data. I have tried
recognition from the application, pocketsphinx_batch and
pocketsphinx_continuous and got the same behaviour. What could be the problem?
Hm, interesting. I wonder if your language model causes this. Does it have
as a transition word? Something like . Models trained with cmuclmtk
are usually like that.
Can you find out the exact revision that broke decoding?
Revision: 10350
Last Changed Author: dhdfu
Last Changed date: 06.09.10 07:47
Last Changed revision: 10349
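One way to narrow down the breaking revision is a binary search over the SVN revisions between the known-good build and the current trunk. A minimal Python sketch, assuming a hypothetical `is_good(rev)` callback that checks out and builds that revision and runs the test utterance. Revision 10350 comes from the `svn info` output above; the upper bound and the fake predicate in the usage lines are made-up placeholders, not real data:

```python
def first_bad_revision(good, bad, is_good):
    """Return the earliest revision in (good, bad] for which is_good() is False.

    Assumes `good` passes, `bad` fails, and the breakage stays in place
    once it has been introduced (otherwise bisection is not reliable).
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid   # breakage happened after mid
        else:
            bad = mid    # breakage happened at or before mid
    return bad


# Usage with a fake predicate standing in for "build and run test_ps.bat".
# 10700 and 10512 are hypothetical numbers chosen only for illustration.
culprit = first_bad_revision(10350, 10700, lambda rev: rev < 10512)
print("first bad revision:", culprit)
```

In practice `is_good` would wrap something like `svn update -r <rev>`, a rebuild, and a check that the decoder output is not NOMATCH; each step takes a full rebuild, so the logarithmic number of steps matters.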
The model is trained with SRILM and has
and as transition words. The LM is then converted to .dmp format. The same
model works well with the previous pocketsphinx version.
I have double-checked everything again and get the same strange results: with
the old sphinxbase/pocketsphinx version compiled using VS on Win7 I get good
results, and with the newest version from trunk I get NOMATCH.
Here are the log files from the old version:
===========================================
INFO: cmd_ln.c(512): Parsing command line:
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-ascale 20.0 2.000000e+001
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-048
-bestpath yes yes
-bestpathlw 9.5 9.500000e+000
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-008
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-064
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+000
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-029
-fwdtree yes yes
-hmm
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+000
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+002
-lpbeam 1e-40 1.000000e-040
-lponlybeam 7e-29 7.000000e-029
-lw 6.5 6.500000e+000
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-007
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+000
-pbeam 1e-48 1.000000e-048
-pip 1.0 1.000000e+000
-pl_beam 1e-10 1.000000e-010
-pl_pbeam 1e-5 1.000000e-005
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+004
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-003
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-004
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+003
-usewdphones no no
-uw 1.0 1.000000e+000
-var
-varfloor 0.0001 1.000000e-004
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-029
-wip 0.65 6.500000e-001
-wlen 0.025625 2.562500e-002
INFO: cmd_ln.c(512): Parsing command line:
\
-alpha 0.97 \
-doublebw no \
-nfilt 31 \
-ncep 13 \
-lowerf 200 \
-upperf 3500 \
-nfft 256 \
-wlen 0.0256 \
-samprate 8000 \
-transform legacy \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-varnorm no
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/feature_transform
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 2.000000e+002
-ncep 13 13
-nfft 512 256
-nfilt 40 31
-remove_dc no no
-round_filters yes yes
-samprate 16000 8.000000e+003
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 3.500000e+003
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-002
INFO: acmod.c(238): Parsed model-specific feature parameters from
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/feat.params
INFO: feat.c(848): Initializing feature stream to type: '1s_c_d_dd',
ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: acmod.c(153): Reading linear feature transformation from
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/feature_transform
INFO: mdef.c(520): Reading model definition:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/mdef
INFO: bin_mdef.c(173): Allocating 45839 * 8 bytes (358 KiB) for CD tree
INFO: tmat.c(205): Reading HMM transition probability matrices:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/transition_matrices
INFO: acmod.c(117): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/means
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size: INFO:
ms_gauden.c(294): 32x29INFO: ms_gauden.c(295):
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/variances
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size: INFO:
ms_gauden.c(294): 32x29INFO: ms_gauden.c(295):
INFO: ms_gauden.c(356): 3526105 variance values floored
INFO: acmod.c(119): Attempting to use PTHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/means
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size: INFO:
ms_gauden.c(294): 32x29INFO: ms_gauden.c(295):
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/variances
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size: INFO:
ms_gauden.c(294): 32x29INFO: ms_gauden.c(295):
INFO: ms_gauden.c(356): 3526105 variance values floored
ERROR: "ptm_mgau.c", line 800: Number of codebooks exceeds 256: 16117
INFO: acmod.c(121): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/means
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size: INFO:
ms_gauden.c(294): 32x29INFO: ms_gauden.c(295):
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/variances
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size: INFO:
ms_gauden.c(294): 32x29INFO: ms_gauden.c(295):
INFO: ms_gauden.c(356): 3526105 variance values floored
INFO: ms_senone.c(160): Reading senone mixture weights:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/mixture_weights
INFO: ms_senone.c(211): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(218): Not transposing mixture weights in memory
WARNING: "ms_senone.c", line 265: Weight normalization failed for 4 senones
INFO: ms_senone.c(277): Read mixture weights for 16117 senones: 1 features x
32 codewords
INFO: ms_senone.c(331): Mapping senones to individual codebooks
INFO: ms_mgau.c(122): The value of topn: 2
INFO: phone_loop_search.c(105): State beam -230231 Phone exit beam -115115
Insertion penalty 0
INFO: dict.c(306): Allocating 11219 * 20 bytes (219 KiB) for word entries
INFO: dict.c(321): Reading main dictionary:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/berlinHaltestellen.dic
INFO: dict.c(212): Allocated 71 KiB for strings, 125 KiB for phones
INFO: dict.c(324): 7119 words read
INFO: dict.c(330): Reading filler dictionary:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/de8KHz.filler
INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(333): 4 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 39^3 * 2 bytes (115 KiB) for word-initial
triphones
INFO: dict2pid.c(131): Allocated 18408 bytes (17 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 18408 bytes (17 KiB) for single-phone word
triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=7121, 2=16257, 3=2015
INFO: ngram_model_dmp.c(242): 7121 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(290): 16257 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(315): 2015 = LM.trigrams read
INFO: ngram_model_dmp.c(339): 424 = LM.prob2 entries read
INFO: ngram_model_dmp.c(358): 473 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(378): 218 = LM.prob3 entries read
INFO: ngram_model_dmp.c(406): 32 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(462): 7121 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 419 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 13 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 13
single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 35436
INFO: ngram_search_fwdtree.c(335): after: 419 root, 35308 non-root channels,
12 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: pocketsphinx.c(668): Writing raw audio log file:
D:\eclipse\workspace\ASRWebService\ps_test/000000000.raw
INFO: cmn.c(175): CMN: 7.50 -0.16 -0.08 -0.17 -0.21 -0.18 -0.10 -0.11 -0.13
-0.07 -0.07 -0.06 -0.07
INFO: ngram_search_fwdtree.c(1534): 1044 words recognized (3/fr)
INFO: ngram_search_fwdtree.c(1536): 511531 senones evaluated (1496/fr)
INFO: ngram_search_fwdtree.c(1538): 247153 channels searched (722/fr), 100243
1st, 6411 last
INFO: ngram_search_fwdtree.c(1542): 2809 words for which last channels
evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1545): 1399 candidate words for entering last
phone (4/fr)
INFO: ngram_search_fwdflat.c(295): Utterance vocabulary contains 14 words
INFO: ngram_search_fwdflat.c(925): 607 words recognized (2/fr)
INFO: ngram_search_fwdflat.c(927): 28302 senones evaluated (83/fr)
INFO: ngram_search_fwdflat.c(929): 8189 channels searched (23/fr)
INFO: ngram_search_fwdflat.c(931): 1526 words searched (4/fr)
INFO: ngram_search_fwdflat.c(933): 676 word transitions (1/fr)
INFO: ngram_search.c(1133): lattice start node .0 end node .268
INFO: ps_lattice.c(1351): Normalizer P(O) = alpha(:268:340) = -504410
INFO: ps_lattice.c(1389): Joint P(O,S) = -504410 P(S|O) = 0
result: wiebestrasse ecke huttenstrasse
conf: 0.4644093327096144
RT: 0.9444281953787658
INFO: pocketsphinx.c(668): Writing raw audio log file:
D:\eclipse\workspace\ASRWebService\ps_test/000000001.raw
INFO: cmn.c(175): CMN: 7.50 -0.16 -0.08 -0.17 -0.21 -0.18 -0.10 -0.11 -0.13
-0.07 -0.07 -0.06 -0.07
INFO: ngram_search_fwdtree.c(1534): 1044 words recognized (3/fr)
INFO: ngram_search_fwdtree.c(1536): 512279 senones evaluated (1498/fr)
INFO: ngram_search_fwdtree.c(1538): 247138 channels searched (722/fr), 100243
1st, 6411 last
INFO: ngram_search_fwdtree.c(1542): 2809 words for which last channels
evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1545): 1399 candidate words for entering last
phone (4/fr)
INFO: ngram_search_fwdflat.c(295): Utterance vocabulary contains 14 words
INFO: ngram_search_fwdflat.c(925): 607 words recognized (2/fr)
INFO: ngram_search_fwdflat.c(927): 28302 senones evaluated (83/fr)
INFO: ngram_search_fwdflat.c(929): 8189 channels searched (23/fr)
INFO: ngram_search_fwdflat.c(931): 1526 words searched (4/fr)
INFO: ngram_search_fwdflat.c(933): 676 word transitions (1/fr)
INFO: ngram_search.c(1133): lattice start node .0 end node .268
INFO: ps_lattice.c(1351): Normalizer P(O) = alpha(:268:340) = -504410
INFO: ps_lattice.c(1389): Joint P(O,S) = -504410 P(S|O) = 0
result: wiebestrasse ecke huttenstrasse
conf: 0.4644093327096144
RT: 0.9420883299210295
===========================================
and from the new one:
===========================================
INFO: cmd_ln.c(512): Parsing command line:
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-ascale 20.0 2.000000e+001
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-048
-bestpath yes yes
-bestpathlw 9.5 9.500000e+000
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-008
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-064
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+000
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-029
-fwdtree yes yes
-hmm
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+000
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+002
-lpbeam 1e-40 1.000000e-040
-lponlybeam 7e-29 7.000000e-029
-lw 6.5 6.500000e+000
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-007
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+000
-pbeam 1e-48 1.000000e-048
-pip 1.0 1.000000e+000
-pl_beam 1e-10 1.000000e-010
-pl_pbeam 1e-5 1.000000e-005
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+004
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-003
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-004
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+003
-usewdphones no no
-uw 1.0 1.000000e+000
-var
-varfloor 0.0001 1.000000e-004
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-029
-wip 0.65 6.500000e-001
-wlen 0.025625 2.562500e-002
INFO: cmd_ln.c(512): Parsing command line:
\
-alpha 0.97 \
-doublebw no \
-nfilt 31 \
-ncep 13 \
-lowerf 200 \
-upperf 3500 \
-nfft 256 \
-wlen 0.0256 \
-samprate 8000 \
-transform legacy \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-varnorm no
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/feature_transform
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 2.000000e+002
-ncep 13 13
-nfft 512 256
-nfilt 40 31
-remove_dc no no
-round_filters yes yes
-samprate 16000 8.000000e+003
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 3.500000e+003
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-002
INFO: acmod.c(238): Parsed model-specific feature parameters from
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/feat.params
INFO: feat.c(860): Initializing feature stream to type: '1s_c_d_dd',
ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: acmod.c(153): Reading linear feature transformation from
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/feature_transform
INFO: mdef.c(520): Reading model definition:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/mdef
INFO: bin_mdef.c(173): Allocating 45839 * 8 bytes (358 KiB) for CD tree
INFO: tmat.c(205): Reading HMM transition probability matrices:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/transition_matrices
INFO: acmod.c(117): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/means
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x29
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/variances
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x29
INFO: ms_gauden.c(354): 3526105 variance values floored
INFO: acmod.c(119): Attempting to use PTHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/means
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x29
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/variances
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x29
INFO: ms_gauden.c(354): 3526105 variance values floored
INFO: ptm_mgau.c(800): Number of codebooks exceeds 256: 16117
INFO: acmod.c(121): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/means
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x29
INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/variances
INFO: ms_gauden.c(292): 16117 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 32x29
INFO: ms_gauden.c(354): 3526105 variance values floored
INFO: ms_senone.c(160): Reading senone mixture weights:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/hmm/mixture_weights
INFO: ms_senone.c(211): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(218): Not transposing mixture weights in memory
WARNING: "ms_senone.c", line 265: Weight normalization failed for 4 senones
INFO: ms_senone.c(277): Read mixture weights for 16117 senones: 1 features x
32 codewords
INFO: ms_senone.c(331): Mapping senones to individual codebooks
INFO: ms_mgau.c(122): The value of topn: 2
INFO: phone_loop_search.c(105): State beam -230231 Phone exit beam -115115
Insertion penalty 0
INFO: dict.c(306): Allocating 11219 * 20 bytes (219 KiB) for word entries
INFO: dict.c(321): Reading main dictionary:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/berlinHaltestellen.dic
INFO: dict.c(212): Allocated 71 KiB for strings, 125 KiB for phones
INFO: dict.c(324): 7119 words read
INFO: dict.c(330): Reading filler dictionary:
D:\eclipse\workspace\ASRWebService\WebContent\WEB-INF\classes/de8KHz.filler
INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(333): 4 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 39^3 * 2 bytes (115 KiB) for word-initial
triphones
INFO: dict2pid.c(131): Allocated 18408 bytes (17 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 18408 bytes (17 KiB) for single-phone word
triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=7121, 2=16257, 3=2015
INFO: ngram_model_dmp.c(242): 7121 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(291): 16257 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(317): 2015 = LM.trigrams read
INFO: ngram_model_dmp.c(342): 424 = LM.prob2 entries read
INFO: ngram_model_dmp.c(362): 473 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(382): 218 = LM.prob3 entries read
INFO: ngram_model_dmp.c(410): 32 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(466): 7121 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 419 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 13 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 13
single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 35436
INFO: ngram_search_fwdtree.c(338): after: 419 root, 35308 non-root channels,
12 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: pocketsphinx.c(672): Writing raw audio log file:
D:\eclipse\workspace\ASRWebService\ps_test/000000000.raw
INFO: cmn.c(175): CMN: 7.50 -0.16 -0.08 -0.17 -0.21 -0.18 -0.10 -0.11 -0.13
-0.07 -0.07 -0.06 -0.07
INFO: ngram_search.c(463): Resized backpointer table to 10000 entries
INFO: ngram_search.c(471): Resized score stack to 200000 entries
INFO: ngram_search_fwdtree.c(1549): 6868 words recognized (20/fr)
INFO: ngram_search_fwdtree.c(1551): 2692788 senones evaluated (7874/fr)
INFO: ngram_search_fwdtree.c(1553): 2657689 channels searched (7771/fr),
141622 1st, 133752 last
INFO: ngram_search_fwdtree.c(1557): 7549 words for which last channels
evaluated (22/fr)
INFO: ngram_search_fwdtree.c(1560): 108988 candidate words for entering last
phone (318/fr)
INFO: ngram_search_fwdtree.c(1562): fwdtree 8.00 CPU 2.340 xRT
INFO: ngram_search_fwdtree.c(1565): fwdtree 8.14 wall 2.380 xRT
INFO: ngram_search_fwdflat.c(305): Utterance vocabulary contains 24 words
INFO: ngram_search_fwdflat.c(940): 2009 words recognized (6/fr)
INFO: ngram_search_fwdflat.c(942): 346630 senones evaluated (1014/fr)
INFO: ngram_search_fwdflat.c(944): 155636 channels searched (455/fr)
INFO: ngram_search_fwdflat.c(946): 8967 words searched (26/fr)
INFO: ngram_search_fwdflat.c(948): 837 word transitions (2/fr)
INFO: ngram_search_fwdflat.c(951): fwdflat 0.56 CPU 0.164 xRT
INFO: ngram_search_fwdflat.c(954): fwdflat 0.58 wall 0.168 xRT
INFO: ngram_search.c(1198): not found in last frame, using .340 instead
INFO: ngram_search.c(1250): lattice start node .0 end node .0
INFO: ngram_search.c(1278): Eliminated 1041 nodes before end node
INFO: ngram_search.c(1383): Lattice has 1042 nodes, 0 links
INFO: ps_lattice.c(1352): Normalizer P(O) = alpha(:0:340) = -536874342
NOMATCH
RT: 0.0
INFO: pocketsphinx.c(672): Writing raw audio log file:
D:\eclipse\workspace\ASRWebService\ps_test/000000001.raw
INFO: cmn.c(175): CMN: 7.50 -0.16 -0.08 -0.17 -0.21 -0.18 -0.10 -0.11 -0.13
-0.07 -0.07 -0.06 -0.07
INFO: ngram_search_fwdtree.c(1549): 6868 words recognized (20/fr)
INFO: ngram_search_fwdtree.c(1551): 2693563 senones evaluated (7876/fr)
INFO: ngram_search_fwdtree.c(1553): 2657689 channels searched (7771/fr),
141622 1st, 133752 last
INFO: ngram_search_fwdtree.c(1557): 7549 words for which last channels
evaluated (22/fr)
INFO: ngram_search_fwdtree.c(1560): 108988 candidate words for entering last
phone (318/fr)
INFO: ngram_search_fwdtree.c(1562): fwdtree 7.96 CPU 2.326 xRT
INFO: ngram_search_fwdtree.c(1565): fwdtree 8.04 wall 2.350 xRT
INFO: ngram_search_fwdflat.c(305): Utterance vocabulary contains 24 words
INFO: ngram_search_fwdflat.c(940): 2009 words recognized (6/fr)
INFO: ngram_search_fwdflat.c(942): 346630 senones evaluated (1014/fr)
INFO: ngram_search_fwdflat.c(944): 155636 channels searched (455/fr)
INFO: ngram_search_fwdflat.c(946): 8967 words searched (26/fr)
INFO: ngram_search_fwdflat.c(948): 837 word transitions (2/fr)
INFO: ngram_search_fwdflat.c(951): fwdflat 0.58 CPU 0.169 xRT
INFO: ngram_search_fwdflat.c(954): fwdflat 0.58 wall 0.169 xRT
INFO: ngram_search.c(1198): not found in last frame, using .340 instead
INFO: ngram_search.c(1250): lattice start node .0 end node .0
INFO: ngram_search.c(1278): Eliminated 1041 nodes before end node
INFO: ngram_search.c(1383): Lattice has 1042 nodes, 0 links
INFO: ps_lattice.c(1352): Normalizer P(O) = alpha(:0:340) = -536874342
NOMATCH
INFO: ngram_search_fwdtree.c(430): TOTAL fwdtree 15.96 CPU 2.340 xRT
INFO: ngram_search_fwdtree.c(433): TOTAL fwdtree 16.18 wall 2.372 xRT
RT: 0.0
INFO: ngram_search_fwdflat.c(174): TOTAL fwdflat 1.14 CPU 0.167 xRT
INFO: ngram_search_fwdflat.c(177): TOTAL fwdflat 1.15 wall 0.169 xRT
INFO: ngram_search.c(314): TOTAL bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(317): TOTAL bestpath 0.00 wall 0.000 xRT
===========================================
Hi
Maybe you can share the data to run pocketsphinx so I can look.
I have uploaded the data at the following link:
http://www.file-upload.net/download-3178998/test_ps.zip.html
You can run the test using test_ps.bat
You can see the difference by replacing sphinxbase.dll and pocketsphinx.dll
with the ones from the old_lib or new_lib subfolders.
P.S. In this test example I used the voxforge english 8KHz acoustic model...
That's strange, because I've just compiled the trunk on WinXP with VS2008
Express and everything runs smoothly. It does run out of memory with the new
DLLs you submitted.
Maybe it's some kind of memory corruption issue caused by different
optimization parameters. Did you change something like that?
Hm, interesting. I compiled on Win7 using VS2008 Professional. I don't
have any problems like out-of-memory, but the old DLLs give me a normal result
and the new ones just silence. I will now try to compile on WinXP using VS2008
Express. Can you share your DLLs? Is it possible that you are using something
newer on trunk that is not visible to me?
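For what it's worth, the search statistics in the two logs above already differ a lot before the lattice step, which may help when comparing builds. Below is a small Python sketch that pulls the per-frame numbers out of such log lines. The sample lines are copied from the logs in this thread; the function name and the regular expression are made up for illustration and only cover statistics that fit on a single line:

```python
import re

# Matches e.g. "INFO: ngram_search_fwdtree.c(1536): 511531 senones evaluated (1496/fr)"
STAT_RE = re.compile(r"INFO: (\w+)\.c\(\d+\): (\d+) ([a-z ]+?) \((\d+)/fr\)")


def per_frame_stats(log_text):
    """Map (module, label) to the per-frame value of its first occurrence."""
    stats = {}
    for m in STAT_RE.finditer(log_text):
        module, _total, label, per_frame = m.groups()
        stats.setdefault((module, label), int(per_frame))
    return stats


OLD_LOG = """\
INFO: ngram_search_fwdtree.c(1534): 1044 words recognized (3/fr)
INFO: ngram_search_fwdtree.c(1536): 511531 senones evaluated (1496/fr)
INFO: ngram_search_fwdflat.c(927): 28302 senones evaluated (83/fr)
"""

NEW_LOG = """\
INFO: ngram_search_fwdtree.c(1549): 6868 words recognized (20/fr)
INFO: ngram_search_fwdtree.c(1551): 2692788 senones evaluated (7874/fr)
INFO: ngram_search_fwdflat.c(942): 346630 senones evaluated (1014/fr)
"""

old = per_frame_stats(OLD_LOG)
new = per_frame_stats(NEW_LOG)
# Print every statistic present in both logs side by side.
for key in sorted(old.keys() & new.keys()):
    print(key, old[key], "->", new[key])
```

The new build evaluates several times more senones and words per frame on the same audio, which suggests the search itself behaves differently (for example, pruning), not only the final lattice construction. That is an observation from the numbers, not a diagnosis.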