After doing all of this, I copied the new acoustic model, the new language model, and the files needed for testing to a new folder named Testing Adaptation. I ran the following command:
Am I doing something wrong?
Is my new language model wrong?
I tried to adapt the default CMU Sphinx acoustic model to Indonesian. Can I do that, or do I have to build a new one?
Thank you so much.
Your corpus.lm and your dictionary don't match in case: the dictionary is uppercase and the LM is lowercase. This is what the error during decoding is telling you:
ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary
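A case mismatch like this can be spotted before decoding by listing the LM words that have no dictionary entry. The following is only a rough sketch: it assumes corpus.lm is in the text ARPA format (unigrams listed under a `\1-grams:` header) and that the dictionary has one word per line in the first column. The function names are made up for illustration and are not part of CMUSphinx.

```shell
# Hypothetical helpers, not CMUSphinx tools.

# Print the words of the \1-grams: section of an ARPA-format LM.
lm_unigrams() {
    awk '/^\\1-grams:/ { on = 1; next }
         /^\\/         { on = 0 }
         on && NF >= 2 { print $2 }' "$1"
}

# Print dictionary head words, dropping alternate-pronunciation
# markers such as WORD(2).
dict_words() {
    awk 'NF { sub(/\([0-9]+\)$/, "", $1); print $1 }' "$1"
}

# Print LM words (except <s>, </s> and similar tags) that are
# absent from the dictionary. Comparison is case-sensitive.
lm_words_missing_from_dict() {
    lm_list=$(mktemp)
    dict_list=$(mktemp)
    lm_unigrams "$1" | grep -v '^<' | sort -u > "$lm_list"
    dict_words "$2" | sort -u > "$dict_list"
    comm -23 "$lm_list" "$dict_list"
    rm -f "$lm_list" "$dict_list"
}
```

Running `lm_words_missing_from_dict corpus.lm arctic20.dic` should print nothing when the two files agree. If it prints every word in lowercase, the LM and dictionary differ only in case, and one of them needs to be regenerated in the other's case.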
Hello,
I'm learning to adapt the default acoustic model, following http://cmusphinx.sourceforge.net/wiki/tutorialadapt. When I run bw I get a segmentation fault. This is my command:
./bw \
-hmmdir hub4wsj_sc_8k \
-moddeffn hub4wsj_sc_8k/mdef.txt \
-ts2cbfn .semi. -feat 1s_c_d_dd \
-svspec 0-12/13-25/26-38 \
-cmn current -agc none \
-dictfn arctic.dic \
-ctlfn adaptation-test.fileids \
-lsnfn adaptation-test.transcription \
-accumdir .
INFO: main.c(229): Compiled on May 24 2014 at 23:28:52
INFO: cmd_ln.c(691): Parsing command line:
./bw \ -hmmdir hub4wsj_sc_8k \ -moddeffn hub4wsj_sc_8k/mdef.txt \ -ts2cbfn .semi. \ -feat 1s_c_d_dd \ -svspec 0-12/13-25/26-38 \ -cmn current \ -agc none \ -dictfn arctic.dic \ -ctlfn adaptation-test.fileids \ -lsnfn adaptation-test.transcription \ -accumdir .
Current configuration:
[NAME] [DEFLT] [VALUE]
-2passvar no no
-abeam 1e-100 1.000000e-100
-accumdir .
-agc none none
-agcthresh 2.0 2.000000e+00
-bbeam 1e-100 1.000000e-100
-cb2mllrfn .1cls. .1cls.
-cepdir
-cepext mfc mfc
-ceplen 13 13
-ckptintv 0
-cmn current current
-cmninit 8.0 8.0
-ctlfn adaptation-test.fileids
-diagfull no no
-dictfn arctic.dic
-example no no
-fdictfn
-feat 1s_c_d_dd 1s_c_d_dd
-fullsuffixmatch no no
-fullvar no no
-help no no
-hmmdir hub4wsj_sc_8k
-latdir
-latext
-lda
-ldaaccum no no
-ldadim 0 0
-lsnfn adaptation-test.transcription
-lw 11.5 1.150000e+01
-maxuttlen 0 0
-meanfn
-meanreest yes yes
-mixwfn
-mixwreest yes yes
-mllrmat
-mmie no no
-mmie_type rand rand
-moddeffn hub4wsj_sc_8k/mdef.txt
-mwfloor 0.00001 1.000000e-05
-npart 0
-nskip 0
-outphsegdir
-outputfullpath no no
-part 0
-pdumpdir
-phsegdir
-phsegext phseg phseg
-runlen -1 -1
-sentdir
-sentext sent sent
-spthresh 0.0 0.000000e+00
-svspec 0-12/13-25/26-38
-timing yes yes
-tmatfn
-tmatreest yes yes
-topn 4 4
-tpfloor 0.0001 1.000000e-04
-ts2cbfn .semi.
-varfloor 0.00001 1.000000e-05
-varfn
-varnorm no no
-varreest yes yes
-viterbi no no
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: main.c(254): Using subvector specification 0-12/13-25/26-38
INFO: main.c(318): Reading hub4wsj_sc_8k/mdef.txt
INFO: model_def_io.c(573): Model definition info:
INFO: model_def_io.c(574): 143097 total models defined (50 base, 143047 tri)
INFO: model_def_io.c(575): 572388 total states
INFO: model_def_io.c(576): 5150 total tied states
INFO: model_def_io.c(577): 150 total tied CI states
INFO: model_def_io.c(578): 50 total tied transition matrices
INFO: model_def_io.c(579): 4 max state/model
INFO: model_def_io.c(580): 4 min state/model
INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
INFO: mod_inv.c(300): inserting tprob floor 1.000000e-04 and renormalizing
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
INFO: gauden.c(183): 1 total mgau
INFO: gauden.c(157): 3 feature streams (|0|=13 |1|=13 |2|=13 )
INFO: gauden.c(194): 256 total densities
INFO: gauden.c(97): min_var=1.000000e-05
INFO: gauden.c(172): compute 4 densities/frame
INFO: main.c(430): Will reestimate mixing weights.
INFO: main.c(432): Will reestimate means.
INFO: main.c(434): Will reestimate variances.
INFO: main.c(442): Will reestimate transition matrices
INFO: main.c(455): Reading main lexicon: arctic.dic
INFO: lexicon.c(220): 179 entries added from arctic.dic
INFO: main.c(465): Reading filler lexicon: hub4wsj_sc_8k/noisedict
INFO: lexicon.c(220): 11 entries added from hub4wsj_sc_8k/noisedict
INFO: corpus.c(1086): Will process all remaining utts starting at 0
INFO: main.c(665): Reestimation: Baum-Welch
INFO: main.c(670): Generating profiling information consumes significant CPU resources.
INFO: main.c(671): If you are not interested in profiling, use -timing no
column defns
<seq>
<id>
<n_frame_in>
<n_frame_del>
<n_state_shmm>
<avg_states_alpha>
<avg_states_beta>
<avg_states_reest>
<avg_posterior_prune>
<frame_log_lik>
<utt_log_lik>
... timing info ...
INFO: cmn.c(175): CMN: 691.88 487.06 -1108.00 -1310.86 461.71 523.99 951.00 -399.06 -1273.62 -1192.80 -804.94 -260.02 -873.70
Segmentation fault
I'm using :
1. sphinxtrain-1.0.8
2. sphinxbase-0.8
3. pocketsphinx-0.8
Your CMN values are abnormal, so something is wrong with the input data or the previous steps. Make sure your input data has the correct format: it must be a 16 kHz, 16-bit, mono file.
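To find such lines in a long bw log, one option is to filter the `CMN:` lines by magnitude. This is a minimal sketch only: the function name is hypothetical, and the default limit of 100 is an illustrative assumption based on the values in this thread (hundreds in the failing run), not a threshold defined by CMUSphinx.

```shell
# Hypothetical helper: print "CMN:" log lines in which any coefficient
# has an absolute value above a limit (default 100, an assumed cutoff).
flag_abnormal_cmn() {
    awk -v limit="${2:-100}" '
        /CMN:/ {
            start = 0
            # Locate the first coefficient, right after the "CMN:" field.
            for (i = 1; i <= NF; i++) if ($i == "CMN:") start = i + 1
            bad = 0
            if (start > 0) {
                for (i = start; i <= NF; i++) {
                    v = $i + 0
                    if (v < 0) v = -v
                    if (v > limit) bad = 1
                }
            }
            if (bad) print
        }' "$1"
}
```

For example, `flag_abnormal_cmn bw.log` (with bw.log being wherever the bw output was saved) prints only the suspicious lines, and a second argument overrides the limit.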
Thanks for your response, Nickolay. I tried your suggestion. After recording my voice, I got the following results:
test1.wav:
File Size: 128k Bit Rate: 256k
Encoding: Signed PCM
Channels: 1 @ 16-bit
Samplerate: 16000Hz
Replaygain: off
Duration: 00:00:04.00
In:100% 00:00:04.00 [00:00:00.00] Out:64.0k [ | ] Hd:3.7 Clip:0
Done.
test2.wav:
File Size: 192k Bit Rate: 256k
Encoding: Signed PCM
Channels: 1 @ 16-bit
Samplerate: 16000Hz
Replaygain: off
Duration: 00:00:06.00
In:100% 00:00:06.00 [00:00:00.00] Out:96.0k [ | ] Clip:0
Done.
I use this command to create the mfc files:
sphinx_fe -argfile hub4wsj_sc_8k/feat.params \
-samprate 16000 -c adaptation-test.fileids \
-di . -do . -ei wav -eo mfc -mswav yes
Is there something wrong with the command?
I also downloaded a wav file from this forum and tried it, with the same error. I'm sending you the file. Thanks for your attention.
Your audio files are ok. Share your mfc files.
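The 16 kHz, 16-bit, mono requirement can also be checked straight from the WAV header without any audio tools. This is a sketch under stated assumptions: it only handles the canonical 44-byte PCM header, it assumes a little-endian machine, and both function names are hypothetical. Files with extra header chunks need a real tool such as sox.

```shell
# Hypothetical helpers. Assume a canonical 44-byte PCM WAV header and a
# little-endian host (od reads the fields in host byte order).

# Print "<channels> <sample rate> <bits per sample>" for a WAV file.
wav_format() {
    ch=$(od -An -tu2 -j22 -N2 "$1" | tr -d ' \n')    # bytes 22-23
    rate=$(od -An -tu4 -j24 -N4 "$1" | tr -d ' \n')  # bytes 24-27
    bits=$(od -An -tu2 -j34 -N2 "$1" | tr -d ' \n')  # bytes 34-35
    echo "$ch $rate $bits"
}

# Succeed only for mono, 16000 Hz, 16-bit input.
is_16k_16bit_mono() {
    [ "$(wav_format "$1")" = "1 16000 16" ]
}
```

A possible use is a loop such as `for f in *.wav; do is_16k_16bit_mono "$f" || echo "check $f"; done` before running sphinx_fe.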
Make sure you installed the CMUSphinx tools properly. You may have an old installation somewhere that is breaking the results.
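One way to look for a leftover installation is to list every copy of a tool that the shell could find, in lookup order. The sketch below is an illustration only: `find_all_on_path` is a hypothetical helper, and it only inspects the search path, so it does not detect stale libraries or tools run by explicit path such as ./bw.

```shell
# Hypothetical helper: print every executable file called $1 found in a
# colon-separated directory list (defaults to $PATH), in lookup order.
find_all_on_path() {
    name=$1
    path_list=${2:-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $path_list; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
    IFS=$old_ifs
}
```

If `find_all_on_path sphinx_fe` prints more than one line, the first one listed is the copy that runs, and it may not be the one that was just installed.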
Yes, you are right. After reinstalling, I ran bw and got the following result:
INFO: main.c(229): Compiled on Jun 3 2014 at 09:05:01
INFO: cmd_ln.c(691): Parsing command line:
./bw \ -hmmdir hub4wsj_sc_8k \ -moddeffn hub4wsj_sc_8k/mdef.txt \ -ts2cbfn .semi. \ -feat 1s_c_d_dd \ -svspec 0-12/13-25/26-38 \ -cmn current \ -agc none \ -dictfn arctic20.dic \ -ctlfn arctic20.fileids \ -lsnfn arctic20.transcription \ -accumdir .
Current configuration:
[NAME] [DEFLT] [VALUE]
-2passvar no no
-abeam 1e-100 1.000000e-100
-accumdir .
-agc none none
-agcthresh 2.0 2.000000e+00
-bbeam 1e-100 1.000000e-100
-cb2mllrfn .1cls. .1cls.
-cepdir
-cepext mfc mfc
-ceplen 13 13
-ckptintv 0
-cmn current current
-cmninit 8.0 8.0
-ctlfn arctic20.fileids
-diagfull no no
-dictfn arctic20.dic
-example no no
-fdictfn
-feat 1s_c_d_dd 1s_c_d_dd
-fullsuffixmatch no no
-fullvar no no
-help no no
-hmmdir hub4wsj_sc_8k
-latdir
-latext
-lda
-ldaaccum no no
-ldadim 0 0
-lsnfn arctic20.transcription
-lw 11.5 1.150000e+01
-maxuttlen 0 0
-meanfn
-meanreest yes yes
-mixwfn
-mixwreest yes yes
-mllrmat
-mmie no no
-mmie_type rand rand
-moddeffn hub4wsj_sc_8k/mdef.txt
-mwfloor 0.00001 1.000000e-05
-npart 0
-nskip 0
-outphsegdir
-outputfullpath no no
-part 0
-pdumpdir
-phsegdir
-phsegext phseg phseg
-runlen -1 -1
-sentdir
-sentext sent sent
-spthresh 0.0 0.000000e+00
-svspec 0-12/13-25/26-38
-timing yes yes
-tmatfn
-tmatreest yes yes
-topn 4 4
-tpfloor 0.0001 1.000000e-04
-ts2cbfn .semi.
-varfloor 0.00001 1.000000e-05
-varfn
-varnorm no no
-varreest yes yes
-viterbi no no
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: main.c(254): Using subvector specification 0-12/13-25/26-38
INFO: main.c(318): Reading hub4wsj_sc_8k/mdef.txt
INFO: model_def_io.c(573): Model definition info:
INFO: model_def_io.c(574): 143097 total models defined (50 base, 143047 tri)
INFO: model_def_io.c(575): 572388 total states
INFO: model_def_io.c(576): 5150 total tied states
INFO: model_def_io.c(577): 150 total tied CI states
INFO: model_def_io.c(578): 50 total tied transition matrices
INFO: model_def_io.c(579): 4 max state/model
INFO: model_def_io.c(580): 4 min state/model
INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
INFO: mod_inv.c(300): inserting tprob floor 1.000000e-04 and renormalizing
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
INFO: gauden.c(183): 1 total mgau
INFO: gauden.c(157): 3 feature streams (|0|=13 |1|=13 |2|=13 )
INFO: gauden.c(194): 256 total densities
INFO: gauden.c(97): min_var=1.000000e-05
INFO: gauden.c(172): compute 4 densities/frame
INFO: main.c(430): Will reestimate mixing weights.
INFO: main.c(432): Will reestimate means.
INFO: main.c(434): Will reestimate variances.
INFO: main.c(442): Will reestimate transition matrices
INFO: main.c(455): Reading main lexicon: arctic20.dic
INFO: lexicon.c(220): 189 entries added from arctic20.dic
INFO: main.c(465): Reading filler lexicon: hub4wsj_sc_8k/noisedict
INFO: lexicon.c(220): 11 entries added from hub4wsj_sc_8k/noisedict
INFO: corpus.c(1086): Will process all remaining utts starting at 0
INFO: main.c(665): Reestimation: Baum-Welch
INFO: main.c(670): Generating profiling information consumes significant CPU resources.
INFO: main.c(671): If you are not interested in profiling, use -timing no
column defns
<seq>
<id>
<n_frame_in>
<n_frame_del>
<n_state_shmm>
<avg_states_alpha>
<avg_states_beta>
<avg_states_reest>
<avg_posterior_prune>
<frame_log_lik>
<utt_log_lik>
... timing info ...
INFO: cmn.c(175): CMN: 43.03 3.93 -0.29 -0.89 0.24 -0.78 -0.84 -0.19 0.82 0.69 -0.05 -0.26 -0.20
utt> 0 arctic_0001 306 0 108 28 9 8 3.616702e-102 -6.253867e+01 -1.913683e+04 utt 0.030x 2.190e upd 0.027x 2.078e fwd 0.007x 1.568e bwd 0.016x 0.929e gau 0.065x 1.185e rsts 0.004x 1.027e rstf 0.001x 0.218e rstu 0.000x 1381477023605.738e
INFO: cmn.c(175): CMN: 41.28 4.07 0.25 -0.85 0.18 -0.80 -1.14 -0.30 0.63 0.57 -0.06 -0.37 -0.20
utt> 1 arctic_0002 255 0 104 30 9 9 2.138527e-102 -6.126169e+01 -1.562173e+04 utt 0.019x 1.210e upd 0.019x 1.198e fwd 0.003x 1.223e bwd 0.014x 1.310e gau 0.093x 1.246e rsts 0.005x 0.852e rstf -0.000x 0.000e rstu 0.002x 0.092e
INFO: cmn.c(175): CMN: 42.80 3.33 1.01 -0.41 -0.41 -1.09 -1.01 -0.06 0.64 0.66 -0.13 -0.42 -0.04
utt> 2 arctic_0003 306 0 108 32 13 10 3.497573e-102 -6.041223e+01 -1.848614e+04 utt 0.021x 1.288e upd 0.021x 1.278e fwd 0.003x 1.270e bwd 0.016x 1.109e gau 0.154x 0.828e rsts 0.007x 1.015e rstf 0.000x 79175131460.121e rstu 0.000x 1297272376800.195e
INFO: cmn.c(175): CMN: 42.14 3.17 0.48 -0.32 -0.36 -1.11 -1.03 -0.26 0.67 0.57 -0.12 -0.36 -0.13
utt> 3 arctic_0004 306 0 104 32 11 11 2.008941e-102 -6.064926e+01 -1.855867e+04 utt 0.022x 1.098e upd 0.022x 1.089e fwd 0.004x 1.306e bwd 0.018x 1.032e gau 0.152x 0.873e rsts 0.008x 1.088e rstf -0.000x 0.000e rstu 0.000x 11504310568606.896e
INFO: cmn.c(175): CMN: 40.75 2.23 0.88 -0.09 -0.22 -0.88 -0.63 -0.28 0.47 0.75 -0.09 -0.40 0.11
utt> 4 arctic_0005 255 0 92 33 15 14 4.459699e-102 -6.112698e+01 -1.558738e+04 utt 0.025x 1.126e upd 0.025x 1.117e fwd 0.005x 0.830e bwd 0.020x 1.171e gau 0.227x 1.042e rsts 0.008x 0.910e rstf -0.000x 0.000e rstu -0.000x 0.000e
overall> stats 1428 (-0) -6.119801e+01 -8.739076e+04 0.024x 1.455e
WARNING: "accum.c", line 617: Over 500 senones never occur in the input data. This is normal for context-dependent untied senone training or for adaptation, but could indicate a serious problem otherwise.
INFO: s3mixw_io.c(232): Wrote ./mixw_counts [5150x3x256 array]
INFO: s3tmat_io.c(174): Wrote ./tmat_counts [50x3x4 array]
INFO: s3gau_io.c(478): Wrote ./gauden_counts with means with vars [1x3x256 vector arrays]
INFO: main.c(1014): Counts saved to .
I see a warning like this:
WARNING: "accum.c", line 617: Over 500 senones never occur in the input data. This is normal for context-dependent untied senone training or for adaptation, but could indicate a serious problem otherwise.
Is that OK? I ask because I got this when running ./word_align.pl adaptation-test.transcription adaptation-test.hyp:
NYALAKAN LAMPU RUANG TAMU SATU (ARCTIC_0001)
(ARCTIC_0001)
Words: 5 Correct: 0 Errors: 5 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
Insertions: 0 Deletions: 5 Substitutions: 0
MATIKAN LAMPU RUANG TAMU SATU (ARCTIC_0002)
(ARCTIC_0002)
Words: 5 Correct: 0 Errors: 5 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
Insertions: 0 Deletions: 5 Substitutions: 0
TOTAL Words: 10 Correct: 0 Errors: 10
TOTAL Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
TOTAL Insertions: 0 Deletions: 10 Substitutions: 0
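The percentages in this report follow from the counts: every reference word is counted as a deletion because the hypothesis lines are empty, so the error rate is (0 + 10 + 0) / 10 = 100%. A small sketch of that arithmetic, assuming the usual definition of error rate as insertions plus deletions plus substitutions over reference words (the helper name is hypothetical, and it does not guard against a word count of zero):

```shell
# Hypothetical helper: error percentage from insertion, deletion and
# substitution counts and the number of reference words.
error_rate() {
    awk -v i="$1" -v d="$2" -v s="$3" -v n="$4" \
        'BEGIN { printf "%.2f\n", (i + d + s) * 100 / n }'
}

error_rate 0 10 0 10
```

A result made up entirely of deletions suggests the decoder returned no words at all, which points at the decoding setup and not at the adaptation counts.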
I tried to adapt the default CMU Sphinx acoustic model to Indonesian. Can I do that, or do I have to build a new one?
There is an error in your decoding setup. To get help with this issue, you need to provide more information, including the commands you are running, details about your system, and your data files.
Last edit: Nickolay V. Shmyrev 2014-06-03
Hi, Nickolay. Thanks for helping me.
I'm using Linux Backtrack 5r3 running on Virtual Machine (VMWare). I'm using :
1. SphinxBase-0.8
2. SphinxTrain-1.0.8
3. PocketSphinx-0.8
I'm using the following commands:
-> Create Wav File
for i in `seq 1 4`; do
fn=`printf arctic_%04d $i`;
read sent; echo $sent;
rec -r 16000 -e signed-integer -b 16 -c 1 $fn.wav 2>/dev/null;
done < arctic20.txt
-> Generating acoustic feature files (MFC File)
sphinx_fe -argfile hub4wsj_sc_8k/feat.params \
-samprate 16000 -c arctic20.fileids \
-di . -do . -ei wav -eo mfc -mswav yes
Result
INFO: cmd_ln.c(691): Parsing command line:
sphinx_fe \ -argfile hub4wsj_sc_8k/feat.params \ -samprate 16000 \ -c arctic20.fileids \ -di . \ -do . \ -ei wav \ -eo mfc \ -mswav yes
Current configuration:
[NAME] [DEFLT] [VALUE]
-alpha 0.97 9.700000e-01
-argfile hub4wsj_sc_8k/feat.params
-blocksize 2048 2048
-build_outdirs yes yes
-c arctic20.fileids
-cep2spec no no
-di .
-dither no no
-do .
-doublebw no no
-ei wav
-eo mfc
-example no no
-frate 100 100
-help no no
-i
-input_endian little little
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.333333e+02
-mach_endian little little
-mswav no yes
-ncep 13 13
-nchans 1 1
-nfft 512 512
-nfilt 40 40
-nist no no
-npart 0 0
-nskip 0 0
-o
-ofmt sphinx sphinx
-part 0 0
-raw no no
-remove_dc no no
-round_filters yes yes
-runlen -1 -1
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-spec2cep no no
-sph2pipe no no
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-whichchan 0 0
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\ -nfilt 20 \ -lowerf 1 \ -upperf 4000 \ -wlen 0.025 \ -transform dct \ -round_filters no \ -remove_dc yes \ -svspec 0-12/13-25/26-38 \ -feat 1s_c_d_dd \ -agc none \ -cmn current \ -cmninit 56,-3,1 \ -varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-alpha 0.97 9.700000e-01
-argfile hub4wsj_sc_8k/feat.params
-blocksize 2048 2048
-build_outdirs yes yes
-c arctic20.fileids
-cep2spec no no
-di .
-dither no no
-do .
-doublebw no no
-ei wav
-eo mfc
-example no no
-frate 100 100
-help no no
-i
-input_endian little little
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.000000e+00
-mach_endian little little
-mswav no yes
-ncep 13 13
-nchans 1 1
-nfft 512 512
-nfilt 40 20
-nist no no
-npart 0 0
-nskip 0 0
-o
-ofmt sphinx sphinx
-part 0 0
-raw no no
-remove_dc no yes
-round_filters yes no
-runlen -1 -1
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-spec2cep no no
-sph2pipe no no
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 4.000000e+03
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-whichchan 0 0
-wlen 0.025625 2.500000e-02
INFO: sphinx_fe.c(1043): Processing all remaining utterances at position 0
-> Accumulating observation counts
./bw \
-hmmdir hub4wsj_sc_8k \
-moddeffn hub4wsj_sc_8k/mdef.txt \
-ts2cbfn .semi. \
-feat 1s_c_d_dd \
-svspec 0-12/13-25/26-38 \
-cmn current \
-agc none \
-dictfn arctic20.dic \
-ctlfn arctic20.fileids \
-lsnfn arctic20.transcription \
-accumdir .
Result
INFO: main.c(229): Compiled on Jun 3 2014 at 09:05:01
INFO: cmd_ln.c(691): Parsing command line:
./bw \ -hmmdir hub4wsj_sc_8k \ -moddeffn hub4wsj_sc_8k/mdef.txt \ -ts2cbfn .semi. \ -feat 1s_c_d_dd \ -svspec 0-12/13-25/26-38 \ -cmn current \ -agc none \ -dictfn arctic20.dic \ -ctlfn arctic20.fileids \ -lsnfn arctic20.transcription \ -accumdir .
Current configuration:
[NAME] [DEFLT] [VALUE]
-2passvar no no
-abeam 1e-100 1.000000e-100
-accumdir .
-agc none none
-agcthresh 2.0 2.000000e+00
-bbeam 1e-100 1.000000e-100
-cb2mllrfn .1cls. .1cls.
-cepdir
-cepext mfc mfc
-ceplen 13 13
-ckptintv 0
-cmn current current
-cmninit 8.0 8.0
-ctlfn arctic20.fileids
-diagfull no no
-dictfn arctic20.dic
-example no no
-fdictfn
-feat 1s_c_d_dd 1s_c_d_dd
-fullsuffixmatch no no
-fullvar no no
-help no no
-hmmdir hub4wsj_sc_8k
-latdir
-latext
-lda
-ldaaccum no no
-ldadim 0 0
-lsnfn arctic20.transcription
-lw 11.5 1.150000e+01
-maxuttlen 0 0
-meanfn
-meanreest yes yes
-mixwfn
-mixwreest yes yes
-mllrmat
-mmie no no
-mmie_type rand rand
-moddeffn hub4wsj_sc_8k/mdef.txt
-mwfloor 0.00001 1.000000e-05
-npart 0
-nskip 0
-outphsegdir
-outputfullpath no no
-part 0
-pdumpdir
-phsegdir
-phsegext phseg phseg
-runlen -1 -1
-sentdir
-sentext sent sent
-spthresh 0.0 0.000000e+00
-svspec 0-12/13-25/26-38
-timing yes yes
-tmatfn
-tmatreest yes yes
-topn 4 4
-tpfloor 0.0001 1.000000e-04
-ts2cbfn .semi.
-varfloor 0.00001 1.000000e-05
-varfn
-varnorm no no
-varreest yes yes
-viterbi no no
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: main.c(254): Using subvector specification 0-12/13-25/26-38
INFO: main.c(318): Reading hub4wsj_sc_8k/mdef.txt
INFO: model_def_io.c(573): Model definition info:
INFO: model_def_io.c(574): 143097 total models defined (50 base, 143047 tri)
INFO: model_def_io.c(575): 572388 total states
INFO: model_def_io.c(576): 5150 total tied states
INFO: model_def_io.c(577): 150 total tied CI states
INFO: model_def_io.c(578): 50 total tied transition matrices
INFO: model_def_io.c(579): 4 max state/model
INFO: model_def_io.c(580): 4 min state/model
INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
INFO: mod_inv.c(300): inserting tprob floor 1.000000e-04 and renormalizing
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
INFO: gauden.c(183): 1 total mgau
INFO: gauden.c(157): 3 feature streams (|0|=13 |1|=13 |2|=13 )
INFO: gauden.c(194): 256 total densities
INFO: gauden.c(97): min_var=1.000000e-05
INFO: gauden.c(172): compute 4 densities/frame
INFO: main.c(430): Will reestimate mixing weights.
INFO: main.c(432): Will reestimate means.
INFO: main.c(434): Will reestimate variances.
INFO: main.c(442): Will reestimate transition matrices
INFO: main.c(455): Reading main lexicon: arctic20.dic
WARNING: "lexicon.c", line 176: Lexicon arctic20.dic has a blank line at line 0
INFO: lexicon.c(220): 189 entries added from arctic20.dic
INFO: main.c(465): Reading filler lexicon: hub4wsj_sc_8k/noisedict
INFO: lexicon.c(220): 11 entries added from hub4wsj_sc_8k/noisedict
INFO: corpus.c(1086): Will process all remaining utts starting at 0
INFO: main.c(665): Reestimation: Baum-Welch
INFO: main.c(670): Generating profiling information consumes significant CPU resources.
INFO: main.c(671): If you are not interested in profiling, use -timing no
column defns
<seq>
<id>
<n_frame_in>
<n_frame_del>
<n_state_shmm>
<avg_states_alpha>
<avg_states_beta>
<avg_states_reest>
<avg_posterior_prune>
<frame_log_lik>
<utt_log_lik>
... timing info ...
INFO: cmn.c(175): CMN: 14.51 7.46 0.27 -1.64 -0.72 -1.37 -0.95 -1.66 -0.27 0.05 -0.30 -1.16 -0.39
utt> 0 arctic_0001 306 0 108 33 10 10 2.562535e-102 -9.040966e+01 -2.766536e+04 utt 0.026x 1.895e upd 0.026x 1.886e fwd 0.004x 1.340e bwd 0.022x 1.969e gau 0.103x 3.025e rsts 0.012x 1.289e rstf -0.000x 0.000e rstu 0.000x 575571466439884.750e
INFO: cmn.c(175): CMN: 12.90 6.86 1.07 -1.22 -0.73 -1.25 -0.73 -1.06 -0.12 -0.31 0.14 -0.82 -0.57
utt> 1 arctic_0002 306 0 104 39 12 10 3.030896e-102 -9.233116e+01 -2.825333e+04 utt 0.025x 1.314e upd 0.025x 1.306e fwd 0.005x 1.422e bwd 0.020x 1.252e gau 0.157x 1.364e rsts 0.004x 1.827e rstf 0.000x 958099437935.158e rstu 0.000x 131265353625996.391e
INFO: cmn.c(175): CMN: 11.70 7.07 2.65 0.30 -1.12 -0.42 -0.49 -1.97 -0.41 -0.71 -0.43 -0.53 0.05
utt> 2 arctic_0003 306 0 104 53 16 12 4.827730e-102 -9.495314e+01 -2.905566e+04 utt 0.043x 1.157e upd 0.043x 1.151e fwd 0.005x 1.042e bwd 0.024x 1.259e gau 0.241x 1.365e rsts 0.008x 1.089e rstf 0.000x 1936019416417.208e rstu 0.014x 1.008e
INFO: cmn.c(175): CMN: 9.52 3.98 0.63 -0.09 -0.22 -0.73 0.19 -2.02 -0.43 -0.11 0.27 -0.43 -0.22
utt> 3 arctic_0004 255 0 92 48 24 17 3.436941e-102 -9.754965e+01 -2.487516e+04 utt 0.039x 1.311e upd 0.039x 1.304e fwd 0.003x 1.800e bwd 0.035x 1.310e gau 0.529x 1.509e rsts 0.011x 1.776e rstf 0.002x 0.236e rstu 0.000x 8933300837518.277e
overall> stats 1173 (-0) -9.364835e+01 -1.098495e+05 0.033x 1.379e
WARNING: "accum.c", line 617: Over 500 senones never occur in the input data. This is normal for context-dependent untied senone training or for adaptation, but could indicate a serious problem otherwise.
INFO: s3mixw_io.c(232): Wrote ./mixw_counts [5150x3x256 array]
INFO: s3tmat_io.c(174): Wrote ./tmat_counts [50x3x4 array]
INFO: s3gau_io.c(478): Wrote ./gauden_counts with means with vars [1x3x256 vector arrays]
INFO: main.c(1014): Counts saved to .
-> Creating transformation with MLLR
./mllr_solve \
-meanfn hub4wsj_sc_8k/means \
-varfn hub4wsj_sc_8k/variances \
-outmllrfn mllr_matrix -accumdir .
Result
INFO: cmd_ln.c(691): Parsing command line:
./mllr_solve \ -meanfn hub4wsj_sc_8k/means \ -varfn hub4wsj_sc_8k/variances \ -outmllrfn mllr_matrix \ -accumdir .
Current configuration:
[NAME] [DEFLT] [VALUE]
-accumdir .,
-cb2mllrfn .1cls. .1cls.
-cdonly no no
-example no no
-fullvar no no
-help no no
-meanfn hub4wsj_sc_8k/means
-mllradd yes yes
-mllrmult yes yes
-moddeffn
-outmllrfn mllr_matrix
-varfloor 1e-3 1.000000e-03
-varfn hub4wsj_sc_8k/variances
INFO: main.c(382): -- 1. Read input mean, (var) and accumulation.
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
INFO: main.c(397): Reading and accumulating counts from .
INFO: s3gau_io.c(379): Read ./gauden_counts with means with vars [1x3x256 vector arrays]
INFO: main.c(436): -- 2. Read cb2mllrfn
INFO: main.c(455): n_mllr_class = 1
INFO: main.c(475): -- 3. Calculate mllr matrices
INFO: main.c(127):
INFO: main.c(128): ---- mllr_solve(): Conventional MLLR method
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
INFO: main.c(208): ---- A. Accum regl, regr
INFO: main.c(209): No classes 1, no. stream 3
INFO: main.c(281): ---- B. Compute MLLR matrices (A,B)
INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
INFO: main.c(497): -- 4. Store mllr matrices (A,B) to mllr_matrix
-> Updating the acoustic model files with MAP
cp -a hub4wsj_sc_8k hub4wsj_sc_8kadapt
./map_adapt \
-meanfn hub4wsj_sc_8k/means \
-varfn hub4wsj_sc_8k/variances \
-mixwfn hub4wsj_sc_8k/mixture_weights \
-tmatfn hub4wsj_sc_8k/transition_matrices \
-accumdir . \
-mapmeanfn hub4wsj_sc_8kadapt/means \
-mapvarfn hub4wsj_sc_8kadapt/variances \
-mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \
-maptmatfn hub4wsj_sc_8kadapt/transition_matrices
Result
INFO: cmd_ln.c(691): Parsing command line:
./map_adapt \ -meanfn hub4wsj_sc_8k/means \ -varfn hub4wsj_sc_8k/variances \ -mixwfn hub4wsj_sc_8k/mixture_weights \ -tmatfn hub4wsj_sc_8k/transition_matrices \ -accumdir . \ -mapmeanfn hub4wsj_sc_8kadapt/means \ -mapvarfn hub4wsj_sc_8kadapt/variances \ -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \ -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
Current configuration:
[NAME] [DEFLT] [VALUE]
-accumdir .,
-bayesmean yes yes
-example no no
-fixedtau no no
-help no no
-mapmeanfn hub4wsj_sc_8kadapt/means
-mapmixwfn hub4wsj_sc_8kadapt/mixture_weights
-maptmatfn hub4wsj_sc_8kadapt/transition_matrices
-mapvarfn hub4wsj_sc_8kadapt/variances
-meanfn hub4wsj_sc_8k/means
-mixwfn hub4wsj_sc_8k/mixture_weights
-mwfloor 0.00001 1.000000e-05
-tau 10.0 1.000000e+01
-tmatfn hub4wsj_sc_8k/transition_matrices
-tpfloor 0.0001 1.000000e-04
-varfloor 0.00001 1.000000e-05
-varfn hub4wsj_sc_8k/variances
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
INFO: main.c(425): Reading and accumulating observation counts from .
INFO: s3gau_io.c(379): Read ./gauden_counts with means with vars [1x3x256 vector arrays]
INFO: s3mixw_io.c(116): Read ./mixw_counts [5150x3x256 array]
INFO: s3tmat_io.c(115): Read ./tmat_counts [50x3x4 array]
INFO: main.c(77): Estimating tau hyperparameter from variances and observations
INFO: main.c(139): Re-estimating mixture weights using MAP
INFO: main.c(194): Re-estimating transition probabilities using MAP
INFO: main.c(496): Re-estimating means using Bayesian interpolation
INFO: main.c(500): Interpolating tau hyperparameter for semi-continuous models
INFO: main.c(502): Re-estimating variances using MAP
INFO: s3gau_io.c(226): Wrote hub4wsj_sc_8kadapt/means [1x3x256 array]
INFO: s3gau_io.c(226): Wrote hub4wsj_sc_8kadapt/variances [1x3x256 array]
INFO: s3mixw_io.c(232): Wrote hub4wsj_sc_8kadapt/mixture_weights [5150x3x256 array]
INFO: s3tmat_io.c(174): Wrote hub4wsj_sc_8kadapt/transition_matrices [50x3x4 array]
-> Recreating the adapted sendump file
./mk_s2sendump \
-pocketsphinx yes \
-moddeffn hub4wsj_sc_8kadapt/mdef.txt \
-mixwfn hub4wsj_sc_8kadapt/mixture_weights \
-sendumpfn hub4wsj_sc_8kadapt/sendump
Result
INFO: cmd_ln.c(691): Parsing command line:
./mk_s2sendump \ -pocketsphinx yes \ -moddeffn hub4wsj_sc_8kadapt/mdef.txt \ -mixwfn hub4wsj_sc_8kadapt/mixture_weights \ -sendumpfn hub4wsj_sc_8kadapt/sendump
Current configuration:
[NAME] [DEFLT] [VALUE]
-example no no
-help no no
-mixwfn hub4wsj_sc_8kadapt/mixture_weights
-moddeffn hub4wsj_sc_8kadapt/mdef.txt
-mwfloor 0.00001 1.000000e-05
-pocketsphinx no yes
-sendumpfn hub4wsj_sc_8kadapt/sendump
INFO: model_def_io.c(573): Model definition info:
INFO: model_def_io.c(574): 143097 total models defined (50 base, 143047 tri)
INFO: model_def_io.c(575): 572388 total states
INFO: model_def_io.c(576): 5150 total tied states
INFO: model_def_io.c(577): 150 total tied CI states
INFO: model_def_io.c(578): 50 total tied transition matrices
INFO: model_def_io.c(579): 4 max state/model
INFO: model_def_io.c(580): 4 min state/model
INFO: senone.c(210): Reading senone mixture weights: hub4wsj_sc_8kadapt/mixture_weights
INFO: senone.c(330): Read mixture weights for 5150 senones: 3 features x 256 codewords
INFO: mk_s2sendump.c(207): Writing PocketSphinx format sendump file: hub4wsj_sc_8kadapt/sendump
After doing all of that, I copied the new acoustic model, the new language model, and the files needed for testing to a new folder named Testing Adaptation. Then I ran the following command:
pocketsphinx_batch \
-adcin yes \
-cepdir wav \
-cepext .wav \
-ctl adaptation-test.fileids \
-lm <your.lm> \
-dict <your.dic, for example arctic.dic> \
-hmm <your_new_adapted_model, for example hub4wsj_sc_8kadapt> \
-hyp adaptation-test.hyp
Result
INFO: cmd_ln.c(691): Parsing command line:
pocketsphinx_batch \ -adcin yes \ -cepdir wav \ -cepext .wav \ -ctl adaptation-test.fileids \ -lm corpus.lm \ -dict arctic20.dic \ -hmm hub4wsj_sc_8kadapt \ -hyp adaptation-test.hyp
Current configuration:
[NAME] [DEFLT] [VALUE]
-adchdr 0 0
-adcin no yes
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-build_outdirs yes yes
-cepdir wav
-cepext .mfc .wav
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-ctl adaptation-test.fileids
-ctlcount -1 -1
-ctlincr 1 1
-ctloffset 0 0
-ctm
-debug 0
-dict arctic20.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgctl
-fsgdir
-fsgext
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm hub4wsj_sc_8kadapt
-hyp adaptation-test.hyp
-hypseg
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm corpus.lm
-lmctl
-lmname default default
-lmnamectl
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mllrctl
-mllrdir
-mllrext
-mmap yes yes
-nbest 0 0
-nbestdir
-nbestext .hyp .hyp
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-outlatbeam 1e-5 1.000000e-05
-outlatdir
-outlatext .lat .lat
-outlatfmt s3 s3
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senin no no
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\ -nfilt 20 \ -lowerf 1 \ -upperf 4000 \ -wlen 0.025 \ -transform dct \ -round_filters no \ -remove_dc yes \ -svspec 0-12/13-25/26-38 \ -feat 1s_c_d_dd \ -agc none \ -cmn current \ -cmninit 56,-3,1 \ -varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 56,-3,1
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.000000e+00
-ncep 13 13
-nfft 512 512
-nfilt 40 20
-remove_dc no yes
-round_filters yes no
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 4.000000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.500000e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from hub4wsj_sc_8kadapt/feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(167): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(517): Reading model definition: hub4wsj_sc_8kadapt/mdef
INFO: mdef.c(528): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: hub4wsj_sc_8kadapt/mdef
INFO: bin_mdef.c(513): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: hub4wsj_sc_8kadapt/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: hub4wsj_sc_8kadapt/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: hub4wsj_sc_8kadapt/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(903): Loading senones from dump file hub4wsj_sc_8kadapt/sendump
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(990): Rows: 256, Columns: 5150
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(317): Allocating 4297 * 20 bytes (83 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: arctic20.dic
INFO: dict.c(211): Allocated 1 KiB for strings, 1 KiB for phones
INFO: dict.c(335): 189 words read
INFO: dict.c(341): Reading filler dictionary: hub4wsj_sc_8kadapt/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 11 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(477): ngrams 1=16, 2=14, 3=14
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(516): 16 = #unigrams created
INFO: ngram_model_arpa.c(195): Reading bigrams
INFO: ngram_model_arpa.c(533): 14 = #bigrams created
INFO: ngram_model_arpa.c(534): 2 = #prob2 entries
INFO: ngram_model_arpa.c(542): 3 = #bo_wt2 entries
INFO: ngram_model_arpa.c(292): Reading trigrams
INFO: ngram_model_arpa.c(555): 14 = #trigrams created
INFO: ngram_model_arpa.c(556): 2 = #prob3 entries
INFO: ngram_search_fwdtree.c(99): 125 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 15 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 15 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 128
ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary
INFO: ngram_search_fwdtree.c(338): after: 0 root, 0 non-root channels, 11 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: cmn.c(175): CMN: 14.69 7.46 0.27 -1.64 -0.73 -1.39 -0.95 -1.66 -0.28 0.04 -0.31 -1.17 -0.39
INFO: ngram_search_fwdtree.c(1549): 3018 words recognized (10/fr)
INFO: ngram_search_fwdtree.c(1551): 8163 senones evaluated (27/fr)
INFO: ngram_search_fwdtree.c(1553): 3133 channels searched (10/fr), 0 1st, 3133 last
INFO: ngram_search_fwdtree.c(1557): 3133 words for which last channels evaluated (10/fr)
INFO: ngram_search_fwdtree.c(1560): 0 candidate words for entering last phone (0/fr)
INFO: ngram_search_fwdtree.c(1562): fwdtree 0.04 CPU 0.012 xRT
INFO: ngram_search_fwdtree.c(1565): fwdtree 0.04 wall 0.014 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 1 words
INFO: ngram_search_fwdflat.c(937): 308 words recognized (1/fr)
INFO: ngram_search_fwdflat.c(939): 918 senones evaluated (3/fr)
INFO: ngram_search_fwdflat.c(941): 373 channels searched (1/fr)
INFO: ngram_search_fwdflat.c(943): 373 words searched (1/fr)
INFO: ngram_search_fwdflat.c(945): 26 word transitions (0/fr)
INFO: ngram_search_fwdflat.c(948): fwdflat 0.01 CPU 0.003 xRT
INFO: ngram_search_fwdflat.c(951): fwdflat 0.02 wall 0.005 xRT
INFO: ngram_search.c(1214): </s> not found in last frame, using <sil>.305 instead
INFO: ngram_search.c(1266): lattice start node <s>.0 end node <sil>.2
INFO: ngram_search.c(1294): Eliminated 1 nodes before end node
INFO: ngram_search.c(1399): Lattice has 3 nodes, 1 links
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(<sil>:2:305) = -2882525
INFO: ps_lattice.c(1403): Joint P(O,S) = -2882525 P(S|O) = 0
INFO: ngram_search.c(888): bestpath -0.00 CPU -0.000 xRT
INFO: ngram_search.c(891): bestpath 0.00 wall 0.000 xRT
INFO: batch.c(760): test1: 3.06 seconds speech, 0.04 seconds CPU, 0.06 seconds wall
INFO: batch.c(762): test1: 0.01 xRT (CPU), 0.02 xRT (elapsed)
INFO: cmn.c(175): CMN: 13.08 6.85 1.08 -1.26 -0.74 -1.25 -0.75 -1.07 -0.12 -0.31 0.13 -0.81 -0.57
INFO: ngram_search_fwdtree.c(1549): 3035 words recognized (10/fr)
INFO: ngram_search_fwdtree.c(1551): 8190 senones evaluated (27/fr)
INFO: ngram_search_fwdtree.c(1553): 3163 channels searched (10/fr), 0 1st, 3163 last
INFO: ngram_search_fwdtree.c(1557): 3163 words for which last channels evaluated (10/fr)
INFO: ngram_search_fwdtree.c(1560): 0 candidate words for entering last phone (0/fr)
INFO: ngram_search_fwdtree.c(1562): fwdtree 0.02 CPU 0.007 xRT
INFO: ngram_search_fwdtree.c(1565): fwdtree 0.03 wall 0.009 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 1 words
INFO: ngram_search_fwdflat.c(937): 308 words recognized (1/fr)
INFO: ngram_search_fwdflat.c(939): 918 senones evaluated (3/fr)
INFO: ngram_search_fwdflat.c(941): 369 channels searched (1/fr)
INFO: ngram_search_fwdflat.c(943): 369 words searched (1/fr)
INFO: ngram_search_fwdflat.c(945): 26 word transitions (0/fr)
INFO: ngram_search_fwdflat.c(948): fwdflat 0.01 CPU 0.003 xRT
INFO: ngram_search_fwdflat.c(951): fwdflat 0.01 wall 0.003 xRT
INFO: ngram_search.c(1214):
INFO: ngram_search.c(1266): lattice start node <s>.0 end node <sil>.242
INFO: ngram_search.c(1294): Eliminated 0 nodes before end node
INFO: ngram_search.c(1399): Lattice has 3 nodes, 2 links
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(<sil>:242:305) = -2856771
INFO: ps_lattice.c(1403): Joint P(O,S) = -2856771 P(S|O) = 0
INFO: ngram_search.c(888): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(891): bestpath 0.00 wall 0.000 xRT
INFO: batch.c(760): test2: 3.06 seconds speech, 0.03 seconds CPU, 0.04 seconds wall
INFO: batch.c(762): test2: 0.01 xRT (CPU), 0.01 xRT (elapsed)
INFO: batch.c(774): TOTAL 6.12 seconds speech, 0.07 seconds CPU, 0.10 seconds wall
INFO: batch.c(776): AVERAGE 0.01 xRT (CPU), 0.02 xRT (elapsed)
INFO: ngram_search_fwdtree.c(430): TOTAL fwdtree 0.06 CPU 0.009 xRT
INFO: ngram_search_fwdtree.c(433): TOTAL fwdtree 0.07 wall 0.012 xRT
INFO: ngram_search_fwdflat.c(174): TOTAL fwdflat 0.02 CPU 0.003 xRT
INFO: ngram_search_fwdflat.c(177): TOTAL fwdflat 0.02 wall 0.004 xRT
INFO: ngram_search.c(317): TOTAL bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(320): TOTAL bestpath 0.00 wall 0.000 xRT
And I get an error when running this command:
word_align.pl adaptation-test.transcription adaptation-test.hyp
Result
NYALAKAN LAMPU RUANG TAMU SATU (TEST1)
(TEST1)
Words: 5 Correct: 0 Errors: 5 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
Insertions: 0 Deletions: 5 Substitutions: 0
MATIKAN LAMPU RUANG TAMU SATU (TEST2)
(TEST2)
Words: 5 Correct: 0 Errors: 5 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
Insertions: 0 Deletions: 5 Substitutions: 0
TOTAL Words: 10 Correct: 0 Errors: 10
TOTAL Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
TOTAL Insertions: 0 Deletions: 10 Substitutions: 0
Am I doing something wrong?
Is my new language model wrong?
I am trying to adapt the default CMU Sphinx acoustic model to the Indonesian language. Can I do that, or do I have to build a new one?
Thank you so much.
You need to provide data files, not just the logs.
Last edit: Nickolay V. Shmyrev 2014-06-04
I am sending you the files: Adapting Files, and Testing Adaptation and LM.
Thanks
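Your corpus.lm and your dictionary don't match on case: the dictionary is uppercase, the LM is lowercase. This is what the error during decoding is telling you:
ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary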
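A case mismatch like this one can be caught before decoding by comparing the LM unigrams against the dictionary entries. The following is only a minimal sketch, not part of the CMUSphinx tools: the helper names are hypothetical, and the sample LM and dictionary text (including the phone strings) are made up for illustration and stand in for the contents of corpus.lm and arctic20.dic. It assumes an ARPA-format LM with a `\1-grams:` section and a plain-text dictionary with one word per line.

```python
import re

# Illustrative stand-ins for corpus.lm and arctic20.dic (NOT the real files).
SAMPLE_LM = r"""
\data\
ngram 1=4

\1-grams:
-0.60 </s>
-0.60 <s> -0.30
-0.90 lampu -0.30
-0.90 nyalakan -0.30

\end\
"""

SAMPLE_DICT = """\
LAMPU L AA M P UW
NYALAKAN N Y AA L AA K AA N
"""


def lm_unigrams(arpa_text):
    # Collect the words listed in the 1-grams section of an ARPA LM.
    words = set()
    in_unigrams = False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line.startswith("\\"):
            # Section markers such as \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = (line == "\\1-grams:")
            continue
        fields = line.split()
        if in_unigrams and len(fields) >= 2:
            words.add(fields[1])  # format: logprob word [backoff]
    return words


def dict_words(dict_text):
    # Collect head words, dropping alternate-pronunciation suffixes like WORD(2).
    words = set()
    for line in dict_text.splitlines():
        fields = line.split()
        if fields:
            words.add(re.sub(r"\(\d+\)$", "", fields[0]))
    return words


def missing_words(lm_words, dictionary):
    # LM words with no dictionary entry, ignoring sentence markers.
    ignore = {"<s>", "</s>", "<unk>"}
    return sorted(w for w in lm_words - dictionary if w not in ignore)


def case_only_mismatches(lm_words, dictionary):
    # Missing words that would be found if case were ignored.
    upper = {w.upper() for w in dictionary}
    return [w for w in missing_words(lm_words, dictionary) if w.upper() in upper]


lm = lm_unigrams(SAMPLE_LM)
dic = dict_words(SAMPLE_DICT)
print("missing from dictionary:", missing_words(lm, dic))
print("differ only by case:", case_only_mismatches(lm, dic))
```

If every missing word also shows up in the case-only list, converting either the LM or the dictionary to a single case should be enough. Replacing the sample strings with the text read from the real files is left to the reader.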
It works now. After running word_align.pl adaptation-test.transcription adaptation-test.hyp, I got the following result.
nyalakan lampu ruang tamu satu (TEST1)
ANAK nyalakan lampu ruang tamu satu (TEST1)
Words: 5 Correct: 5 Errors: 1 Percent correct = 100.00% Error = 20.00% Accuracy = 80.00%
Insertions: 1 Deletions: 0 Substitutions: 0
matikan lampu RUANG TAMU satu (TEST2)
matikan lampu MATIKAN satu (TEST2)
Words: 5 Correct: 3 Errors: 2 Percent correct = 60.00% Error = 40.00% Accuracy = 60.00%
Insertions: 0 Deletions: 1 Substitutions: 1
TOTAL Words: 10 Correct: 8 Errors: 3
TOTAL Percent correct = 80.00% Error = 30.00% Accuracy = 70.00%
TOTAL Insertions: 1 Deletions: 1 Substitutions: 1
Thanks for your help.
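For anyone reading these numbers later, the percentages in the word_align.pl output above are consistent with the usual word error rate arithmetic. The sketch below is inferred from the figures shown in this thread, not taken from the word_align.pl source, and the function name is hypothetical.

```python
def alignment_scores(words, insertions, deletions, substitutions):
    # words: number of words in the reference transcription.
    errors = insertions + deletions + substitutions
    # Insertions do not reduce the count of correctly recognized reference words.
    correct = words - deletions - substitutions
    error_rate = 100.0 * errors / words
    return {
        "correct": correct,
        "errors": errors,
        "percent_correct": 100.0 * correct / words,
        "error": error_rate,
        "accuracy": 100.0 - error_rate,
    }


# TOTAL line from the result above: 10 words, 1 insertion, 1 deletion, 1 substitution.
print(alignment_scores(10, 1, 1, 1))
```

This also explains why TEST1 shows 100% correct but only 80% accuracy: all five reference words were recognized, but the extra inserted word ANAK counts as one error out of five words.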