Dear Sit,
I am planning to do MAP adaptation. In the Baum-Welch (bw) step I am getting
an error.
./bw -hmmdir largevoctel -moddeffn largevoctel/mdef -ts2cbfn .cont. -feat
1s_c_d_dd -cmn current -agc none -dictfn adapt.dic -ctlfn adapt.fileids -lsnfn
adapt.transcription -accumdir .
Output:
INFO: main.c(194): Compiled on May 13 2011 at 11:30:51
INFO: cmd_ln.c(559): Parsing command line:
./bw \
-hmmdir largevoctel \
-moddeffn largevoctel/mdef \
-ts2cbfn .cont. \
-feat 1s_c_d_dd \
-cmn current \
-agc none \
-dictfn adapt.dic \
-ctlfn adapt.fileids \
-lsnfn adapt.transcription \
-accumdir .
Current configuration:
-2passvar no no
-abeam 1e-100 1.000000e-100
-accumdir .
-agc none none
-agcthresh 2.0 2.000000e+00
-bbeam 1e-100 1.000000e-100
-cb2mllrfn .1cls. .1cls.
-cepdir
-cepext .mfc .mfc
-ceplen 13 13
-ckptintv 0
-cmn current current
-cmninit 8.0 8.0
-ctlfn adapt.fileids
-diagfull no no
-dictfn adapt.dic
-example no no
-fdictfn
-feat 1s_c_d_dd 1s_c_d_dd
-fullsuffixmatch no no
-fullvar no no
-help no no
-hmmdir largevoctel
-latdir
-latext
-lda
-ldaaccum no no
-ldadim 0 0
-lsnfn adapt.transcription
-ltsoov no no
-lw 11.5 1.150000e+01
-maxuttlen 0 0
-meanfn
-meanreest yes yes
-mixwfn
-mixwreest yes yes
-mllrmat
-mmie no no
-mmie_type rand rand
-moddeffn largevoctel/mdef
-mwfloor 0.00001 1.000000e-05
-npart 0
-nskip 0
-outphsegdir
-outputfullpath no no
-part 0
-pdumpdir
-phsegdir
-phsegext phseg phseg
-runlen -1 -1
-sentdir
-sentext sent sent
-spthresh 0.0 0.000000e+00
-svspec
-timing yes yes
-tmatfn
-tmatreest yes yes
-topn 4 4
-tpfloor 0.0001 1.000000e-04
-ts2cbfn .cont.
-varfloor 0.00001 1.000000e-05
-varfn
-varnorm no no
-varreest yes yes
-viterbi no no
INFO: feat.c(697): Initializing feature stream to type: '1s_c_d_dd',
ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: main.c(283): Reading largevoctel/mdef
INFO: model_def_io.c(587): Model definition info:
INFO: model_def_io.c(588): 46469 total models defined (53 base, 46416 tri)
INFO: model_def_io.c(589): 185876 total states
INFO: model_def_io.c(590): 1159 total tied states
INFO: model_def_io.c(591): 159 total tied CI states
INFO: model_def_io.c(592): 53 total tied transition matrices
INFO: model_def_io.c(593): 4 max state/model
INFO: model_def_io.c(594): 4 min state/model
INFO: s3mixw_io.c(116): Read largevoctel/mixture_weights
INFO: s3tmat_io.c(115): Read largevoctel/transition_matrices
INFO: mod_inv.c(300): inserting tprob floor 1.000000e-04 and renormalizing
INFO: s3gau_io.c(166): Read largevoctel/means
INFO: s3gau_io.c(166): Read largevoctel/variances
INFO: gauden.c(180): 1159 total mgau
INFO: gauden.c(154): 1 feature streams (|0|=39 )
INFO: gauden.c(191): 8 total densities
INFO: gauden.c(97): min_var=1.000000e-05
INFO: gauden.c(169): compute 4 densities/frame
INFO: main.c(395): Will reestimate mixing weights.
INFO: main.c(397): Will reestimate means.
INFO: main.c(399): Will reestimate variances.
INFO: main.c(407): Will reestimate transition matrices
INFO: main.c(420): Reading main lexicon: adapt.dic
INFO: lexicon.c(233): 1261 entries added from adapt.dic
INFO: main.c(432): Reading filler lexicon: largevoctel/noisedict
INFO: lexicon.c(233): 4 entries added from largevoctel/noisedict
INFO: corpus.c(1281): Will process all remaining utts starting at 0
INFO: main.c(639): Reestimation: Baum-Welch
INFO: main.c(644): Generating profiling information consumes significant CPU
resources.
INFO: main.c(645): If you are not interested in profiling, use -timing no
column defns
<seq>
<id>
<n_frame_in>
<n_frame_del>
<n_state_shmm>
<avg_states_alpha>
<avg_states_beta>
<avg_states_reest>
<avg_posterior_prune>
<frame_log_lik>
<utt_log_lik>
... timing info ...
utt> 0 820-6stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
stat_retry(820-6..mfc) failed
ERROR: "corpus.c", line 1555: MFCC read of 820-6..mfc failed. Retrying after
sleep...
^C
But I have the following files in the current directory:
drwxrwxr-x 3 lahari lahari 4096 May 26 23:17 .
drwxrwxr-x 9 lahari lahari 4096 May 26 22:01 ..
-rw-rw-r-- 1 lahari lahari 407112 May 26 22:42 820-6.mfc
-rw-rw-r-- 1 lahari lahari 2505680 May 25 17:42 820-6.wav
-rw-rw-r-- 1 lahari lahari 404980 May 26 22:42 830-6.mfc
-rw-rw-r-- 1 lahari lahari 2492682 May 25 17:50 830-6.wav
-rw-rw-r-- 1 lahari lahari 395880 May 26 22:42 840-6.mfc
-rw-rw-r-- 1 lahari lahari 2436648 May 25 17:57 840-6.wav
-rw-rw-r-- 1 lahari lahari 352200 May 26 22:42 850-6.mfc
-rw-rw-r-- 1 lahari lahari 2167860 May 25 18:03 850-6.wav
-rw-rw-r-- 1 lahari lahari 445644 May 26 22:42 860-6.mfc
-rw-rw-r-- 1 lahari lahari 2742738 May 26 11:03 860-6.wav
-rw-rw-r-- 1 lahari lahari 472840 May 26 22:43 870-6.mfc
-rw-rw-r-- 1 lahari lahari 2910106 May 26 11:10 870-6.wav
-rw-rw-r-- 1 lahari lahari 528428 May 26 22:43 880-6.mfc
-rw-rw-r-- 1 lahari lahari 3252144 May 26 11:17 880-6.wav
-rw-rw-r-- 1 lahari lahari 494212 May 26 22:43 890-6.mfc
-rw-rw-r-- 1 lahari lahari 3041726 May 26 11:26 890-6.wav
-rw-rw-r-- 1 lahari lahari 529364 May 26 22:43 900-6.mfc
-rw-rw-r-- 1 lahari lahari 3258030 May 26 11:36 900-6.wav
-rw-rw-r-- 1 lahari lahari 515740 May 26 22:43 910-6.mfc
-rw-rw-r-- 1 lahari lahari 3174028 May 26 11:46 910-6.wav
-rw-rw-r-- 1 lahari lahari 545068 May 26 22:43 920-6.mfc
-rw-rw-r-- 1 lahari lahari 3354758 May 26 11:58 920-6.wav
-rw-rw-r-- 1 lahari lahari 467068 May 26 22:43 930-6.mfc
-rw-rw-r-- 1 lahari lahari 2874652 May 26 12:08 930-6.wav
-rw-rw-r-- 1 lahari lahari 50261 May 26 22:13 adapt.dic
-rw-rw-r-- 1 lahari lahari 72 May 26 22:41 adapt.fileids
-rw-rw-r-- 1 lahari lahari 34187 May 25 16:43 adapting-text
-rw-rw-r-- 1 lahari lahari 35111 May 26 22:17 adapt.transcription
-rw-rw-r-- 1 lahari lahari 34187 May 26 22:44 adapt.txt
-rwxrwxr-x 1 lahari lahari 587716 May 13 11:30 bw
-rwxrwxrwx 1 lahari lahari 1157 May 26 22:02 get_uni_words.pl
drwxrwxr-x 3 lahari lahari 4096 May 22 00:24 largevoctel
-rwxrwxr-x 1 lahari lahari 175266 May 13 11:30 map_adapt
-rwxrwxr-x 1 lahari lahari 94425 May 13 11:31 mk_s2sendump
-rwxrwxr-x 1 lahari lahari 232040 May 13 11:31 mllr_solve
-rwxrwxr-x 1 lahari lahari 132424 May 13 11:31 mllr_transform
-rw-rw-r-- 1 lahari lahari 37 May 22 00:24 noisedict
-rwxrwxrwx 1 lahari lahari 503 Dec 29 2009 tel-mapping-table
-rwxrwxrwx 1 lahari lahari 673 May 26 22:17 telugu-transcription.pl
-rw-rw-r-- 1 lahari lahari 34183 May 26 22:02 total-corpus.txt
-rw-rw-r-- 1 lahari lahari 30845 May 26 22:02 total-corpus-words
-rw-rw-r-- 1 lahari lahari 37145 May 26 22:02 total-corpus-words-1
-rw-rw-r-- 1 lahari lahari 105 May 26 22:14 total-phones
-rwxrwxrwx 1 lahari lahari 7383 May 12 13:54 unigraphemetophoneme.pl
My adapt.fileids contains the following lines:
820-6
830-6
840-6
850-6
860-6
870-6
880-6
890-6
900-6
910-6
920-6
930-6
Please let me know where I made a mistake.
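A quick way to rule out a genuinely missing file, before suspecting the tool itself, is to check every entry of the control file against the feature directory. The following is a minimal sketch and not part of SphinxTrain: the helper name `missing_features` is hypothetical, and it assumes the feature files sit in one directory and are named `<fileid>.mfc`, as in the listing above.

```python
import os


def missing_features(ctl_path, cepdir=".", cepext="mfc"):
    """Return the file IDs listed in ctl_path that have no feature file.

    A feature file is expected at <cepdir>/<fileid>.<cepext>.
    """
    ext = cepext.lstrip(".")  # accept both "mfc" and ".mfc"
    missing = []
    with open(ctl_path) as ctl:
        for line in ctl:
            fileid = line.strip()
            if not fileid:
                continue  # ignore blank lines in the control file
            path = os.path.join(cepdir, fileid + "." + ext)
            if not os.path.isfile(path):
                missing.append(fileid)
    return missing
```

Calling `missing_features("adapt.fileids")` in the adaptation directory returns an empty list when every listed utterance has its `.mfc` file, which would suggest the problem lies in how the tool builds the path rather than in the data.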
Sorry for the typing mistake. It should be "Sir", not "Sit".
Sir, can you please tell me what the mistake is? As far as I can tell, the
problem is in the source code, because it is looking for "820-6..mfc" (with
two dots) instead of "820-6.mfc".
This bug has been fixed in trunk; please update.
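The doubled dot in "820-6..mfc" is consistent with a path being assembled as file ID + "." + extension while the extension (shown as `.mfc` for `-cepext` in the configuration dump) already carries its own leading dot. The sketch below only illustrates that failure mode. It is an assumption about the cause, not the actual SphinxTrain code, and both function names are hypothetical.

```python
def naive_feature_path(fileid, cepext):
    """Join with a hard-coded dot; yields a double dot if cepext starts with one."""
    return fileid + "." + cepext


def safe_feature_path(fileid, cepext):
    """Strip any leading dot from the extension before joining."""
    return fileid + "." + cepext.lstrip(".")


if __name__ == "__main__":
    # The naive join reproduces the file name seen in the error message.
    print(naive_feature_path("820-6", ".mfc"))
    # The normalizing join gives the same result with or without the dot.
    print(safe_feature_path("820-6", ".mfc"))
    print(safe_feature_path("820-6", "mfc"))
```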
Dear Sir,
After I have completed the adaptation, I am getting random results. I am
confident that I have followed your steps correctly. I am uploading my
adaptation files and test data. Can you please tell me where the mistake is?
Some information about the data:
largevoctel contains the models trained on 20 hours of speech data from 40
speakers.
largenewmodel contains the files after adaptation.
These are the adaptation files:
adapt.fileids
adapt.dic
adapt.transcription
adapt.txt
http://www.4shared.com/file/g-bpRa6b/testdatasettar.html
http://www.4shared.com/file/BKLW4zcv/adaptationtar.html
Sir, if you get time, please let me know what the error is. I think there is
some problem with the MAP adaptation.
Sorry, but to make it easier for me, can you please provide a ready-to-run
setup? What exact command do I need to invoke (try to make it a single
command)? What result am I expected to see? What do you think is wrong with
this result? What is your expectation?
Dear Sir,
I have solved the problem. I had made a mistake while testing after
adaptation.
Before adaptation:
Percentage of words correctly recognized = 79.94%, Word Error Rate = 26.07%,
Accuracy = 73.93%
After Adaptation:
Percentage of words correctly recognized = 95.63%, Word Error Rate = 5.54%,
Accuracy = 94.46%
There is an excellent improvement after adaptation. I am very thankful to you
for your pointers.
I am very sorry for the mistake, and I am very grateful for your quick
solutions and support.
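The figures reported above can be related to each other with a little arithmetic. The sketch below assumes the usual scoring conventions, which the posts do not state explicitly: accuracy is 100 minus the word error rate, and the gap between "words correctly recognized" and accuracy is the insertion rate. The function names are illustrative only.

```python
def insertion_rate(percent_correct, accuracy):
    """Insertions as a percentage of reference words (assumed convention)."""
    return percent_correct - accuracy


def relative_wer_reduction(wer_before, wer_after):
    """Relative reduction in word error rate, in percent."""
    return 100.0 * (wer_before - wer_after) / wer_before


# Numbers copied from the post above.
before = {"correct": 79.94, "wer": 26.07, "accuracy": 73.93}
after = {"correct": 95.63, "wer": 5.54, "accuracy": 94.46}

if __name__ == "__main__":
    for label, r in (("before", before), ("after", after)):
        # Check that accuracy and WER add up to 100% as assumed.
        consistent = round(100.0 - r["wer"], 2) == r["accuracy"]
        print(label, "accuracy = 100 - WER:", consistent)
        print(label, "insertion rate: %.2f%%"
              % insertion_rate(r["correct"], r["accuracy"]))
    print("relative WER reduction: %.1f%%"
          % relative_wer_reduction(before["wer"], after["wer"]))
```

Under these assumptions, the absolute WER drop from 26.07% to 5.54% corresponds to a relative reduction of roughly 79%.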