Hello,
I need to adapt the hub4wsj_sc_8k acoustic model provided with pocketsphinx
0.6. I have some questions about the adaptation process described at
http://cmusphinx.sourceforge.net/wiki/AcousticModelAdaptation
1) What is the sampling rate at which the acoustic model hub4wsj_sc_8k was
built?
2) How can I get the mixture_weights file for this acoustic model?
3) At what sampling rate should I record the input audio?
Thank you.
Hi,
1) WSJ audio is 16 kHz; however, the hub4wsj filters strip everything above
4 kHz (see upperf in feat.params). That basically means you can use any audio,
even 8 kHz.
2) Ask the developer (David Huggins-Daines) to share it. As I remember, it was
available in the SVN history but was removed to save memory.
3) It depends on your recording capabilities. If you are not limited to a
telephone line and can record at 16 kHz, do that.
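The point about upperf comes down to simple arithmetic: a recording sampled at a given rate can only represent frequencies up to half that rate, so an 8 kHz recording already reaches the 4 kHz the filters keep. Here is a small sketch of that check. The function names are made up for illustration, and the 4000 Hz value is taken from the answer above, so verify it against the feat.params shipped with your copy of the model.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a recording at this sample rate can represent."""
    return sample_rate_hz / 2.0


def covers_filterbank(sample_rate_hz: float, upperf_hz: float) -> bool:
    """True if the recording holds every frequency the filterbank uses."""
    return nyquist_hz(sample_rate_hz) >= upperf_hz


# Assumed value, based on "strip everything above 4 kHz"; check feat.params.
UPPERF_HZ = 4000.0

for rate in (8000, 16000):
    print(rate, nyquist_hz(rate), covers_filterbank(rate, UPPERF_HZ))
```

Both 8 kHz and 16 kHz pass this check, which is why either rate is workable for this model.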
Thank you.
Does this mean that I should sample at 16 kHz and then follow the procedure
given in the wiki?
Missing files were added back to trunk today
Yes
Thank you. As per your suggestion, I contacted David Huggins-Daines.
Hello,
I have a problem training the model from the SVN repo as described in the wiki.
1) I recorded the audio at 16 kHz and saved it in raw format.
2) The features were extracted using
3) Now when I run
../doc/bw -hmmdir adapt -moddeffn adapt/mdef.txt -ts2cbfn .semi. -feat 1s_c_d_dd -cmn current -agc none -dictfn ../doc/a.dic -ctlfn ../doc/a.listoffiles -lsnfn ../doc/a.transcription -accumdir .
I get
INFO: main.c(196): Compiled on Apr 17 2010 at 15:02:54
../doc/bw \
-hmmdir adapt \
-moddeffn adapt/mdef.txt \
-ts2cbfn .semi. \
-feat 1s_c_d_dd \
-cmn current \
-agc none \
-dictfn ../doc/a.dic \
-ctlfn ../doc/a.listoffiles \
-lsnfn ../doc/a.transcription \
-accumdir .
-help             no          no
-example          no          no
-hmmdir                       adapt
-moddeffn                     adapt/mdef.txt
-tmatfn
-mixwfn
-meanfn
-varfn
-fullvar          no          no
-diagfull         no          no
-mwfloor          0.00001     1.000000e-05
-tpfloor          0.0001      1.000000e-04
-varfloor         0.00001     1.000000e-05
-topn             4           4
-dictfn                       ../doc/a.dic
-fdictfn
-ltsoov           no          no
-ctlfn                        ../doc/a.listoffiles
-nskip
-runlen           -1          -1
-part
-npart
-cepext           mfc         mfc
-cepdir
-phsegext         phseg       phseg
-phsegdir
-outphsegdir
-sentdir
-sentext          sent        sent
-lsnfn                        ../doc/a.transcription
-accumdir                     .
-ceplen           13          13
-cepwin           0           0
-agc              max         none
-cmn              current     current
-varnorm          no          no
-silcomp          none        none
-sildel           no          no
-siltag           SIL         SIL
-abeam            1e-100      1.000000e-100
-bbeam            1e-100      1.000000e-100
-varreest         yes         yes
-meanreest        yes         yes
-mixwreest        yes         yes
-tmatreest        yes         yes
-mllrmat
-cb2mllrfn        .1cls.      .1cls.
-ts2cbfn                      .semi.
-feat             1s_c_d_dd   1s_c_d_dd
-svspec
-ldafn
-ldadim           29          29
-ldaaccum         no          no
-timing           yes         yes
-viterbi          no          no
-2passvar         no          no
-sildelfn
-spthresh         0.0         0.000000e+00
-maxuttlen        0           0
-ckptintv
-outputfullpath   no          no
-fullsuffixmatch  no          no
-pdumpdir
INFO: main.c(255): Reading adapt/mdef.txt
INFO: model_def_io.c(587): Model definition info:
INFO: model_def_io.c(588): 143097 total models defined (50 base, 143047 tri)
INFO: model_def_io.c(589): 572388 total states
INFO: model_def_io.c(590): 5150 total tied states
INFO: model_def_io.c(591): 150 total tied CI states
INFO: model_def_io.c(592): 50 total tied transition matrices
INFO: model_def_io.c(593): 4 max state/model
INFO: model_def_io.c(594): 4 min state/model
INFO: s3mixw_io.c(116): Read adapt/mixture_weights
FATAL_ERROR: "mod_inv.c", line 354: # of features in mixw file, 3, is inconsistent w/ prior setting, 1
Audio files are here
http://www.mediafire.com/file/oghy10t2jrt/gk1.tar.gz
Try to follow the feat.params of the model more precisely. In particular, I see
you are missing the -svspec 0-12/13-25/26-38 option, which makes bw think there
is one stream instead of 3.
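For anyone puzzled by the notation: with -feat 1s_c_d_dd and -ceplen 13, each frame is a single 39-dimensional vector (13 cepstra, 13 deltas, 13 delta-deltas), and the svspec string splits it into subvectors, one per stream. The sketch below only illustrates how the string reads. It is not SphinxTrain's actual parser, and parse_svspec is a made-up helper name.

```python
from typing import List


def parse_svspec(spec: str) -> List[List[int]]:
    """Expand a subvector spec such as "0-12/13-25/26-38" into index lists.

    Streams are separated by "/"; inside a stream, comma-separated items
    are either single indices or inclusive "lo-hi" ranges.
    """
    streams = []
    for stream_spec in spec.split("/"):
        dims = []
        for part in stream_spec.split(","):
            if "-" in part:
                lo, hi = part.split("-")
                dims.extend(range(int(lo), int(hi) + 1))
            else:
                dims.append(int(part))
        streams.append(dims)
    return streams


streams = parse_svspec("0-12/13-25/26-38")
print(len(streams))               # number of streams
print([len(s) for s in streams])  # dimensions per stream
```

This gives 3 streams of 13 dimensions each, matching the 3 that the mixture_weights file reports in the error. Without -svspec the whole 39-dimensional vector is treated as 1 stream, hence the mismatch.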
Thanks, that helped :)