
Adapting French acoustic model errors

2011-11-10
2012-09-22
  • Boris Mansencal

    Boris Mansencal - 2011-11-10

    I am following the tutorial here: http://cmusphinx.sourceforge.net/wiki/tutorialadapt
    to adapt the French acoustic model (provided here:
    http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/)
    for a particular speaker.

    I am using the latest sphinxbase/sphinxtrain/pocketsphinx from SVN (rev 11259).
    I have three files:
    - phrases_s_u.txt : the transcription file, with the utterances and their utt_ids
    - phrases_files.txt : the list of wav files, named as the utt_ids without the .wav extension. The wav files are 16-bit 16 kHz mono.
    - phrases.dic : a dictionary with only the words in the utterances

    I run:
    ./bw -hmmdir lium_french_f0 -moddeffn lium_french_f0/mdef.txt -ts2cbfn .semi.
    -feat 1s_c_d_dd -cmn current -agc max -dictfn phrases.dic -ctlfn
    phrases_files.txt -lsnfn phrases_s_u.txt -accumdir .

    I get the following output:

    INFO: main.c(194): Compiled on Nov 9 2011 at 14:42:35
    INFO: cmd_ln.c(691): Parsing command line:
    ./bw \
    -hmmdir lium_french_f0 \
    -moddeffn lium_french_f0/mdef.txt \
    -ts2cbfn .semi. \
    -feat 1s_c_d_dd \
    -cmn current \
    -agc max \
    -dictfn phrases.dic \
    -ctlfn phrases_files.txt \
    -lsnfn phrases_s_u.txt \
    -accumdir .

    Current configuration:

    -2passvar no no
    -abeam 1e-100 1.000000e-100
    -accumdir .
    -agc none max
    -agcthresh 2.0 2.000000e+00
    -bbeam 1e-100 1.000000e-100
    -cb2mllrfn .1cls. .1cls.
    -cepdir
    -cepext mfc mfc
    -ceplen 13 13
    -ckptintv 0
    -cmn current current
    -cmninit 8.0 8.0
    -ctlfn phrases_files.txt
    -diagfull no no
    -dictfn phrases.dic
    -example no no
    -fdictfn
    -feat 1s_c_d_dd 1s_c_d_dd
    -fullsuffixmatch no no
    -fullvar no no
    -help no no
    -hmmdir lium_french_f0
    -latdir
    -latext
    -lda
    -ldaaccum no no
    -ldadim 0 0
    -lsnfn phrases_s_u.txt
    -ltsoov no no
    -lw 11.5 1.150000e+01
    -maxuttlen 0 0
    -meanfn
    -meanreest yes yes
    -mixwfn
    -mixwreest yes yes
    -mllrmat
    -mmie no no
    -mmie_type rand rand
    -moddeffn lium_french_f0/mdef.txt
    -mwfloor 0.00001 1.000000e-05
    -npart 0
    -nskip 0
    -outphsegdir
    -outputfullpath no no
    -part 0
    -pdumpdir
    -phsegdir
    -phsegext phseg phseg
    -runlen -1 -1
    -sentdir
    -sentext sent sent
    -spthresh 0.0 0.000000e+00
    -svspec
    -timing yes yes
    -tmatfn
    -tmatreest yes yes
    -topn 4 4
    -tpfloor 0.0001 1.000000e-04
    -ts2cbfn .semi.
    -varfloor 0.00001 1.000000e-05
    -varfn
    -varnorm no no
    -varreest yes yes
    -viterbi no no

    INFO: feat.c(684): Initializing feature stream to type: '1s_c_d_dd',
    ceplen=13, CMN='current', VARNORM='no', AGC='max'
    INFO: cmn.c(142): mean= 12.00, mean= 0.0
    INFO: agc.c(132): AGCEMax: max= 5.00
    INFO: main.c(283): Reading lium_french_f0/mdef.txt
    INFO: model_def_io.c(573): Model definition info:
    INFO: model_def_io.c(574): 82134 total models defined (45 base, 82089 tri)
    INFO: model_def_io.c(575): 492804 total states
    INFO: model_def_io.c(576): 5725 total tied states
    INFO: model_def_io.c(577): 225 total tied CI states
    INFO: model_def_io.c(578): 45 total tied transition matrices
    INFO: model_def_io.c(579): 6 max state/model
    INFO: model_def_io.c(580): 6 min state/model
    INFO: s3mixw_io.c(116): Read lium_french_f0/mixture_weights
    INFO: s3tmat_io.c(115): Read lium_french_f0/transition_matrices
    INFO: mod_inv.c(301): inserting tprob floor 1.000000e-04 and renormalizing
    INFO: s3gau_io.c(166): Read lium_french_f0/means
    INFO: s3gau_io.c(166): Read lium_french_f0/variances
    INFO: gauden.c(183): 5725 total mgau
    INFO: gauden.c(157): 1 feature streams (|0|=39 )
    INFO: gauden.c(194): 22 total densities
    INFO: gauden.c(97): min_var=1.000000e-05
    INFO: gauden.c(172): compute 4 densities/frame
    INFO: main.c(395): Will reestimate mixing weights.
    INFO: main.c(397): Will reestimate means.
    INFO: main.c(399): Will reestimate variances.
    INFO: main.c(407): Will reestimate transition matrices
    INFO: main.c(420): Reading main lexicon: phrases.dic
    INFO: lexicon.c(218): 631 entries added from phrases.dic
    INFO: main.c(432): Reading filler lexicon: lium_french_f0/noisedict
    INFO: lexicon.c(218): 8 entries added from lium_french_f0/noisedict
    INFO: corpus.c(1078): Will process all remaining utts starting at 0
    INFO: main.c(639): Reestimation: Baum-Welch
    INFO: main.c(644): Generating profiling information consumes significant CPU
    resources.
    INFO: main.c(645): If you are not interested in profiling, use -timing no
    INFO: cmn.c(175): CMN: 5.28 -0.19 0.28 0.24 -0.04 -0.15 0.01 -0.11 -0.10 -0.05
    -0.09 -0.09 -0.13
    INFO: agc.c(123): AGCMax: obs=max= 9.20
    WARNING: "gauden.c", line 1342: Scaling factor too small: -1535020.927524
    ERROR: "backward.c", line 1019: alpha(5.548316e-02) <> sum of alphas * betas
    (0.000000e+00) in frame 278
    ERROR: "baum_welch.c", line 333: actor1_001 ignored

    # of codebooks in mean/var files, 5725, inconsistent with ts2cb mapping 1

    column defns
    <seq>
    <id>
    <n_frame_in>
    <n_frame_del>
    <n_state_shmm>
    <avg_states_alpha>
    <avg_states_beta>
    <avg_states_reest>
    <avg_posterior_prune>
    <frame_log_lik>
    <utt_log_lik>
    ... timing info ...
    utt> 0 actor1_001 280 0 54 51 utt 0.001x 1.194e upd 0.001x 1.109e fwd 0.001x
    1.047e bwd 0.000x 0.000e gau 0.000x 0.000e rsts 0.000x 0.000e rstf 0.000x
    0.000e rstu 0.000x 0.000e
    INFO: cmn.c(175): CMN: 5.12 -0.24 0.15 0.19 -0.03 -0.14 -0.10 -0.06 -0.13
    -0.01 -0.07 -0.09 -0.13
    INFO: agc.c(123): AGCMax: obs=max= 9.36
    WARNING: "gauden.c", line 1342: Scaling factor too small: -1813250.056768
    ERROR: "backward.c", line 1019: alpha(1.046842e-01) <> sum of alphas * betas
    (0.000000e+00) in frame 332
    ERROR: "baum_welch.c", line 333: actor1_002 ignored
    utt> 1 actor1_002 334 0 90 83 utt 0.001x 1.993e upd 0.001x 1.285e fwd 0.001x
    1.239e bwd 0.000x 0.000e gau 0.000x 0.000e rsts 0.000x 0.000e rstf 0.000x
    0.000e rstu 0.000x 0.000e
    INFO: cmn.c(175): CMN: 4.88 0.12 0.25 0.13 -0.02 -0.22 0.02 -0.13 -0.12 -0.01
    -0.12 -0.11 -0.12
    INFO: agc.c(123): AGCMax: obs=max= 8.96
    WARNING: "gauden.c", line 1342: Scaling factor too small: -1441763.240420
    ERROR: "backward.c", line 1019: alpha(6.752134e-02) <> sum of alphas * betas
    (0.000000e+00) in frame 258
    ERROR: "baum_welch.c", line 333: actor1_003 ignored

    utt> 32 actor1_033 1487 0 510 464 utt 0.007x 1.005e upd 0.007x 0.997e fwd
    0.007x 1.007e bwd 0.000x 0.146e gau 0.000x 0.728e rsts 0.000x 0.000e rstf
    0.000x 0.000e rstu 0.000x 0.000e
    INFO: cmn.c(175): CMN: 9.18 -0.14 0.37 0.24 -0.31 -0.23 -0.04 -0.26 -0.17
    -0.12 -0.14 -0.15 -0.20
    INFO: agc.c(123): AGCMax: obs=max= 6.47
    WARNING: "gauden.c", line 1342: Scaling factor too small: -2722683.598133
    ERROR: "backward.c", line 1019: alpha(1.191249e-01) <> sum of alphas * betas
    (0.000000e+00) in frame 855
    ERROR: "baum_welch.c", line 333: actor1_034 ignored
    utt> 33 actor1_034 857 0 372 330 utt 0.005x 1.183e upd 0.005x 0.991e fwd
    0.005x 0.980e bwd 0.000x 0.951e gau 0.000x 0.660e rsts 0.000x 0.000e rstf
    0.000x 0.000e rstu 0.000x 0.000e
    overall> sparrow 0 (-0) 0.000000e+00 0.000000e+00 0.000x 1.051e
    WARNING: "accum.c", line 618: Over 500 senones never occur in the input data.
    This is normal for context-dependent untied senone training or for adaptation,
    but could indicate a serious problem otherwise.
    INFO: s3mixw_io.c(232): Wrote ./mixw_counts
    INFO: s3tmat_io.c(174): Wrote ./tmat_counts
    INFO: s3gau_io.c(478): Wrote ./gauden_counts with means with vars
    INFO: main.c(1040): Counts saved to .

    I have an error for every file.
    Do you have an idea what causes these errors?

    After the first error, I get the message:
    "# of codebooks in mean/var files, 5725, inconsistent with ts2cb mapping 1"
    What does it mean?

    Is my command line correct?
    In particular, I am not sure about:
    -ts2cbfn .semi. : I have no .semi. file in the current directory or in the lium_french_f0 directory
    -svspec 0-12/13-25/26-38 : it is specified in the example, but not in lium_french_f0/feat.params, so I dropped it.

    Thanks,

    Boris.

     
  • Nickolay V. Shmyrev

    Is my command line correct ?

    Most likely not. Once it is correct, everything will work.

    In particular, I am not sure of :
    -ts2cbfn .semi. : I have no .semi. file in the current directory or in the lium_french_f0 directory

    The French model is continuous, so it should be .cont.
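    With that change, the bw invocation from the first post becomes the
    following sketch (same files and options as above, only -ts2cbfn differs):

```shell
./bw -hmmdir lium_french_f0 -moddeffn lium_french_f0/mdef.txt -ts2cbfn .cont. \
     -feat 1s_c_d_dd -cmn current -agc max -dictfn phrases.dic \
     -ctlfn phrases_files.txt -lsnfn phrases_s_u.txt -accumdir .
```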

    -svspec 0-12/13-25/26-38 : it is specified in the example, but not in
    lium_french_f0/feat.params so I dropped it.

    Correct, there is no svspec there.

     
  • Boris Mansencal

    Boris Mansencal - 2011-11-14

    French model is continuous, it should be .cont.

    Where can I see that?

    It is OK now, I have no more errors. Thanks.

    I then created both MLLR and MAP adaptations, with the following
    commands:

    ./mllr_solve -meanfn lium_french_f0/means -varfn lium_french_f0/variances
    -outmllrfn mllr_matrix -accumdir .

    ./map_adapt -meanfn lium_french_f0/means -varfn lium_french_f0/variances
    -mixwfn lium_french_f0/mixture_weights -tmatfn
    lium_french_f0/transition_matrices -accumdir . -mapmeanfn
    lium_french_f0_adapt/means -mapvarfn lium_french_f0_adapt/variances -mapmixwfn
    lium_french_f0_adapt/mixture_weights -maptmatfn
    lium_french_f0_adapt/transition_matrices

    When I use -mllr mllr_matrix, my recognition results improve (on a test set
    containing the adaptation set).
    But when I use my MAP-adapted acoustic model, the results are worse than with
    the original model.
    If I combine -mllr with my adapted acoustic model, the results are better than
    with the original model, but still worse than with -mllr alone.
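    For reference, applying the transform at decode time looks roughly like
    this (french.lm.dmp is just a placeholder for whatever language model is
    in use, and the -mllr option is assumed to be available in this
    pocketsphinx build):

```shell
pocketsphinx_continuous -hmm lium_french_f0 -lm french.lm.dmp \
    -dict phrases.dic -mllr mllr_matrix
```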

    Do you have any idea why this is happening?

    Boris.

     
  • Nickolay V. Shmyrev

    Where can I see that ?

    It's written in the model README.

    Do you have any idea why it is happening ?

    MAP has a smoothing parameter tau which needs to be optimized. If you want to
    understand the process, you might want to read more about MLLR and MAP and
    about model adaptation:

    http://nsh.nexiwave.com/2009/09/adaptation-methods.html

     
  • Boris Mansencal

    Boris Mansencal - 2011-11-15

    I have read your blog post and the linked PDF.
    In the PDF, the method to combine MAP and MLLR is as follows:
    " 1) Compute an MLLR transformation with bw and mllr_solve
    2) Apply it to the baseline means with mllr_transform
    3) Re-run bw with the transformed means
    4) Run map_adapt to produce a MAP re-estimation "

    1) I think this is exactly what is described in the tutorial.
    I have done:
    ./bw -hmmdir lium_french_f0 -moddeffn lium_french_f0/mdef.txt -ts2cbfn .cont.
    -feat 1s_c_d_dd -cmn current -agc max -dictfn phrases.dic -ctlfn
    phrases_files.txt -lsnfn phrases_s_u.txt -accumdir .

    ./mllr_solve -meanfn lium_french_f0/means -varfn lium_french_f0/variances
    -outmllrfn mllr_matrix -accumdir .

    2) I suppose this is something like:
    cp -a lium_french_f0 lium_french_f0_adaptB
    ./mllr_transform -ingaucntfn ??? -inmeanfn lium_french_f0/means -mllrmat
    mllr_matrix -moddeffn lium_french_f0/mdef.txt -outgaucntfn ??? -outmeanfn
    lium_french_f0_adaptB/means

    What is the "Input Gaussian accumulation count file name"?
    Is it one of the files produced by bw: mixw_counts, tmat_counts, or
    gauden_counts?

    3) Is it enough to do:
    ./bw -hmmdir lium_french_f0_adaptB -moddeffn lium_french_f0_adaptB/mdef.txt
    -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc max -dictfn phrases.dic
    -ctlfn phrases_files.txt -lsnfn phrases_s_u.txt -accumdir .
    that is, just replacing the model directory with the adapted one?

    I see that there is a -mllrmat option to bw. Is it useful somewhere ?

    4) Is map_adapt still broken (as reported in your blog post of 09/2009)?
    Should I run:
    ./map_adapt -meanfn lium_french_f0/means -varfn lium_french_f0/variances
    -mixwfn lium_french_f0/mixture_weights -tmatfn
    lium_french_f0/transition_matrices -accumdir . -mapmeanfn
    lium_french_f0_adaptB/means -mapvarfn lium_french_f0_adaptB/variances
    -mapmixwfn lium_french_f0_adaptB/mixture_weights -maptmatfn
    lium_french_f0_adaptB/transition_matrices

    or the same with -fixedtau 100 (or a greater value)?

    Is there a way to have a clue about which value of tau to use ?

    Thanks again,

    Boris.

     
  • Boris Mansencal

    Boris Mansencal - 2011-11-17

    Any help on this?
    In particular, on the -ingaucntfn option of mllr_transform.

     
  • Nickolay V. Shmyrev

    What is the "Input Gaussian accumulation count file name"?
    Is it one of the files produced by bw: mixw_counts, tmat_counts, or
    gauden_counts?

    It is gauden_counts, but it's not needed here. mllr_transform can transform
    either a model or gauden_counts. You only need to transform the means, with
    -inmeanfn and -outmeanfn.
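    So step 2 reduces to something like this (copying the model directory
    first, as in your sketch):

```shell
cp -a lium_french_f0 lium_french_f0_adaptB
./mllr_transform -inmeanfn lium_french_f0/means \
    -outmeanfn lium_french_f0_adaptB/means \
    -mllrmat mllr_matrix
```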

    3) Is it enough to do :
    ./bw -hmmdir lium_french_f0_adaptB -moddeffn lium_french_f0_adaptB/mdef.txt
    -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc max -dictfn phrases.dic
    -ctlfn phrases_files.txt -lsnfn phrases_s_u.txt -accumdir .
    that is just replacing the directory of the model by the adapted one ?

    Not sure what you mean by "enough" but this step is needed.

    I see that there is a -mllrmat option to bw. Is it useful somewhere ?

    It's for different purposes.

    Is map_adapt still broken (as reported in your blog post in 09/2009) ?

    It's not broken; it requires a manually selected tau.

    Is there a way to have a clue about which value of tau to use ?

    You can try different values and find the best one.
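    For example, a sweep over a few tau values, each writing a MAP model into
    its own directory (the tau values and directory naming here are only an
    illustration; copy the remaining model files such as mdef.txt, feat.params
    and noisedict into each directory before decoding with it):

```shell
for tau in 10 50 100 200 500; do
    outdir=lium_french_f0_map_tau$tau
    mkdir -p $outdir
    ./map_adapt -meanfn lium_french_f0/means -varfn lium_french_f0/variances \
        -mixwfn lium_french_f0/mixture_weights \
        -tmatfn lium_french_f0/transition_matrices \
        -accumdir . -fixedtau yes -tau $tau \
        -mapmeanfn $outdir/means -mapvarfn $outdir/variances \
        -mapmixwfn $outdir/mixture_weights \
        -maptmatfn $outdir/transition_matrices
done
```

    Then decode a held-out test set with each model and keep the tau that
    gives the lowest error rate.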

     
  • Boris Mansencal

    Boris Mansencal - 2011-11-18

    To manually select tau for map_adapt, should I use -fixedtau or -tau?

     
  • Nickolay V. Shmyrev

    Hello Boris

    To manually select tau for map_adapt, should I use -fixedtau or -tau?

    Both, in fact:

    -fixedtau yes -tau 100
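    Putting the four steps of the recipe together, the whole sequence is
    roughly the following (lium_french_f0_map is an illustrative output name,
    and tau 100 is only a starting point to tune from):

```shell
# 1) Accumulate statistics and solve for an MLLR transform
./bw -hmmdir lium_french_f0 -moddeffn lium_french_f0/mdef.txt -ts2cbfn .cont. \
    -feat 1s_c_d_dd -cmn current -agc max -dictfn phrases.dic \
    -ctlfn phrases_files.txt -lsnfn phrases_s_u.txt -accumdir .
./mllr_solve -meanfn lium_french_f0/means -varfn lium_french_f0/variances \
    -outmllrfn mllr_matrix -accumdir .

# 2) Apply the transform to the baseline means
cp -a lium_french_f0 lium_french_f0_adaptB
./mllr_transform -inmeanfn lium_french_f0/means \
    -outmeanfn lium_french_f0_adaptB/means -mllrmat mllr_matrix

# 3) Re-run bw against the transformed model
#    (clear the counts accumulated in step 1 first)
rm -f mixw_counts tmat_counts gauden_counts
./bw -hmmdir lium_french_f0_adaptB -moddeffn lium_french_f0_adaptB/mdef.txt \
    -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc max -dictfn phrases.dic \
    -ctlfn phrases_files.txt -lsnfn phrases_s_u.txt -accumdir .

# 4) MAP re-estimation with a manually selected tau
mkdir -p lium_french_f0_map
./map_adapt -meanfn lium_french_f0_adaptB/means \
    -varfn lium_french_f0_adaptB/variances \
    -mixwfn lium_french_f0_adaptB/mixture_weights \
    -tmatfn lium_french_f0_adaptB/transition_matrices \
    -accumdir . -fixedtau yes -tau 100 \
    -mapmeanfn lium_french_f0_map/means -mapvarfn lium_french_f0_map/variances \
    -mapmixwfn lium_french_f0_map/mixture_weights \
    -maptmatfn lium_french_f0_map/transition_matrices
```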
    
     
  • Osmoz

    Osmoz - 2011-12-07

    Hello,

    Do you have a French model working with PocketSphinx?

    I tried both lium_french_f0 and lium_french_f2, with no result.

    When I speak, no word is recognized...

    Did you try it with FreeSWITCH?

     
