Inconsistent number of gaussian with inc_comp

2005-03-09
2012-09-22
  • Tan Tien Ping

    Tan Tien Ping - 2005-03-09

    Hi,
    I am trying to train a 16-Gaussian context-dependent acoustic model. If I'm not mistaken, I can split the Gaussians in the order 1->2->4->8->16 using the inc_comp command by setting the -ninc parameter. However, when I reached 4 Gaussians and tried to split to 8, it produced only 6 densities, as reported when I ran the subsequent bw command. I checked with the printp command, which also showed 6 densities. I then tried to create 8 Gaussians directly from 1 Gaussian, and that produced only 2 Gaussians. Is there anything wrong with my understanding? Please advise. Below are some of the logs produced. Thank you very much.

    1. means, variances and mixture weights with 1 Gaussian initially
      ./inc_comp -ninc 8 -ceplen 13 -dcountfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_mix_weight_6 -inmixwfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_mix_weight_6 -outmixwfn mixture -inmeanfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_mean_6 -outmeanfn mean -invarfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_variance_6 -outvarfn variance -feat 1s_c_d_dd

    2. [tan@homere bin.i686-pc-linux-gnu]$ ./bw -moddeffn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied.mdef -ts2cbfn .cont. -mixwfn mixture -tmatfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_trans_matrix_6 -meanfn mean -varfn variance -dictfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.dict -fdictfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.filler -ctlfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.ctl -part 1 -npart 1 -cepdir /home/tan/sphinx3/BRAF100/CORPUS/Apprentissage/cepstra -cepext mfc -lsnfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.train.transcription -accumdir /home/tan/sphinx3/SphinxTrain/data/braf100_con -feat 1s_c_d_dd -ceplen 13 -topn 2 -meanreest yes -varreest yes -2passvar yes -tmatreest yes -abeam 1e-90 -bbeam 1e-40
      INFO: main.c(167): Compiled on Feb 14 2005 at 15:43:09
      ./bw \
      -moddeffn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied.mdef \
      -ts2cbfn .cont. \
      -mixwfn mixture \
      -tmatfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_trans_matrix_6 \
      -meanfn mean \
      -varfn variance \
      -dictfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.dict \
      -fdictfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.filler \
      -ctlfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.ctl \
      -part 1 \
      -npart 1 \
      -cepdir /home/tan/sphinx3/BRAF100/CORPUS/Apprentissage/cepstra \
      -cepext mfc \
      -lsnfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.train.transcription \
      -accumdir /home/tan/sphinx3/SphinxTrain/data/braf100_con \
      -feat 1s_c_d_dd \
      -ceplen 13 \
      -topn 2 \
      -meanreest yes \
      -varreest yes -2passvar yes \
      -tmatreest yes \
      -abeam 1e-90 \
      -bbeam 1e-40

    [Switch]    [Default]  [Value]
    -help       no         no
    -example    no         no
    -moddeffn              /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied.mdef
    -tmatfn                /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_trans_matrix_6
    -mixwfn                mixture
    -meanfn                mean
    -varfn                 variance
    -mwfloor    0.00001    1.000000e-05
    -tpfloor    0.0001     1.000000e-04
    -varfloor   0.00001    1.000000e-05
    -topn       4          2
    -dictfn                /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.dict
    -fdictfn               /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.filler
    -ctlfn                 /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.ctl
    -nskip
    -runlen     -1         -1
    -part                  1
    -npart                 1
    -cepext     mfc        mfc
    -cepdir                /home/tan/sphinx3/BRAF100/CORPUS/Apprentissage/cepstra
    -segext     v8_seg     v8_seg
    -segdir
    -sentdir
    -sentext    sent       sent
    -lsnfn                 /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.train.transcription
    -accumdir              /home/tan/sphinx3/SphinxTrain/data/braf100_con
    -ceplen     13         13
    -agc        max        max
    -cmn        current    current
    -varnorm    no         no
    -silcomp    none       none
    -sildel     no         no
    -siltag     SIL        SIL
    -abeam      1e-100     1.000000e-90
    -bbeam      1e-100     1.000000e-40
    -varreest   yes        yes
    -meanreest  yes        yes
    -mixwreest  yes        yes
    -tmatreest  yes        yes
    -spkrxfrm
    -mllrmult   no         no
    -mllradd    no         no
    -ts2cbfn               .cont.
    -feat                  1s_c_d_dd
    -timing     yes        yes
    -viterbi    no         no
    -2passvar   no         yes
    -sildelfn
    -cb2mllrfn
    -spthresh   0.0        0.000000e+00
    -maxuttlen  0          0
    -ckptintv
    INFO: main.c(184): Reading /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied.mdef
    INFO: model_def_io.c(587): Model definition info:
    INFO: model_def_io.c(588): 61652 total models defined (39 base, 61613 tri)
    INFO: model_def_io.c(589): 246608 total states
    INFO: model_def_io.c(590): 8117 total tied states
    INFO: model_def_io.c(591): 117 total tied CI states
    INFO: model_def_io.c(592): 39 total tied transition matrices
    INFO: model_def_io.c(593): 4 max state/model
    INFO: model_def_io.c(594): 4 min state/model
    INFO: s3mixw_io.c(116): Read mixture [8117x1x2 array]
    WARNING: "mod_inv.c", line 352: Model inventory n_density not set; setting to value in mixw file, 2.
    INFO: mod_inv.c(387): Norm failed for 2 mixw: 3451 6314
    INFO: s3tmat_io.c(115): Read /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_trans_matrix_6 [39x3x4 array]
    INFO: mod_inv.c(280): inserting tprob floor 1.000000e-04 and renormalizing
    INFO: s3gau_io.c(162): Read mean [8117x1x2 array]
    INFO: s3gau_io.c(162): Read variance [8117x1x2 array]
    INFO: gauden.c(167): 8117 total mgau
    INFO: gauden.c(139): 1 feature streams (|0|=39 )
    INFO: gauden.c(178): 2 total densities
    INFO: gauden.c(86): min_var=1.000000e-05
    INFO: gauden.c(156): compute 2 densities/frame
    INFO: main.c(281): Will reestimate mixing weights.
    ...

    TP
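
The density count that bw reports can also be read straight from its log. The following is a minimal Python sketch, not part of SphinxTrain, that pulls the array shape out of the s3mixw_io.c line shown above. The function name mixture_shape is hypothetical, and the three dimensions are read here as tied states x feature streams x densities, an interpretation that matches the other counts in the same log (8117 tied states, 1 feature stream, 2 densities).

```python
import re

# The token after "Read" is the file name given to -mixwfn ("mixture" in this run).
SHAPE = re.compile(r"s3mixw_io\.c\(\d+\): Read \S+ \[(\d+)x(\d+)x(\d+) array\]")


def mixture_shape(log_text):
    """Return (tied_states, streams, densities) from a bw log, or None if absent."""
    match = SHAPE.search(log_text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


# Two lines copied from the log in the post above.
log_excerpt = """\
INFO: s3mixw_io.c(116): Read mixture [8117x1x2 array]
INFO: s3gau_io.c(162): Read mean [8117x1x2 array]
"""

shape = mixture_shape(log_excerpt)
if shape is not None:
    tied_states, streams, densities = shape
    print(f"{densities} densities per tied state "
          f"({tied_states} tied states, {streams} stream)")
```

Running this on the log after each inc_comp pass gives a quick check that the number of densities is what the split was supposed to produce.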

     
    • Anonymous

      Anonymous - 2005-03-09

      It is true that in order to produce a model using N gaussians, you must first produce a 1-gaussian model, then split that 1 into 2 and retrain, then using the 2-gaussian model, split each of the 2, yielding 4, etc. The program "inc_comp" is used to do the splitting. Its "-ninc" parameter specifies the number of gaussians that you wish to split; its value must be less than or equal to the number in the current model. Therefore if you have a 1-gaussian model, -ninc should be 1 in order to split the 1 gaussian into 2. After you have trained a 2-gaussian model, -ninc should be 2 in order to split each gaussian, yielding 4, etc.

      You have described 2 experiments. In the first, you had trained a 4-gaussian model and tried to split into 8. That should have worked (-ninc should have been 4, right?), so if it yielded only 6, that suggests that -ninc must have had the value 2. I suggest that you check that.

      In the second experiment, you said that you started with a 1-gaussian model and attempted to split to 8 in just one step. That doesn't work. As you said at the top of your posting, you must do it in the order 1->2->4->8. The -ninc parameter must be different each time: 1, 2, 4, ...

      cheers,
      jerry
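
The doubling schedule Jerry describes can be written down as a short Python sketch. It is only bookkeeping based on his description (each pass splits -ninc of the current Gaussians, and -ninc may not exceed the current count); it is not how inc_comp is implemented, and both function names are hypothetical.

```python
def ninc_schedule(target):
    """Return the -ninc value for each inc_comp pass when doubling from 1 to target."""
    if target < 1 or target & (target - 1):
        raise ValueError("target must be a power of two")
    schedule = []
    current = 1
    while current < target:
        schedule.append(current)  # split every one of the current Gaussians
        current *= 2
    return schedule


def densities_after_split(current, ninc):
    """Each of the ninc selected Gaussians is split in two, adding ninc densities."""
    if not 1 <= ninc <= current:
        raise ValueError("-ninc must be between 1 and the current number of Gaussians")
    return current + ninc


for ninc in ninc_schedule(16):
    after = densities_after_split(ninc, ninc)
    print(f"inc_comp -ninc {ninc}: {ninc} -> {after} Gaussians, then retrain with bw")

# Jerry's guess for the 4-Gaussian model that ended up with 6 densities:
print(densities_after_split(4, 2))
```

Under this bookkeeping, going from 4 to 6 densities is what splitting only 2 of the 4 Gaussians would give, which is why Jerry suspects the effective -ninc was 2.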

       
    • Tan Tien Ping

      Tan Tien Ping - 2005-03-10

      Hi,
      Thanks for the reply. I redid the training and, on reaching 4 Gaussians, set -ninc to 4. However, when I checked the resulting mixtures, they still contained only 6 densities. I wonder what is wrong. The command I used is as follows:

      ./inc_comp -ninc 4 -ceplen 13 -dcountfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.2gau.cd_tied_mix_weight_6 -inmixwfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.4gau.cd_tied_mix_weight_6 -outmixwfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_init_weight_8gau -inmeanfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.4gau.cd_tied_mean_6 -outmeanfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_init_mean_8gau -invarfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.4gau.cd_tied_variance_6 -outvarfn /home/tan/sphinx3/SphinxTrain/data/braf100_con/braf100.cd_tied_init_variance_8gau -feat 1s_c_d_dd

      It did produce the relevant files, but with some warnings, shown below. I'm not sure whether the warnings are important. If they are, what might I have done wrong?

      8116:1(2.22e+02)0(2.04e+02)WARNING: "inc_densities.c", line 143: (mgau= 8116, feat= 0, density= 2) never observed skipping
      WARNING: "inc_densities.c", line 143: (mgau= 8116, feat= 0, density= 2) never observed skipping

      Does anyone have any idea? Thank you in advance.

      TP

       
    • Anonymous

      Anonymous - 2005-03-10

      TP -- It looks to me as if your -dcountfn value is wrong -- it refers to the 2-gaussian file.

      cheers,
      jerry
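
Jerry's diagnosis can be turned into a quick check before running inc_comp. The following is a minimal Python sketch, assuming (as in the first command of this thread) that -dcountfn is meant to name the same current-model mixture weight file as -inmixwfn. The helper names are hypothetical, and the example command lines keep only the relevant flags, with directory prefixes dropped.

```python
import shlex


def parse_flags(command):
    """Map each flag to its value, assuming strict '-flag value' pairs after the program name."""
    tokens = shlex.split(command)[1:]
    return dict(zip(tokens[0::2], tokens[1::2]))


def dcount_matches_input(command):
    """True when -dcountfn is present and names the same file as -inmixwfn."""
    flags = parse_flags(command)
    dcount = flags.get("-dcountfn")
    return dcount is not None and dcount == flags.get("-inmixwfn")


# Shortened versions of the command from the post above (bad) and its fix (good).
bad = ("./inc_comp -ninc 4 "
       "-dcountfn braf100.2gau.cd_tied_mix_weight_6 "
       "-inmixwfn braf100.4gau.cd_tied_mix_weight_6")
good = ("./inc_comp -ninc 4 "
        "-dcountfn braf100.4gau.cd_tied_mix_weight_6 "
        "-inmixwfn braf100.4gau.cd_tied_mix_weight_6")

for command in (bad, good):
    status = "ok" if dcount_matches_input(command) else "mismatch: check -dcountfn"
    print(status)
```

With counts taken from the 2-Gaussian file, densities 2 and 3 of the 4-Gaussian model have no counts, which is consistent with the "never observed skipping" warning and with ending up at 6 densities instead of 8.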

       
    • Tan Tien Ping

      Tan Tien Ping - 2005-03-10

      hi,
      Thanks again, Jerry. How could I make such a silly mistake :( !

      TP

       
