ERROR adapting model zh_broadcastnews_ptm256

Help
yaksea
2011-12-06
2012-09-22
  • yaksea

    yaksea - 2011-12-06

    I'm trying to adapt the zh_broadcastnews_ptm256_8000 model. Everything goes well until
    I get to the step that collects statistics using bw, where I get the error
    "FATAL_ERROR: "mod_inv.c", line 357: Number of feature streams in
    mixture_weights file 4 differs from the configured value 3, check the command
    line options".

    The command I typed:

    ./bw \
     -hmmdir zh_broadcastnews_ptm256_8000 \
     -moddeffn zh_broadcastnews_ptm256_8000/mdef.txt \
     -ts2cbfn .semi. \
     -feat 1s_c_d_dd \
     -svspec 0-12/13-25/26-38 \
     -cmn current \
     -agc none \
     -dictfn kw.dic \
     -ctlfn kw.fileids \
     -lsnfn kw.transcription \
     -accumdir .
    

    And the entire command-line output:

    INFO: cmd_ln.c(691): Parsing command line:
    sphinx_fe \
        -argfile zh_broadcastnews_ptm256_8000/feat.params \
        -samprate 16000 \
        -nfft 1024 \
        -c kw.fileids \
        -di . \
        -do . \
        -ei wav \
        -eo mfc \
        -mswav yes
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -alpha      0.97        9.700000e-01
    -argfile            zh_broadcastnews_ptm256_8000/feat.params
    -blocksize  2048        2048
    -build_outdirs  yes     yes
    -c              kw.fileids
    -cep2spec   no      no
    -di             .
    -dither     no      no
    -do             .
    -doublebw   no      no
    -ei             wav
    -eo             mfc
    -example    no      no
    -frate      100     100
    -help       no      no
    -i              
    -input_endian   little      little
    -lifter     0       0
    -logspec    no      no
    -lowerf     133.33334   1.333333e+02
    -mach_endian    little      little
    -mswav      no      yes
    -ncep       13      13
    -nchans     1       1
    -nfft       512     1024
    -nfilt      40      40
    -nist       no      no
    -npart      0       0
    -nskip      0       0
    -o              
    -ofmt       sphinx      sphinx
    -part       0       0
    -raw        no      no
    -remove_dc  no      no
    -round_filters  yes     yes
    -runlen     -1      -1
    -samprate   16000       1.600000e+04
    -seed       -1      -1
    -smoothspec no      no
    -spec2cep   no      no
    -sph2pipe   no      no
    -transform  legacy      legacy
    -unit_area  yes     yes
    -upperf     6855.4976   6.855498e+03
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -whichchan  0       0
    -wlen       0.025625    2.562500e-02
    
    INFO: cmd_ln.c(691): Parsing command line:
    \
        -alpha 0.97 \
        -doublebw no \
        -nfilt 40 \
        -ncep 13 \
        -lowerf 133.33334 \
        -upperf 6855.4976 \
        -nfft 1024 \
        -wlen 0.0256 \
        -transform legacy \
        -feat s2_4x \
        -agc none \
        -cmn current \
        -varnorm no
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -alpha      0.97        9.700000e-01
    -argfile            zh_broadcastnews_ptm256_8000/feat.params
    -blocksize  2048        2048
    -build_outdirs  yes     yes
    -c              kw.fileids
    -cep2spec   no      no
    -di             .
    -dither     no      no
    -do             .
    -doublebw   no      no
    -ei             wav
    -eo             mfc
    -example    no      no
    -frate      100     100
    -help       no      no
    -i              
    -input_endian   little      little
    -lifter     0       0
    -logspec    no      no
    -lowerf     133.33334   1.333333e+02
    -mach_endian    little      little
    -mswav      no      yes
    -ncep       13      13
    -nchans     1       1
    -nfft       512     1024
    -nfilt      40      40
    -nist       no      no
    -npart      0       0
    -nskip      0       0
    -o              
    -ofmt       sphinx      sphinx
    -part       0       0
    -raw        no      no
    -remove_dc  no      no
    -round_filters  yes     yes
    -runlen     -1      -1
    -samprate   16000       1.600000e+04
    -seed       -1      -1
    -smoothspec no      no
    -spec2cep   no      no
    -sph2pipe   no      no
    -transform  legacy      legacy
    -unit_area  yes     yes
    -upperf     6855.4976   6.855498e+03
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -whichchan  0       0
    -wlen       0.025625    2.560000e-02
    
    INFO: sphinx_fe.c(1016): Processing all remaining utterances at position 0
    INFO: mdef.c(520): Reading model definition: zh_broadcastnews_ptm256_8000/mdef
    INFO: bin_mdef.c(173): Allocating 68760 * 8 bytes (537 KiB) for CD tree
    INFO: main.c(194): Compiled on Dec  6 2011 at 19:23:25
    INFO: cmd_ln.c(691): Parsing command line:
    /root/work/sphinx/sphinxtrain/bin.i686-pc-linux-gnu/bw \
        -hmmdir zh_broadcastnews_ptm256_8000 \
        -moddeffn zh_broadcastnews_ptm256_8000/mdef.txt \
        -ts2cbfn .semi. \
        -feat 1s_c_d_dd \
        -svspec 0-12/13-25/26-38 \
        -cmn current \
        -agc none \
        -dictfn kw.dic \
        -ctlfn kw.fileids \
        -lsnfn kw.transcription \
        -accumdir .
    
    Current configuration:
    [NAME]          [DEFLT]     [VALUE]
    -2passvar       no      no
    -abeam          1e-100      1.000000e-100
    -accumdir               .
    -agc            none        none
    -agcthresh      2.0     2.000000e+00
    -bbeam          1e-100      1.000000e-100
    -cb2mllrfn      .1cls.      .1cls.
    -cepdir                 
    -cepext         .mfc        .mfc
    -ceplen         13      13
    -ckptintv               0
    -cmn            current     current
    -cmninit        8.0     8.0
    -ctlfn                  kw.fileids
    -diagfull       no      no
    -dictfn                 kw.dic
    -example        no      no
    -fdictfn                
    -feat           1s_c_d_dd   1s_c_d_dd
    -fullsuffixmatch    no      no
    -fullvar        no      no
    -help           no      no
    -hmmdir                 zh_broadcastnews_ptm256_8000
    -latdir                 
    -latext                 
    -lda                    
    -ldaaccum       no      no
    -ldadim         0       0
    -lsnfn                  kw.transcription
    -ltsoov         no      no
    -lw         11.5        1.150000e+01
    -maxuttlen      0       0
    -meanfn                 
    -meanreest      yes     yes
    -mixwfn                 
    -mixwreest      yes     yes
    -mllrmat                
    -mmie           no      no
    -mmie_type      rand        rand
    -moddeffn               zh_broadcastnews_ptm256_8000/mdef.txt
    -mwfloor        0.00001     1.000000e-05
    -npart                  0
    -nskip                  0
    -outphsegdir                
    -outputfullpath     no      no
    -part                   0
    -pdumpdir               
    -phsegdir               
    -phsegext       phseg       phseg
    -runlen         -1      -1
    -sentdir                
    -sentext        sent        sent
    -spthresh       0.0     0.000000e+00
    -svspec                 0-12/13-25/26-38
    -timing         yes     yes
    -tmatfn                 
    -tmatreest      yes     yes
    -topn           4       4
    -tpfloor        0.0001      1.000000e-04
    -ts2cbfn                .semi.
    -varfloor       0.00001     1.000000e-05
    -varfn                  
    -varnorm        no      no
    -varreest       yes     yes
    -viterbi        no      no
    
    INFO: feat.c(684): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: main.c(219): Using subvector specification 0-12/13-25/26-38
    INFO: main.c(283): Reading zh_broadcastnews_ptm256_8000/mdef.txt
    INFO: model_def_io.c(587): Model definition info:
    INFO: model_def_io.c(588): 65091 total models defined (70 base, 65021 tri)
    INFO: model_def_io.c(589): 260364 total states
    INFO: model_def_io.c(590): 8210 total tied states
    INFO: model_def_io.c(591): 210 total tied CI states
    INFO: model_def_io.c(592): 70 total tied transition matrices
    INFO: model_def_io.c(593): 4 max state/model
    INFO: model_def_io.c(594): 4 min state/model
    INFO: s3mixw_io.c(116): Read zh_broadcastnews_ptm256_8000/mixture_weights [8210x4x256 array]
    FATAL_ERROR: "mod_inv.c", line 357: Number of feature streams in mixture_weights file 4 differs from the configured value 3, check the command line options
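
The two numbers in the fatal error can be read straight from this output. The sketch below is a hypothetical Python helper, not part of SphinxTrain. It assumes the middle dimension of the `[8210x4x256 array]` shape is the stream count (which is what the error message reports) and that each `/`-separated subvector in `-svspec` counts as one stream (which matches the "configured value 3").

```python
import re

def streams_in_mixw(log_line):
    """Return the middle dimension of a '[AxBxC array]' shape in a log line.

    Assumption: the shape is senones x streams x densities.
    """
    match = re.search(r"\[(\d+)x(\d+)x(\d+) array\]", log_line)
    if match is None:
        raise ValueError("no array shape found in log line")
    return int(match.group(2))

def streams_in_svspec(svspec):
    """Count the '/'-separated subvectors in a -svspec string."""
    return len([part for part in svspec.split("/") if part])

log_line = ("INFO: s3mixw_io.c(116): Read zh_broadcastnews_ptm256_8000/"
            "mixture_weights [8210x4x256 array]")

print(streams_in_mixw(log_line))               # 4
print(streams_in_svspec("0-12/13-25/26-38"))   # 3
```

The model was built with four streams, while the command line describes three, so the feature options passed to bw do not match the model.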
    
     
  • Nickolay V. Shmyrev

    Tutorial says:

    For different models, make sure that arguments to bw match the arguments
    from feat.params. For example, for most continuous models (like the ones used
    by Sphinx4) you don't need to include the svspec option.
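
One way to follow this advice without retyping options is to read them from feat.params. The sketch below is a hypothetical Python helper, not a SphinxTrain tool. It assumes feat.params holds whitespace-separated `-name value` pairs, and it copies only the four options that appear both in feat.params and in the bw configuration dump in this thread (`-feat`, `-agc`, `-cmn`, `-varnorm`).

```python
import shlex

# Options present in both feat.params and the bw configuration dump above.
SHARED_OPTIONS = ("-feat", "-agc", "-cmn", "-varnorm")

def parse_feat_params(text):
    """Parse '-name value' pairs (any whitespace layout) into a dict."""
    tokens = shlex.split(text)
    return dict(zip(tokens[0::2], tokens[1::2]))

def bw_feature_args(params):
    """Return the feature-related arguments to pass on to bw, as a flat list."""
    args = []
    for name in SHARED_OPTIONS:
        if name in params:
            args += [name, params[name]]
    return args

feat_params = """
-alpha 0.97
-nfilt 40
-ncep 13
-feat s2_4x
-agc none
-cmn current
-varnorm no
"""

print(" ".join(bw_feature_args(parse_feat_params(feat_params))))
# -feat s2_4x -agc none -cmn current -varnorm no
```

Options such as `-svspec` are deliberately not generated here; per the tutorial text, they should be given only when the model actually needs them.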

     
  • yaksea

    yaksea - 2011-12-06

    Thanks for the quick reply. I have now changed the command to:

    bw \
     -hmmdir zh_broadcastnews_ptm256_8000 \
     -moddeffn zh_broadcastnews_ptm256_8000/mdef.txt \
     -ts2cbfn .semi. \
     -feat s2_4x \
     -dictfn kw.dic \
     -ctlfn kw.fileids \
     -lsnfn kw.transcription
    

    which follows the settings in feat.params of zh_broadcastnews_ptm256_8000:

    -alpha 0.97
    -doublebw no
    -nfilt 40
    -ncep 13
    -lowerf 133.33334
    -upperf 6855.4976
    -nfft 512
    -wlen 0.0256
    -transform legacy
    -feat s2_4x
    -agc none
    -cmn current
    -varnorm no
    

    but now another error occurs:

    # of codebooks in mean/var files, 70, inconsistent with ts2cb mapping 1
    INFO: main.c(395): Will reestimate mixing weights.
    INFO: main.c(397): Will reestimate means.
    INFO: main.c(399): Will reestimate variances.
    INFO: main.c(407): Will reestimate transition matrices
    INFO: main.c(420): Reading main lexicon: kw.dic
    ERROR: "lexicon.c", line 96: Unknown phone NI
    ERROR: "lexicon.c", line 223: pronunciation for 你的 has undefined phones; skipping.
    ERROR: "lexicon.c", line 96: Unknown phone ZHANG
    ERROR: "lexicon.c", line 223: pronunciation for 账户 has undefined phones; skipping.
    ERROR: "lexicon.c", line 96: Unknown phone YI
    ERROR: "lexicon.c", line 223: pronunciation for 已 has undefined phones; skipping.
    ERROR: "lexicon.c", line 96: Unknown phone QIAN
    ERROR: "lexicon.c", line 223: pronunciation for 欠费 has undefined phones; skipping.
    INFO: lexicon.c(233): 0 entries added from kw.dic
    INFO: main.c(432): Reading filler lexicon: zh_broadcastnews_ptm256_8000/noisedict
    INFO: lexicon.c(233): 8 entries added from zh_broadcastnews_ptm256_8000/noisedict
    INFO: corpus.c(1281): Will process all remaining utts starting at 0
    INFO: main.c(639): Reestimation: Baum-Welch
    INFO: main.c(644): Generating profiling information consumes significant CPU resources.
    INFO: main.c(645): If you are not interested in profiling, use -timing no
    WARNING: "main.c", line 686: NO ACCUMDIR SET.  No counts will be written; assuming debug
    column defns
        <seq>
        <id>
        <n_frame_in>
        <n_frame_del>
        <n_state_shmm>
        <avg_states_alpha>
        <avg_states_beta>
        <avg_states_reest>
        <avg_posterior_prune>
        <frame_log_lik>
        <utt_log_lik>
        ... timing info ... 
    utt>     0                   kw_0001stat_retry(kw_0001..mfc) failed
    ERROR: "corpus.c", line 1555: MFCC read of kw_0001..mfc failed.  Retrying after sleep...
    stat_retry(kw_0001..mfc) failed
    ERROR: "corpus.c", line 1555: MFCC read of kw_0001..mfc failed.  Retrying after sleep...
    stat_retry(kw_0001..mfc) failed
    ERROR: "corpus.c", line 1555: MFCC read of kw_0001..mfc failed.  Retrying after sleep...
    
     
  • Nickolay V. Shmyrev

    It failed to read the file under the name you provided. You need to fix
    the contents of the file kw.fileids.

    You also need to use -ts2cbfn .ptm. since you are trying to adapt a PTM model.
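
The failing name `kw_0001..mfc` in the log suggests that the path is built as utterance id, a dot, then the `-cepext` value, whose default is shown as `.mfc` in the configuration dump. That rule is inferred from this thread's output, not taken from the SphinxTrain source. Under that assumption, a hypothetical Python check for kw.fileids could look like this:

```python
import os

def feature_path(utt_id, cepdir="", cepext="mfc"):
    """Build the feature file name as '<id>.<cepext>' (assumed rule, see above)."""
    name = utt_id + "." + cepext
    return os.path.join(cepdir, name) if cepdir else name

def missing_features(fileids, cepdir="", cepext="mfc"):
    """Return the feature paths that do not exist on disk."""
    paths = [feature_path(line.strip(), cepdir, cepext)
             for line in fileids if line.strip()]
    return [path for path in paths if not os.path.exists(path)]

# With the default extension '.mfc' the doubled dot from the log appears.
print(feature_path("kw_0001", cepext=".mfc"))  # kw_0001..mfc
print(feature_path("kw_0001", cepext="mfc"))   # kw_0001.mfc
```

Running `missing_features(open("kw.fileids"))` before bw would list every entry that does not resolve to a file, so the ids or the extension can be corrected first.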

     
  • yaksea

    yaksea - 2011-12-07

    Thanks, nshmyrev.

    It was solved by changing the command as below:

    bw \
     -hmmdir zh_broadcastnews_ptm256_8000 \
     -moddeffn zh_broadcastnews_ptm256_8000/mdef.txt \
     -ts2cbfn .ptm. \
     -feat s2_4x \
     -dictfn kw.dic \
     -ctlfn kw.fileids \
     -lsnfn kw.transcription \
     -cepext mfc
    

    I also changed kw.dic as below

    你的  n i d e
    账户  zh ang h u
    已   y i
    欠费  q ian f ei
    

    to solve the earlier problem:

    ERROR: "lexicon.c", line 96: Unknown phone NI
    ERROR: "lexicon.c", line 223: pronunciation for 你的 has undefined phones; skipping.
    ERROR: "lexicon.c", line 96: Unknown phone ZHANG
    ERROR: "lexicon.c", line 223: pronunciation for 账户 has undefined phones; skipping.
    ERROR: "lexicon.c", line 96: Unknown phone YI
    ERROR: "lexicon.c", line 223: pronunciation for 已 has undefined phones; skipping.
    ERROR: "lexicon.c", line 96: Unknown phone QIAN
    ERROR: "lexicon.c", line 223: pronunciation for 欠费 has undefined phones; skipping.
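
These errors can be caught before running bw by comparing every phone in the dictionary with the model's phone set. The sketch below is a hypothetical Python helper. The phone set is only the handful of phones used in the corrected kw.dic above (the real list comes from the model's mdef), the upper-case entry is an illustrative guess at the old dictionary format, and exact string matching is an assumption.

```python
def undefined_phones(dict_lines, phone_set):
    """Map each dictionary word to the phones that are not in phone_set."""
    problems = {}
    for line in dict_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and words without a pronunciation
        word, phones = fields[0], fields[1:]
        unknown = [phone for phone in phones if phone not in phone_set]
        if unknown:
            problems[word] = unknown
    return problems

# Illustrative subset only: the phones used by the corrected kw.dic.
phones = {"n", "i", "d", "e", "zh", "ang", "h", "u", "y", "q", "ian", "f", "ei"}

old_entry = ["你的  NI DE"]      # hypothetical syllable-level entry
new_entry = ["你的  n i d e"]    # entry from the corrected kw.dic

print(undefined_phones(old_entry, phones))  # {'你的': ['NI', 'DE']}
print(undefined_phones(new_entry, phones))  # {}
```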
    

    but there is still another problem:

    WARNING: "main.c", line 686: NO ACCUMDIR SET.  No counts will be written; assuming debug
    column defns
        <seq>
        <id>
        <n_frame_in>
        <n_frame_del>
        <n_state_shmm>
        <avg_states_alpha>
        <avg_states_beta>
        <avg_states_reest>
        <avg_posterior_prune>
        <frame_log_lik>
        <utt_log_lik>
        ... timing info ... 
    utt>     0                   kw_0001 1644INFO: cmn.c(175): CMN:  4.45 -0.30  0.07 -0.31  0.18 -0.11  0.14 -0.11 -0.07 -0.03 -0.04 -0.04 -0.01 
        0WARNING: "mk_phone_list.c", line 173: Unable to lookup word '你的账户已欠费' in the lexicon
    WARNING: "next_utt_states.c", line 79: Segmentation fault
    

    It seems the cause is the difference in word segmentation between Chinese and
    English, so I changed kw.transcription from

    <s> 你的账户已欠费 </s> (kw_0001)
    

    to

    <s> 你的 账户 已 欠费 </s> (kw_0001)
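
Splitting the sentence by hand works for one utterance. For a longer transcription file the same step can be sketched as greedy forward maximum matching against the dictionary words. This is a hypothetical Python helper, not something SphinxTrain provides, and it handles only the sentence text, not the `<s>`, `</s>` and `(kw_0001)` markers.

```python
def segment(text, words):
    """Greedy forward maximum matching: take the longest dictionary word
    that matches at each position, and fail loudly if none does."""
    longest = max(len(word) for word in words)
    pieces, pos = [], 0
    while pos < len(text):
        for size in range(min(longest, len(text) - pos), 0, -1):
            piece = text[pos:pos + size]
            if piece in words:
                pieces.append(piece)
                pos += size
                break
        else:
            raise ValueError("no dictionary word matches at position %d" % pos)
    return pieces

# Words taken from kw.dic in this thread.
words = {"你的", "账户", "已", "欠费"}

print(" ".join(segment("你的账户已欠费", words)))  # 你的 账户 已 欠费
```

Greedy matching can pick a wrong split when dictionary words overlap, so the output is worth checking by eye for larger dictionaries.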
    

    Now the error is displayed as below. Is there anything I missed?

    WARNING: "main.c", line 686: NO ACCUMDIR SET.  No counts will be written; assuming debug
    column defns
        <seq>
        <id>
        <n_frame_in>
        <n_frame_del>
        <n_state_shmm>
        <avg_states_alpha>
        <avg_states_beta>
        <avg_states_reest>
        <avg_posterior_prune>
        <frame_log_lik>
        <utt_log_lik>
        ... timing info ... 
    utt>     0                   kw_0001 1644INFO: cmn.c(175): CMN:  4.45 -0.30  0.07 -0.31  0.18 -0.11  0.14 -0.11 -0.07 -0.03 -0.04 -0.04 -0.01 
        0    64 23 ERROR: "backward.c", line 430: Failed to align audio to trancript: final state of the search is not reached
    ERROR: "baum_welch.c", line 331: kw_0001 ignored
     utt 0.039x 1.100e upd 0.039x 1.091e fwd 0.038x 1.096e bwd 0.000x 0.026e gau 0.025x 1.385e rsts 0.000x 0.000e rstf 0.000x 0.000e rstu 0.000x 0.000e
    overall> alex-desktop 0 (-0) 0.000000e+00 0.000000e+00 0.000x 1.100e
    
     
  • yaksea

    yaksea - 2011-12-07

    Thanks for nshmyrev's help.

    The problem has been solved by passing a better-quality voice file.

    I get the WAV from Microsoft's text-to-speech (t2s) system.

    Thanks & Regards.

     
