
problem adapting acoustic model (ptm) in pocketsphinx

Help
2017-02-22
2017-02-26
  • Juan J. Alonso

    Juan J. Alonso - 2017-02-22

    Hi,
    I'm trying to adapt the Spanish acoustic model (voxforge_es_sphinx.cd_ptm_4000), and I get an error when I collect statistics with bw.

    The files in my working directory are here:
    https://dl.dropboxusercontent.com/u/11313561/tmp/adaptacion.7z

    The audio files are 16 kHz mono, with less than 0.2 seconds of silence at the beginning and at the end. I have checked that all the words in the sentences are in the dictionary file.
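The dictionary check mentioned above can be sketched roughly as follows. This is a minimal sketch, assuming the usual SphinxTrain layouts: transcription lines of the form `<s> words </s> (fileid)` and dictionary lines of the form `word phone phone ...`, with alternate pronunciations written as `word(2)`. The function names `dictionary_words` and `missing_words` are hypothetical, and filler tokens that live only in the noisedict would be reported as missing.

```python
import re


def dictionary_words(dict_lines):
    """Collect the words defined in a pronunciation dictionary."""
    words = set()
    for line in dict_lines:
        parts = line.split()
        if parts:
            # Strip an alternate-pronunciation suffix such as "(2)".
            words.add(re.sub(r"\(\d+\)$", "", parts[0]))
    return words


def missing_words(transcription_lines, dict_lines):
    """Return the sorted transcript words that are not in the dictionary."""
    known = dictionary_words(dict_lines)
    missing = set()
    for line in transcription_lines:
        # Drop the trailing "(fileid)" marker, if present.
        text = re.sub(r"\([^()]*\)\s*$", "", line)
        for token in text.split():
            if token in ("<s>", "</s>"):
                continue  # sentence boundary markers are not dictionary words
            if token not in known:
                missing.add(token)
    return missing and sorted(missing) or []
```

With the files from this thread, one would pass the lines of frases.transcription and es.dict; an empty result means every transcript word has a dictionary entry.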

    This is the output of bw with only 5 sentences (the problem is the same with 130, but the files I sent contain only 5 for simplicity):

    ./bw -hmmdir voxforge_es_sphinx.cd_ptm_4000 -moddeffn voxforge_es_sphinx.cd_ptm_4000/mdef -ts2cbfn .ptm. -feat 1s_c_d_dd -svspec 0-12/13-25/26-38 -cmn current -agc none -dictfn es.dict -ctlfn frases.fileids -lsnfn frases.transcription -accumdir .
    INFO: main.c(229): Compiled on Jan 18 2017 at 23:54:13
    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -2passvar no no
    -abeam 1e-100 1.000000e-100
    -accumdir .
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -bbeam 1e-100 1.000000e-100
    -cb2mllrfn .1cls. .1cls.
    -cepdir
    -cepext mfc mfc
    -ceplen 13 13
    -ckptintv 0
    -cmn live current
    -cmninit 40,3,-1 40,3,-1
    -ctlfn frases.fileids
    -diagfull no no
    -dictfn es.dict
    -example no no
    -fdictfn
    -feat 1s_c_d_dd 1s_c_d_dd
    -fullvar no no
    -help no no
    -hmmdir voxforge_es_sphinx.cd_ptm_4000
    -latdir
    -latext
    -lda
    -ldadim 0 0
    -lsnfn frases.transcription
    -lw 11.5 1.150000e+01
    -maxuttlen 0 0
    -meanfn
    -meanreest yes yes
    -mixwfn
    -mixwreest yes yes
    -mllrmat
    -mmie no no
    -mmie_type rand rand
    -moddeffn voxforge_es_sphinx.cd_ptm_4000/mdef
    -mwfloor 0.00001 1.000000e-05
    -npart 0
    -nskip 0
    -outphsegdir
    -outputfullpath no no
    -part 0
    -pdumpdir
    -phsegdir
    -phsegext phseg phseg
    -runlen -1 -1
    -sentdir
    -sentext sent sent
    -spthresh 0.0 0.000000e+00
    -svspec 0-12/13-25/26-38
    -timing yes yes
    -tmatfn
    -tmatreest yes yes
    -topn 4 4
    -tpfloor 0.0001 1.000000e-04
    -ts2cbfn .ptm.
    -varfloor 0.00001 1.000000e-05
    -varfn
    -varnorm no no
    -varreest yes yes
    -viterbi no no

    INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
    INFO: main.c(255): Using subvector specification 0-12/13-25/26-38
    INFO: main.c(318): Reading voxforge_es_sphinx.cd_ptm_4000/mdef
    INFO: model_def_io.c(573): Model definition info:
    INFO: model_def_io.c(574): 26506 total models defined (26 base, 26480 tri)
    INFO: model_def_io.c(575): 106024 total states
    INFO: model_def_io.c(576): 4078 total tied states
    INFO: model_def_io.c(577): 78 total tied CI states
    INFO: model_def_io.c(578): 26 total tied transition matrices
    INFO: model_def_io.c(579): 4 max state/model
    INFO: model_def_io.c(580): 4 min state/model
    INFO: s3mixw_io.c(117): Read voxforge_es_sphinx.cd_ptm_4000/mixture_weights [4078x3x128 array]
    INFO: s3tmat_io.c(118): Read voxforge_es_sphinx.cd_ptm_4000/transition_matrices [26x3x4 array]
    INFO: mod_inv.c(301): inserting tprob floor 1.000000e-04 and renormalizing
    INFO: s3gau_io.c(169): Read voxforge_es_sphinx.cd_ptm_4000/means [26x3x128 array]
    INFO: s3gau_io.c(169): Read voxforge_es_sphinx.cd_ptm_4000/variances [26x3x128 array]
    INFO: gauden.c(176): 26 total mgau
    INFO: gauden.c(150): 3 feature streams (|0|=13 |1|=13 |2|=13 )
    INFO: gauden.c(187): 128 total densities
    INFO: gauden.c(90): min_var=1.000000e-05
    INFO: gauden.c(165): compute 4 densities/frame
    INFO: main.c(431): Will reestimate mixing weights.
    INFO: main.c(433): Will reestimate means.
    INFO: main.c(435): Will reestimate variances.
    INFO: main.c(443): Will reestimate transition matrices
    INFO: main.c(456): Reading main dictionary: es.dict
    INFO: lexicon.c(221): 23498 entries added from es.dict
    INFO: main.c(466): Reading filler dictionary: voxforge_es_sphinx.cd_ptm_4000/noisedict
    INFO: lexicon.c(221): 3 entries added from voxforge_es_sphinx.cd_ptm_4000/noisedict
    INFO: corpus.c(1062): Will process all remaining utts starting at 0
    INFO: main.c(665): Reestimation: Baum-Welch
    INFO: main.c(669): Generating profiling information consumes significant CPU resources.
    INFO: main.c(670): If you are not interested in profiling, use -timing no
    column defns
    <seq>
    <id>
    <n_frame_in>
    <n_frame_del>
    <n_state_shmm>
    <avg_states_alpha>
    <avg_states_beta>
    <avg_states_reest>
    <avg_posterior_prune>
    <frame_log_lik>
    <utt_log_lik>
    ... timing info ...
    INFO: cmn.c(133): CMN: 90.23 -5.76 -2.74 -2.49 -1.62 0.68 3.97 2.71 0.05 2.42 1.37 1.07 0.77
    ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
    ERROR: "baum_welch.c", line 324: frase_0003 ignored
    utt> 0 frase_0003 94 0 96 34 utt 0.009x 1.005e upd 0.009x 0.968e fwd 0.009x 0.952e bwd 0.000x 0.000e gau 0.009x 0.808e rsts 0.000x 0.000e rstf 0.000x 0.000e rstu 0.000x 0.000e

    INFO: cmn.c(133): CMN: 90.32 -3.99 -3.79 -3.77 -4.96 -2.44 1.35 3.25 3.86 4.16 1.11 -0.09 0.61
    ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
    ERROR: "baum_welch.c", line 324: frase_0005 ignored
    utt> 1 frase_0005 124 0 140 33 utt 0.010x 1.101e upd 0.010x 1.070e fwd 0.010x 1.058e bwd 0.000x 0.000e gau 0.010x 0.922e rsts 0.000x 0.000e rstf 0.000x 0.000e rstu 0.000x 0.000e

    INFO: cmn.c(133): CMN: 92.31 -6.31 -3.46 -4.67 -3.46 0.70 6.33 3.46 0.69 0.51 0.58 1.58 1.28
    ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
    ERROR: "baum_welch.c", line 324: frase_0010 ignored
    utt> 2 frase_0010 144 0 164 44 utt 0.011x 1.150e upd 0.011x 1.125e fwd 0.011x 1.114e bwd 0.000x 0.000e gau 0.008x 1.266e rsts 0.000x 0.000e rstf 0.000x 0.000e rstu 0.000x 0.000e

    INFO: cmn.c(133): CMN: 92.89 -7.46 -2.51 -3.84 -3.80 -2.24 3.13 3.43 2.24 1.53 0.75 1.23 1.23
    ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
    ERROR: "baum_welch.c", line 324: frase_0011 ignored
    utt> 3 frase_0011 106 0 132 38 utt 0.011x 0.901e upd 0.011x 0.863e fwd 0.011x 0.852e bwd 0.000x 0.000e gau 0.011x 0.755e rsts 0.000x 0.000e rstf 0.000x 0.000e rstu 0.000x 0.000e

    INFO: cmn.c(133): CMN: 88.79 -4.30 -2.94 -2.60 -4.40 -2.73 3.61 3.67 0.46 0.15 -0.74 0.09 1.01
    ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
    ERROR: "baum_welch.c", line 324: frase_0015 ignored
    utt> 4 frase_0015 116 0 116 30 utt 0.007x 1.184e upd 0.007x 1.133e fwd 0.007x 1.114e bwd 0.000x 0.000e gau 0.003x 1.955e rsts 0.000x 0.000e rstf 0.000x 0.000e rstu 0.000x 0.000e

    overall> stats 0 (-0) 0.000000e+00 0.000000e+00 0.000x 1.070e
    WARN: "accum.c", line 628: Over 500 senones never occur in the input data. This is normal for context-dependent untied senone training or for adaptation, but could indicate a serious problem otherwise.
    INFO: s3mixw_io.c(233): Wrote ./mixw_counts [4078x3x128 array]
    INFO: s3tmat_io.c(176): Wrote ./tmat_counts [26x3x4 array]
    INFO: s3gau_io.c(485): Wrote ./gauden_counts with means with vars [26x3x128 vector arrays]
    INFO: main.c(999): Counts saved to .

    Any help would be appreciated.
    Thanks a lot in advance.

     
    • Nickolay V. Shmyrev

      Your input data has an 8-bit sample width:

      frase_0003.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 8 bit, mono 16000 Hz

      It should be 16-bit.
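A quick way to catch this before running bw is to read the WAV headers programmatically. The following is a minimal sketch using Python's standard `wave` module; the function names `describe_wav` and `is_suitable_for_adaptation` are hypothetical, and the 16000 Hz default is taken from the sample rate reported in this thread.

```python
import wave


def describe_wav(path):
    """Return (channels, bits_per_sample, sample_rate) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate()


def is_suitable_for_adaptation(path, expected_rate=16000):
    """True if the file is mono, 16-bit, and at the expected sample rate."""
    channels, bits, rate = describe_wav(path)
    return channels == 1 and bits == 16 and rate == expected_rate
```

Running `is_suitable_for_adaptation` over every recording listed in the control file would flag 8-bit files such as the one above before any statistics are collected.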

       
      • Juan J. Alonso

        Juan J. Alonso - 2017-02-26

        That was the problem. Recognition is much better with adaptation. Next, I will try to add new words.
        Thanks a lot!!

         
