Segmentation Fault when running bw

2014-06-01
2014-06-05
  • Joko Susanto

    Joko Susanto - 2014-06-01

    Hello,

    I'm learning to adapt the default acoustic model by following the tutorial at http://cmusphinx.sourceforge.net/wiki/tutorialadapt. When I run bw, I get a segmentation fault. This is my command:

    ./bw \
    -hmmdir hub4wsj_sc_8k \
    -moddeffn hub4wsj_sc_8k/mdef.txt \
    -ts2cbfn .semi. \
    -feat 1s_c_d_dd \
    -svspec 0-12/13-25/26-38 \
    -cmn current \
    -agc none \
    -dictfn arctic.dic \
    -ctlfn adaptation-test.fileids \
    -lsnfn adaptation-test.transcription \
    -accumdir .

    INFO: main.c(229): Compiled on May 24 2014 at 23:28:52
    INFO: cmd_ln.c(691): Parsing command line:
    ./bw \ -hmmdir hub4wsj_sc_8k \ -moddeffn hub4wsj_sc_8k/mdef.txt \ -ts2cbfn .semi. \ -feat 1s_c_d_dd \ -svspec 0-12/13-25/26-38 \ -cmn current \ -agc none \ -dictfn arctic.dic \ -ctlfn adaptation-test.fileids \ -lsnfn adaptation-test.transcription \ -accumdir .

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -2passvar no no
    -abeam 1e-100 1.000000e-100
    -accumdir .
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -bbeam 1e-100 1.000000e-100
    -cb2mllrfn .1cls. .1cls.
    -cepdir
    -cepext mfc mfc
    -ceplen 13 13
    -ckptintv 0
    -cmn current current
    -cmninit 8.0 8.0
    -ctlfn adaptation-test.fileids
    -diagfull no no
    -dictfn arctic.dic
    -example no no
    -fdictfn
    -feat 1s_c_d_dd 1s_c_d_dd
    -fullsuffixmatch no no
    -fullvar no no
    -help no no
    -hmmdir hub4wsj_sc_8k
    -latdir
    -latext
    -lda
    -ldaaccum no no
    -ldadim 0 0
    -lsnfn adaptation-test.transcription
    -lw 11.5 1.150000e+01
    -maxuttlen 0 0
    -meanfn
    -meanreest yes yes
    -mixwfn
    -mixwreest yes yes
    -mllrmat
    -mmie no no
    -mmie_type rand rand
    -moddeffn hub4wsj_sc_8k/mdef.txt
    -mwfloor 0.00001 1.000000e-05
    -npart 0
    -nskip 0
    -outphsegdir
    -outputfullpath no no
    -part 0
    -pdumpdir
    -phsegdir
    -phsegext phseg phseg
    -runlen -1 -1
    -sentdir
    -sentext sent sent
    -spthresh 0.0 0.000000e+00
    -svspec 0-12/13-25/26-38
    -timing yes yes
    -tmatfn
    -tmatreest yes yes
    -topn 4 4
    -tpfloor 0.0001 1.000000e-04
    -ts2cbfn .semi.
    -varfloor 0.00001 1.000000e-05
    -varfn
    -varnorm no no
    -varreest yes yes
    -viterbi no no

    INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: main.c(254): Using subvector specification 0-12/13-25/26-38
    INFO: main.c(318): Reading hub4wsj_sc_8k/mdef.txt
    INFO: model_def_io.c(573): Model definition info:
    INFO: model_def_io.c(574): 143097 total models defined (50 base, 143047 tri)
    INFO: model_def_io.c(575): 572388 total states
    INFO: model_def_io.c(576): 5150 total tied states
    INFO: model_def_io.c(577): 150 total tied CI states
    INFO: model_def_io.c(578): 50 total tied transition matrices
    INFO: model_def_io.c(579): 4 max state/model
    INFO: model_def_io.c(580): 4 min state/model
    INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
    INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
    INFO: mod_inv.c(300): inserting tprob floor 1.000000e-04 and renormalizing
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
    INFO: gauden.c(183): 1 total mgau
    INFO: gauden.c(157): 3 feature streams (|0|=13 |1|=13 |2|=13 )
    INFO: gauden.c(194): 256 total densities
    INFO: gauden.c(97): min_var=1.000000e-05
    INFO: gauden.c(172): compute 4 densities/frame
    INFO: main.c(430): Will reestimate mixing weights.
    INFO: main.c(432): Will reestimate means.
    INFO: main.c(434): Will reestimate variances.
    INFO: main.c(442): Will reestimate transition matrices
    INFO: main.c(455): Reading main lexicon: arctic.dic
    INFO: lexicon.c(220): 179 entries added from arctic.dic
    INFO: main.c(465): Reading filler lexicon: hub4wsj_sc_8k/noisedict
    INFO: lexicon.c(220): 11 entries added from hub4wsj_sc_8k/noisedict
    INFO: corpus.c(1086): Will process all remaining utts starting at 0
    INFO: main.c(665): Reestimation: Baum-Welch
    INFO: main.c(670): Generating profiling information consumes significant CPU resources.
    INFO: main.c(671): If you are not interested in profiling, use -timing no
    column defns
    <seq>
    <id>
    <n_frame_in>
    <n_frame_del>
    <n_state_shmm>
    <avg_states_alpha>
    <avg_states_beta>
    <avg_states_reest>
    <avg_posterior_prune>
    <frame_log_lik>
    <utt_log_lik>
    ... timing info ...
    INFO: cmn.c(175): CMN: 691.88 487.06 -1108.00 -1310.86 461.71 523.99 951.00 -399.06 -1273.62 -1192.80 -804.94 -260.02 -873.70
    Segmentation fault

    I'm using:
    1. sphinxtrain-1.0.8
    2. sphinxbase-0.8
    3. pocketsphinx-0.8

     
    • Nickolay V. Shmyrev

      Your CMN values are abnormal, so something is wrong with the input data or with the previous steps. Make sure your input data has the correct format: it must be a 16 kHz, 16-bit, mono file.
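
One way to spot this symptom quickly is to scan a saved bw log for CMN lines with unusually large values. The sketch below is a heuristic and not part of SphinxTrain: the function name `flag_abnormal_cmn` and the default cutoff of 100 are assumptions, chosen only because the failing run in this thread reports values such as 691.88 and -1310.86.

```shell
#!/bin/sh
# Heuristic sketch: print every "CMN:" log line that contains a value whose
# magnitude exceeds a cutoff. The default cutoff (100) is an assumed value.
flag_abnormal_cmn() {
    awk -v limit="${1:-100}" '
        /cmn\.c.*CMN:/ {
            seen = 0                      # becomes 1 once the "CMN:" token is passed
            for (i = 1; i <= NF; i++) {
                if (seen) {
                    v = $i + 0            # force numeric interpretation
                    if (v < 0) v = -v     # absolute value
                    if (v > limit) { print; break }
                }
                if ($i == "CMN:") seen = 1
            }
        }
    '
}

# Usage (bw.log is a hypothetical file holding the bw output):
#   flag_abnormal_cmn < bw.log
#   flag_abnormal_cmn 50 < bw.log
```

An empty result only means that no value crossed the cutoff. It does not prove that the feature files are correct.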

       
  • Joko Susanto

    Joko Susanto - 2014-06-02

    Thanks for your response, Nickolay. I followed your suggestion. After recording my voice, I checked the files and got the following results:
    test1.wav:

    File Size: 128k Bit Rate: 256k
    Encoding: Signed PCM
    Channels: 1 @ 16-bit
    Samplerate: 16000Hz
    Replaygain: off
    Duration: 00:00:04.00

    In:100% 00:00:04.00 [00:00:00.00] Out:64.0k [ | ] Hd:3.7 Clip:0
    Done.

    test2.wav:

    File Size: 192k Bit Rate: 256k
    Encoding: Signed PCM
    Channels: 1 @ 16-bit
    Samplerate: 16000Hz
    Replaygain: off
    Duration: 00:00:06.00

    In:100% 00:00:06.00 [00:00:00.00] Out:96.0k [ | ] Clip:0
    Done.

    I use this command to create the MFC files:
    sphinx_fe -argfile hub4wsj_sc_8k/feat.params \
    -samprate 16000 -c adaptation-test.fileids \
    -di . -do . -ei wav -eo mfc -mswav yes
    Is there something wrong with the command?

    I also downloaded a WAV file from this forum and tried it, with the same error. I am sending you the file. Thanks for your attention.
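
The required format (16 kHz, 16-bit, mono) can also be read straight from the WAV header without sox. The sketch below is an illustration only. The function name `wav_format` is hypothetical, and it assumes a canonical header in which the "fmt " chunk directly follows the 12-byte RIFF/WAVE preamble. Files with extra chunks before "fmt " are rejected rather than parsed.

```shell
#!/bin/sh
# Sketch: print channels, sample rate and bit depth from a canonical WAV header.
# Assumption: the "fmt " chunk starts at byte offset 12.
wav_format() {
    # Load the first 36 header bytes as decimal values into $1..$36.
    set -- $(od -An -v -tu1 -N 36 "$1")
    [ "$#" -eq 36 ] || { echo "header too short" >&2; return 1; }
    # Bytes 0-3 must spell "RIFF", bytes 8-11 "WAVE", bytes 12-15 "fmt ".
    [ "$1 $2 $3 $4" = "82 73 70 70" ] || { echo "not a RIFF file" >&2; return 1; }
    [ "$9 ${10} ${11} ${12}" = "87 65 86 69" ] || { echo "not a WAVE file" >&2; return 1; }
    [ "${13} ${14} ${15} ${16}" = "102 109 116 32" ] || { echo "unexpected chunk layout" >&2; return 1; }
    # Multi-byte header fields are stored little-endian.
    channels=$(( ${23} + 256 * ${24} ))
    rate=$(( ${25} + 256 * ${26} + 65536 * ${27} + 16777216 * ${28} ))
    bits=$(( ${35} + 256 * ${36} ))
    echo "channels=$channels rate=$rate bits=$bits"
}

# Usage: wav_format test1.wav
```

For the files described in this post, the expected line would be `channels=1 rate=16000 bits=16`.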

     
    • Nickolay V. Shmyrev

      Your audio files are OK. Please share your MFC files.

      Make sure you installed the CMUSphinx tools properly. You might still have an old installation somewhere that breaks the results.
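
A leftover installation often shows up as more than one copy of the same tool on the search path. The sketch below lists every executable with a given name found in the `PATH` directories. The function name `list_path_copies` is hypothetical, and the check covers only executables on `PATH`. It does not detect stale shared libraries or binaries copied into the working directory, such as `./bw`.

```shell
#!/bin/sh
# Sketch: print every executable called "$1" found in the PATH directories,
# in search order. More than one line suggests a duplicate installation.
list_path_copies() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

# Usage: list_path_copies sphinx_fe
```

The first line printed is the copy the shell would run. Any further lines are candidates for an old installation.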

       
  • Joko Susanto

    Joko Susanto - 2014-06-03

    Yes, you are right. After reinstalling, I ran bw and got the following result:

    INFO: main.c(229): Compiled on Jun 3 2014 at 09:05:01
    INFO: cmd_ln.c(691): Parsing command line:
    ./bw \ -hmmdir hub4wsj_sc_8k \ -moddeffn hub4wsj_sc_8k/mdef.txt \ -ts2cbfn .semi. \ -feat 1s_c_d_dd \ -svspec 0-12/13-25/26-38 \ -cmn current \ -agc none \ -dictfn arctic20.dic \ -ctlfn arctic20.fileids \ -lsnfn arctic20.transcription \ -accumdir .

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -2passvar no no
    -abeam 1e-100 1.000000e-100
    -accumdir .
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -bbeam 1e-100 1.000000e-100
    -cb2mllrfn .1cls. .1cls.
    -cepdir
    -cepext mfc mfc
    -ceplen 13 13
    -ckptintv 0
    -cmn current current
    -cmninit 8.0 8.0
    -ctlfn arctic20.fileids
    -diagfull no no
    -dictfn arctic20.dic
    -example no no
    -fdictfn
    -feat 1s_c_d_dd 1s_c_d_dd
    -fullsuffixmatch no no
    -fullvar no no
    -help no no
    -hmmdir hub4wsj_sc_8k
    -latdir
    -latext
    -lda
    -ldaaccum no no
    -ldadim 0 0
    -lsnfn arctic20.transcription
    -lw 11.5 1.150000e+01
    -maxuttlen 0 0
    -meanfn
    -meanreest yes yes
    -mixwfn
    -mixwreest yes yes
    -mllrmat
    -mmie no no
    -mmie_type rand rand
    -moddeffn hub4wsj_sc_8k/mdef.txt
    -mwfloor 0.00001 1.000000e-05
    -npart 0
    -nskip 0
    -outphsegdir
    -outputfullpath no no
    -part 0
    -pdumpdir
    -phsegdir
    -phsegext phseg phseg
    -runlen -1 -1
    -sentdir
    -sentext sent sent
    -spthresh 0.0 0.000000e+00
    -svspec 0-12/13-25/26-38
    -timing yes yes
    -tmatfn
    -tmatreest yes yes
    -topn 4 4
    -tpfloor 0.0001 1.000000e-04
    -ts2cbfn .semi.
    -varfloor 0.00001 1.000000e-05
    -varfn
    -varnorm no no
    -varreest yes yes
    -viterbi no no

    INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: main.c(254): Using subvector specification 0-12/13-25/26-38
    INFO: main.c(318): Reading hub4wsj_sc_8k/mdef.txt
    INFO: model_def_io.c(573): Model definition info:
    INFO: model_def_io.c(574): 143097 total models defined (50 base, 143047 tri)
    INFO: model_def_io.c(575): 572388 total states
    INFO: model_def_io.c(576): 5150 total tied states
    INFO: model_def_io.c(577): 150 total tied CI states
    INFO: model_def_io.c(578): 50 total tied transition matrices
    INFO: model_def_io.c(579): 4 max state/model
    INFO: model_def_io.c(580): 4 min state/model
    INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
    INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
    INFO: mod_inv.c(300): inserting tprob floor 1.000000e-04 and renormalizing
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
    INFO: gauden.c(183): 1 total mgau
    INFO: gauden.c(157): 3 feature streams (|0|=13 |1|=13 |2|=13 )
    INFO: gauden.c(194): 256 total densities
    INFO: gauden.c(97): min_var=1.000000e-05
    INFO: gauden.c(172): compute 4 densities/frame
    INFO: main.c(430): Will reestimate mixing weights.
    INFO: main.c(432): Will reestimate means.
    INFO: main.c(434): Will reestimate variances.
    INFO: main.c(442): Will reestimate transition matrices
    INFO: main.c(455): Reading main lexicon: arctic20.dic
    INFO: lexicon.c(220): 189 entries added from arctic20.dic
    INFO: main.c(465): Reading filler lexicon: hub4wsj_sc_8k/noisedict
    INFO: lexicon.c(220): 11 entries added from hub4wsj_sc_8k/noisedict
    INFO: corpus.c(1086): Will process all remaining utts starting at 0
    INFO: main.c(665): Reestimation: Baum-Welch
    INFO: main.c(670): Generating profiling information consumes significant CPU resources.
    INFO: main.c(671): If you are not interested in profiling, use -timing no
    column defns
    <seq>
    <id>
    <n_frame_in>
    <n_frame_del>
    <n_state_shmm>
    <avg_states_alpha>
    <avg_states_beta>
    <avg_states_reest>
    <avg_posterior_prune>
    <frame_log_lik>
    <utt_log_lik>
    ... timing info ...
    INFO: cmn.c(175): CMN: 43.03 3.93 -0.29 -0.89 0.24 -0.78 -0.84 -0.19 0.82 0.69 -0.05 -0.26 -0.20
    utt> 0 arctic_0001 306 0 108 28 9 8 3.616702e-102 -6.253867e+01 -1.913683e+04 utt 0.030x 2.190e upd 0.027x 2.078e fwd 0.007x 1.568e bwd 0.016x 0.929e gau 0.065x 1.185e rsts 0.004x 1.027e rstf 0.001x 0.218e rstu 0.000x 1381477023605.738e

    INFO: cmn.c(175): CMN: 41.28 4.07 0.25 -0.85 0.18 -0.80 -1.14 -0.30 0.63 0.57 -0.06 -0.37 -0.20
    utt> 1 arctic_0002 255 0 104 30 9 9 2.138527e-102 -6.126169e+01 -1.562173e+04 utt 0.019x 1.210e upd 0.019x 1.198e fwd 0.003x 1.223e bwd 0.014x 1.310e gau 0.093x 1.246e rsts 0.005x 0.852e rstf -0.000x 0.000e rstu 0.002x 0.092e

    INFO: cmn.c(175): CMN: 42.80 3.33 1.01 -0.41 -0.41 -1.09 -1.01 -0.06 0.64 0.66 -0.13 -0.42 -0.04
    utt> 2 arctic_0003 306 0 108 32 13 10 3.497573e-102 -6.041223e+01 -1.848614e+04 utt 0.021x 1.288e upd 0.021x 1.278e fwd 0.003x 1.270e bwd 0.016x 1.109e gau 0.154x 0.828e rsts 0.007x 1.015e rstf 0.000x 79175131460.121e rstu 0.000x 1297272376800.195e

    INFO: cmn.c(175): CMN: 42.14 3.17 0.48 -0.32 -0.36 -1.11 -1.03 -0.26 0.67 0.57 -0.12 -0.36 -0.13
    utt> 3 arctic_0004 306 0 104 32 11 11 2.008941e-102 -6.064926e+01 -1.855867e+04 utt 0.022x 1.098e upd 0.022x 1.089e fwd 0.004x 1.306e bwd 0.018x 1.032e gau 0.152x 0.873e rsts 0.008x 1.088e rstf -0.000x 0.000e rstu 0.000x 11504310568606.896e

    INFO: cmn.c(175): CMN: 40.75 2.23 0.88 -0.09 -0.22 -0.88 -0.63 -0.28 0.47 0.75 -0.09 -0.40 0.11
    utt> 4 arctic_0005 255 0 92 33 15 14 4.459699e-102 -6.112698e+01 -1.558738e+04 utt 0.025x 1.126e upd 0.025x 1.117e fwd 0.005x 0.830e bwd 0.020x 1.171e gau 0.227x 1.042e rsts 0.008x 0.910e rstf -0.000x 0.000e rstu -0.000x 0.000e

    overall> stats 1428 (-0) -6.119801e+01 -8.739076e+04 0.024x 1.455e
    WARNING: "accum.c", line 617: Over 500 senones never occur in the input data. This is normal for context-dependent untied senone training or for adaptation, but could indicate a serious problem otherwise.
    INFO: s3mixw_io.c(232): Wrote ./mixw_counts [5150x3x256 array]
    INFO: s3tmat_io.c(174): Wrote ./tmat_counts [50x3x4 array]
    INFO: s3gau_io.c(478): Wrote ./gauden_counts with means with vars [1x3x256 vector arrays]
    INFO: main.c(1014): Counts saved to .

    I see a warning like this:
    WARNING: "accum.c", line 617: Over 500 senones never occur in the input data. This is normal for context-dependent untied senone training or for adaptation, but could indicate a serious problem otherwise.

    Is that OK? I ask because I got the following when running ./word_align.pl adaptation-test.transcription adaptation-test.hyp:

    NYALAKAN LAMPU RUANG TAMU SATU (ARCTIC_0001)
    (ARCTIC_0001)
    Words: 5 Correct: 0 Errors: 5 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
    Insertions: 0 Deletions: 5 Substitutions: 0
    MATIKAN LAMPU RUANG TAMU SATU (ARCTIC_0002)
    (ARCTIC_0002)
    Words: 5 Correct: 0 Errors: 5 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
    Insertions: 0 Deletions: 5 Substitutions: 0
    TOTAL Words: 10 Correct: 0 Errors: 10
    TOTAL Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
    TOTAL Insertions: 0 Deletions: 10 Substitutions: 0

    I am trying to adapt the default CMU Sphinx acoustic model to the Indonesian language. Can I do that, or do I have to build a new model?

     
  • Nickolay V. Shmyrev

    Because I got this when running ./word_align.pl adaptation-test.transcription adaptation-test.hyp
    NYALAKAN LAMPU RUANG TAMU SATU (ARCTIC_0001)

    There is an error in your decoding setup. To get help with this issue, you need to provide more information, including the commands you are running, information about your system, and your data files.
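
In the word_align.pl output quoted above, every reference word is counted as a deletion and the hypothesis lines show only the utterance tag, which means the decoder produced empty hypotheses. The sketch below lists such lines in a hypothesis file before scoring. The function name `empty_hyps` is hypothetical, and it assumes that each line ends with a parenthesized utterance tag, with the hypothesis text (if any) in front of it.

```shell
#!/bin/sh
# Sketch: print hypothesis lines that contain nothing except the trailing
# parenthesized utterance tag, i.e. utterances decoded to an empty string.
# Assumption: one utterance per line, formatted as "words ... (tag)".
empty_hyps() {
    awk '/^[ \t]*\([^()]*\)[ \t]*$/ { print }' "$@"
}

# Usage: empty_hyps adaptation-test.hyp
```

If every line of the file is printed, the problem lies in the decoding setup rather than in the scoring step.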

     

    Last edit: Nickolay V. Shmyrev 2014-06-03
  • Joko Susanto

    Joko Susanto - 2014-06-04

    Hi, Nickolay. Thanks for helping me.
    I'm using Linux BackTrack 5 R3 running in a virtual machine (VMware). I'm using:
    1. SphinxBase-0.8
    2. SphinxTrain-1.0.8
    3. PocketSphinx-0.8

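    I'm using the following commands:
    -> Create the WAV files

    # Show each prompt from arctic20.txt, then record it to arctic_NNNN.wav (16 kHz, 16-bit, mono).
    for i in $(seq 1 4); do
    fn=$(printf arctic_%04d "$i");
    read -r sent; echo "$sent";
    rec -r 16000 -e signed-integer -b 16 -c 1 "$fn.wav" 2>/dev/null;
    done < arctic20.txt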

    -> Generating acoustic feature files (MFC File)

    sphinx_fe -argfile hub4wsj_sc_8k/feat.params \
    -samprate 16000 -c arctic20.fileids \
    -di . -do . -ei wav -eo mfc -mswav yes

    Result
    INFO: cmd_ln.c(691): Parsing command line:
    sphinx_fe \ -argfile hub4wsj_sc_8k/feat.params \ -samprate 16000 \ -c arctic20.fileids \ -di . \ -do . \ -ei wav \ -eo mfc \ -mswav yes

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -alpha 0.97 9.700000e-01
    -argfile hub4wsj_sc_8k/feat.params
    -blocksize 2048 2048
    -build_outdirs yes yes
    -c arctic20.fileids
    -cep2spec no no
    -di .
    -dither no no
    -do .
    -doublebw no no
    -ei wav
    -eo mfc
    -example no no
    -frate 100 100
    -help no no
    -i
    -input_endian little little
    -lifter 0 0
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -mach_endian little little
    -mswav no yes
    -ncep 13 13
    -nchans 1 1
    -nfft 512 512
    -nfilt 40 40
    -nist no no
    -npart 0 0
    -nskip 0 0
    -o
    -ofmt sphinx sphinx
    -part 0 0
    -raw no no
    -remove_dc no no
    -round_filters yes yes
    -runlen -1 -1
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -smoothspec no no
    -spec2cep no no
    -sph2pipe no no
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -whichchan 0 0
    -wlen 0.025625 2.562500e-02

    INFO: cmd_ln.c(691): Parsing command line:
    \ -nfilt 20 \ -lowerf 1 \ -upperf 4000 \ -wlen 0.025 \ -transform dct \ -round_filters no \ -remove_dc yes \ -svspec 0-12/13-25/26-38 \ -feat 1s_c_d_dd \ -agc none \ -cmn current \ -cmninit 56,-3,1 \ -varnorm no

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -alpha 0.97 9.700000e-01
    -argfile hub4wsj_sc_8k/feat.params
    -blocksize 2048 2048
    -build_outdirs yes yes
    -c arctic20.fileids
    -cep2spec no no
    -di .
    -dither no no
    -do .
    -doublebw no no
    -ei wav
    -eo mfc
    -example no no
    -frate 100 100
    -help no no
    -i
    -input_endian little little
    -lifter 0 0
    -logspec no no
    -lowerf 133.33334 1.000000e+00
    -mach_endian little little
    -mswav no yes
    -ncep 13 13
    -nchans 1 1
    -nfft 512 512
    -nfilt 40 20
    -nist no no
    -npart 0 0
    -nskip 0 0
    -o
    -ofmt sphinx sphinx
    -part 0 0
    -raw no no
    -remove_dc no yes
    -round_filters yes no
    -runlen -1 -1
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -smoothspec no no
    -spec2cep no no
    -sph2pipe no no
    -transform legacy dct
    -unit_area yes yes
    -upperf 6855.4976 4.000000e+03
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -whichchan 0 0
    -wlen 0.025625 2.500000e-02

    INFO: sphinx_fe.c(1043): Processing all remaining utterances at position 0

    -> Accumulating observation counts

    ./bw \
    -hmmdir hub4wsj_sc_8k \
    -moddeffn hub4wsj_sc_8k/mdef.txt \
    -ts2cbfn .semi. \
    -feat 1s_c_d_dd \
    -svspec 0-12/13-25/26-38 \
    -cmn current \
    -agc none \
    -dictfn arctic20.dic \
    -ctlfn arctic20.fileids \
    -lsnfn arctic20.transcription \
    -accumdir .

    Result
    INFO: main.c(229): Compiled on Jun 3 2014 at 09:05:01
    INFO: cmd_ln.c(691): Parsing command line:
    ./bw \ -hmmdir hub4wsj_sc_8k \ -moddeffn hub4wsj_sc_8k/mdef.txt \ -ts2cbfn .semi. \ -feat 1s_c_d_dd \ -svspec 0-12/13-25/26-38 \ -cmn current \ -agc none \ -dictfn arctic20.dic \ -ctlfn arctic20.fileids \ -lsnfn arctic20.transcription \ -accumdir .

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -2passvar no no
    -abeam 1e-100 1.000000e-100
    -accumdir .
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -bbeam 1e-100 1.000000e-100
    -cb2mllrfn .1cls. .1cls.
    -cepdir
    -cepext mfc mfc
    -ceplen 13 13
    -ckptintv 0
    -cmn current current
    -cmninit 8.0 8.0
    -ctlfn arctic20.fileids
    -diagfull no no
    -dictfn arctic20.dic
    -example no no
    -fdictfn
    -feat 1s_c_d_dd 1s_c_d_dd
    -fullsuffixmatch no no
    -fullvar no no
    -help no no
    -hmmdir hub4wsj_sc_8k
    -latdir
    -latext
    -lda
    -ldaaccum no no
    -ldadim 0 0
    -lsnfn arctic20.transcription
    -lw 11.5 1.150000e+01
    -maxuttlen 0 0
    -meanfn
    -meanreest yes yes
    -mixwfn
    -mixwreest yes yes
    -mllrmat
    -mmie no no
    -mmie_type rand rand
    -moddeffn hub4wsj_sc_8k/mdef.txt
    -mwfloor 0.00001 1.000000e-05
    -npart 0
    -nskip 0
    -outphsegdir
    -outputfullpath no no
    -part 0
    -pdumpdir
    -phsegdir
    -phsegext phseg phseg
    -runlen -1 -1
    -sentdir
    -sentext sent sent
    -spthresh 0.0 0.000000e+00
    -svspec 0-12/13-25/26-38
    -timing yes yes
    -tmatfn
    -tmatreest yes yes
    -topn 4 4
    -tpfloor 0.0001 1.000000e-04
    -ts2cbfn .semi.
    -varfloor 0.00001 1.000000e-05
    -varfn
    -varnorm no no
    -varreest yes yes
    -viterbi no no

    INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: main.c(254): Using subvector specification 0-12/13-25/26-38
    INFO: main.c(318): Reading hub4wsj_sc_8k/mdef.txt
    INFO: model_def_io.c(573): Model definition info:
    INFO: model_def_io.c(574): 143097 total models defined (50 base, 143047 tri)
    INFO: model_def_io.c(575): 572388 total states
    INFO: model_def_io.c(576): 5150 total tied states
    INFO: model_def_io.c(577): 150 total tied CI states
    INFO: model_def_io.c(578): 50 total tied transition matrices
    INFO: model_def_io.c(579): 4 max state/model
    INFO: model_def_io.c(580): 4 min state/model
    INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
    INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
    INFO: mod_inv.c(300): inserting tprob floor 1.000000e-04 and renormalizing
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
    INFO: gauden.c(183): 1 total mgau
    INFO: gauden.c(157): 3 feature streams (|0|=13 |1|=13 |2|=13 )
    INFO: gauden.c(194): 256 total densities
    INFO: gauden.c(97): min_var=1.000000e-05
    INFO: gauden.c(172): compute 4 densities/frame
    INFO: main.c(430): Will reestimate mixing weights.
    INFO: main.c(432): Will reestimate means.
    INFO: main.c(434): Will reestimate variances.
    INFO: main.c(442): Will reestimate transition matrices
    INFO: main.c(455): Reading main lexicon: arctic20.dic
    WARNING: "lexicon.c", line 176: Lexicon arctic20.dic has a blank line at line 0
    INFO: lexicon.c(220): 189 entries added from arctic20.dic
    INFO: main.c(465): Reading filler lexicon: hub4wsj_sc_8k/noisedict
    INFO: lexicon.c(220): 11 entries added from hub4wsj_sc_8k/noisedict
    INFO: corpus.c(1086): Will process all remaining utts starting at 0
    INFO: main.c(665): Reestimation: Baum-Welch
    INFO: main.c(670): Generating profiling information consumes significant CPU resources.
    INFO: main.c(671): If you are not interested in profiling, use -timing no
    column defns
    <seq>
    <id>
    <n_frame_in>
    <n_frame_del>
    <n_state_shmm>
    <avg_states_alpha>
    <avg_states_beta>
    <avg_states_reest>
    <avg_posterior_prune>
    <frame_log_lik>
    <utt_log_lik>
    ... timing info ...
    INFO: cmn.c(175): CMN: 14.51 7.46 0.27 -1.64 -0.72 -1.37 -0.95 -1.66 -0.27 0.05 -0.30 -1.16 -0.39
    utt> 0 arctic_0001 306 0 108 33 10 10 2.562535e-102 -9.040966e+01 -2.766536e+04 utt 0.026x 1.895e upd 0.026x 1.886e fwd 0.004x 1.340e bwd 0.022x 1.969e gau 0.103x 3.025e rsts 0.012x 1.289e rstf -0.000x 0.000e rstu 0.000x 575571466439884.750e

    INFO: cmn.c(175): CMN: 12.90 6.86 1.07 -1.22 -0.73 -1.25 -0.73 -1.06 -0.12 -0.31 0.14 -0.82 -0.57
    utt> 1 arctic_0002 306 0 104 39 12 10 3.030896e-102 -9.233116e+01 -2.825333e+04 utt 0.025x 1.314e upd 0.025x 1.306e fwd 0.005x 1.422e bwd 0.020x 1.252e gau 0.157x 1.364e rsts 0.004x 1.827e rstf 0.000x 958099437935.158e rstu 0.000x 131265353625996.391e

    INFO: cmn.c(175): CMN: 11.70 7.07 2.65 0.30 -1.12 -0.42 -0.49 -1.97 -0.41 -0.71 -0.43 -0.53 0.05
    utt> 2 arctic_0003 306 0 104 53 16 12 4.827730e-102 -9.495314e+01 -2.905566e+04 utt 0.043x 1.157e upd 0.043x 1.151e fwd 0.005x 1.042e bwd 0.024x 1.259e gau 0.241x 1.365e rsts 0.008x 1.089e rstf 0.000x 1936019416417.208e rstu 0.014x 1.008e

    INFO: cmn.c(175): CMN: 9.52 3.98 0.63 -0.09 -0.22 -0.73 0.19 -2.02 -0.43 -0.11 0.27 -0.43 -0.22
    utt> 3 arctic_0004 255 0 92 48 24 17 3.436941e-102 -9.754965e+01 -2.487516e+04 utt 0.039x 1.311e upd 0.039x 1.304e fwd 0.003x 1.800e bwd 0.035x 1.310e gau 0.529x 1.509e rsts 0.011x 1.776e rstf 0.002x 0.236e rstu 0.000x 8933300837518.277e

    overall> stats 1173 (-0) -9.364835e+01 -1.098495e+05 0.033x 1.379e
    WARNING: "accum.c", line 617: Over 500 senones never occur in the input data. This is normal for context-dependent untied senone training or for adaptation, but could indicate a serious problem otherwise.
    INFO: s3mixw_io.c(232): Wrote ./mixw_counts [5150x3x256 array]
    INFO: s3tmat_io.c(174): Wrote ./tmat_counts [50x3x4 array]
    INFO: s3gau_io.c(478): Wrote ./gauden_counts with means with vars [1x3x256 vector arrays]
    INFO: main.c(1014): Counts saved to .

    -> Creating transformation with MLLR

    ./mllr_solve \
    -meanfn hub4wsj_sc_8k/means \
    -varfn hub4wsj_sc_8k/variances \
    -outmllrfn mllr_matrix -accumdir .

    Result
    INFO: cmd_ln.c(691): Parsing command line:
    ./mllr_solve \ -meanfn hub4wsj_sc_8k/means \ -varfn hub4wsj_sc_8k/variances \ -outmllrfn mllr_matrix \ -accumdir .

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -accumdir .,
    -cb2mllrfn .1cls. .1cls.
    -cdonly no no
    -example no no
    -fullvar no no
    -help no no
    -meanfn hub4wsj_sc_8k/means
    -mllradd yes yes
    -mllrmult yes yes
    -moddeffn
    -outmllrfn mllr_matrix
    -varfloor 1e-3 1.000000e-03
    -varfn hub4wsj_sc_8k/variances

    INFO: main.c(382): -- 1. Read input mean, (var) and accumulation.
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
    INFO: main.c(397): Reading and accumulating counts from .
    INFO: s3gau_io.c(379): Read ./gauden_counts with means with vars [1x3x256 vector arrays]

    INFO: main.c(436): -- 2. Read cb2mllrfn
    INFO: main.c(455): n_mllr_class = 1

    INFO: main.c(475): -- 3. Calculate mllr matrices
    INFO: main.c(127):
    INFO: main.c(128): ---- mllr_solve(): Conventional MLLR method
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]

    INFO: main.c(208): ---- A. Accum regl, regr
    INFO: main.c(209): No classes 1, no. stream 3
    INFO: main.c(281): ---- B. Compute MLLR matrices (A,B)
    INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
    INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
    INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR

    INFO: main.c(497): -- 4. Store mllr matrices (A,B) to mllr_matrix

    -> Updating the acoustic model files with MAP

    cp -a hub4wsj_sc_8k hub4wsj_sc_8kadapt
    ./map_adapt \
    -meanfn hub4wsj_sc_8k/means \
    -varfn hub4wsj_sc_8k/variances \
    -mixwfn hub4wsj_sc_8k/mixture_weights \
    -tmatfn hub4wsj_sc_8k/transition_matrices \
    -accumdir . \
    -mapmeanfn hub4wsj_sc_8kadapt/means \
    -mapvarfn hub4wsj_sc_8kadapt/variances \
    -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \
    -maptmatfn hub4wsj_sc_8kadapt/transition_matrices

    Result
    INFO: cmd_ln.c(691): Parsing command line:
    ./map_adapt \ -meanfn hub4wsj_sc_8k/means \ -varfn hub4wsj_sc_8k/variances \ -mixwfn hub4wsj_sc_8k/mixture_weights \ -tmatfn hub4wsj_sc_8k/transition_matrices \ -accumdir . \ -mapmeanfn hub4wsj_sc_8kadapt/means \ -mapvarfn hub4wsj_sc_8kadapt/variances \ -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \ -maptmatfn hub4wsj_sc_8kadapt/transition_matrices

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -accumdir .,
    -bayesmean yes yes
    -example no no
    -fixedtau no no
    -help no no
    -mapmeanfn hub4wsj_sc_8kadapt/means
    -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights
    -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
    -mapvarfn hub4wsj_sc_8kadapt/variances
    -meanfn hub4wsj_sc_8k/means
    -mixwfn hub4wsj_sc_8k/mixture_weights
    -mwfloor 0.00001 1.000000e-05
    -tau 10.0 1.000000e+01
    -tmatfn hub4wsj_sc_8k/transition_matrices
    -tpfloor 0.0001 1.000000e-04
    -varfloor 0.00001 1.000000e-05
    -varfn hub4wsj_sc_8k/variances

    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
    INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
    INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
    INFO: main.c(425): Reading and accumulating observation counts from .
    INFO: s3gau_io.c(379): Read ./gauden_counts with means with vars [1x3x256 vector arrays]
    INFO: s3mixw_io.c(116): Read ./mixw_counts [5150x3x256 array]
    INFO: s3tmat_io.c(115): Read ./tmat_counts [50x3x4 array]
    INFO: main.c(77): Estimating tau hyperparameter from variances and observations
    INFO: main.c(139): Re-estimating mixture weights using MAP
    INFO: main.c(194): Re-estimating transition probabilities using MAP
    INFO: main.c(496): Re-estimating means using Bayesian interpolation
    INFO: main.c(500): Interpolating tau hyperparameter for semi-continuous models
    INFO: main.c(502): Re-estimating variances using MAP
    INFO: s3gau_io.c(226): Wrote hub4wsj_sc_8kadapt/means [1x3x256 array]
    INFO: s3gau_io.c(226): Wrote hub4wsj_sc_8kadapt/variances [1x3x256 array]
    INFO: s3mixw_io.c(232): Wrote hub4wsj_sc_8kadapt/mixture_weights [5150x3x256 array]
    INFO: s3tmat_io.c(174): Wrote hub4wsj_sc_8kadapt/transition_matrices [50x3x4 array]

    -> Recreating the adapted sendump file

    ./mk_s2sendump \
    -pocketsphinx yes \
    -moddeffn hub4wsj_sc_8kadapt/mdef.txt \
    -mixwfn hub4wsj_sc_8kadapt/mixture_weights \
    -sendumpfn hub4wsj_sc_8kadapt/sendump

    Result
    INFO: cmd_ln.c(691): Parsing command line:
    ./mk_s2sendump \ -pocketsphinx yes \ -moddeffn hub4wsj_sc_8kadapt/mdef.txt \ -mixwfn hub4wsj_sc_8kadapt/mixture_weights \ -sendumpfn hub4wsj_sc_8kadapt/sendump

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -example no no
    -help no no
    -mixwfn hub4wsj_sc_8kadapt/mixture_weights
    -moddeffn hub4wsj_sc_8kadapt/mdef.txt
    -mwfloor 0.00001 1.000000e-05
    -pocketsphinx no yes
    -sendumpfn hub4wsj_sc_8kadapt/sendump

    INFO: model_def_io.c(573): Model definition info:
    INFO: model_def_io.c(574): 143097 total models defined (50 base, 143047 tri)
    INFO: model_def_io.c(575): 572388 total states
    INFO: model_def_io.c(576): 5150 total tied states
    INFO: model_def_io.c(577): 150 total tied CI states
    INFO: model_def_io.c(578): 50 total tied transition matrices
    INFO: model_def_io.c(579): 4 max state/model
    INFO: model_def_io.c(580): 4 min state/model
    INFO: senone.c(210): Reading senone mixture weights: hub4wsj_sc_8kadapt/mixture_weights
    INFO: senone.c(330): Read mixture weights for 5150 senones: 3 features x 256 codewords
    INFO: mk_s2sendump.c(207): Writing PocketSphinx format sendump file: hub4wsj_sc_8kadapt/sendump

    After doing all of this, I copied the new acoustic model, the new language model, and the files needed for testing to a new folder named Testing Adaptation. I ran the following command:

    pocketsphinx_batch \
    -adcin yes \
    -cepdir wav \
    -cepext .wav \
    -ctl adaptation-test.fileids \
    -lm <your.lm> \
    -dict <your.dic, for example arctic.dic> \
    -hmm <your_new_adapted_model, for example hub4wsj_sc_8kadapt> \
    -hyp adaptation-test.hyp

    Result
    INFO: cmd_ln.c(691): Parsing command line:
    pocketsphinx_batch \ -adcin yes \ -cepdir wav \ -cepext .wav \ -ctl adaptation-test.fileids \ -lm corpus.lm \ -dict arctic20.dic \ -hmm hub4wsj_sc_8kadapt \ -hyp adaptation-test.hyp

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -adchdr 0 0
    -adcin no yes
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -argfile
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-48
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+00
    -bghist no no
    -build_outdirs yes yes
    -cepdir wav
    -cepext .mfc .wav
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -compallsen no no
    -ctl adaptation-test.fileids
    -ctlcount -1 -1
    -ctlincr 1 1
    -ctloffset 0 0
    -ctm
    -debug 0
    -dict arctic20.dic
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg
    -fsgctl
    -fsgdir
    -fsgext
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm hub4wsj_sc_8kadapt
    -hyp adaptation-test.hyp
    -hypseg
    -input_endian little little
    -jsgf
    -kdmaxbbi -1 -1
    -kdmaxdepth 0 0
    -kdtree
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lextreedump 0 0
    -lifter 0 0
    -lm corpus.lm
    -lmctl
    -lmname default default
    -lmnamectl
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -lpbeam 1e-40 1.000000e-40
    -lponlybeam 7e-29 7.000000e-29
    -lw 6.5 6.500000e+00
    -maxhmmpf -1 -1
    -maxnewoov 20 20
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mllrctl
    -mllrdir
    -mllrext
    -mmap yes yes
    -nbest 0 0
    -nbestdir
    -nbestext .hyp .hyp
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -nwpen 1.0 1.000000e+00
    -outlatbeam 1e-5 1.000000e-05
    -outlatdir
    -outlatext .lat .lat
    -outlatfmt s3 s3
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-5 1.000000e-05
    -pl_window 0 0
    -rawlogdir
    -remove_dc no no
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump
    -senin no no
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -usewdphones no no
    -uw 1.0 1.000000e+00
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-29
    -wip 0.65 6.500000e-01
    -wlen 0.025625 2.562500e-02

    INFO: cmd_ln.c(691): Parsing command line:
    \ -nfilt 20 \ -lowerf 1 \ -upperf 4000 \ -wlen 0.025 \ -transform dct \ -round_filters no \ -remove_dc yes \ -svspec 0-12/13-25/26-38 \ -feat 1s_c_d_dd \ -agc none \ -cmn current \ -cmninit 56,-3,1 \ -varnorm no

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 56,-3,1
    -dither no no
    -doublebw no no
    -feat 1s_c_d_dd 1s_c_d_dd
    -frate 100 100
    -input_endian little little
    -lda
    -ldadim 0 0
    -lifter 0 0
    -logspec no no
    -lowerf 133.33334 1.000000e+00
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 20
    -remove_dc no yes
    -round_filters yes no
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -smoothspec no no
    -svspec 0-12/13-25/26-38
    -transform legacy dct
    -unit_area yes yes
    -upperf 6855.4976 4.000000e+03
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wlen 0.025625 2.500000e-02

    INFO: acmod.c(246): Parsed model-specific feature parameters from hub4wsj_sc_8kadapt/feat.params
    INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: acmod.c(167): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(517): Reading model definition: hub4wsj_sc_8kadapt/mdef
    INFO: mdef.c(528): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(336): Reading binary model definition: hub4wsj_sc_8kadapt/mdef
    INFO: bin_mdef.c(513): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
    INFO: tmat.c(205): Reading HMM transition probability matrices: hub4wsj_sc_8kadapt/transition_matrices
    INFO: acmod.c(121): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: hub4wsj_sc_8kadapt/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: hub4wsj_sc_8kadapt/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(903): Loading senones from dump file hub4wsj_sc_8kadapt/sendump
    INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(990): Rows: 256, Columns: 5150
    INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: dict.c(317): Allocating 4297 * 20 bytes (83 KiB) for word entries
    INFO: dict.c(332): Reading main dictionary: arctic20.dic
    INFO: dict.c(211): Allocated 1 KiB for strings, 1 KiB for phones
    INFO: dict.c(335): 189 words read
    INFO: dict.c(341): Reading filler dictionary: hub4wsj_sc_8kadapt/noisedict
    INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(344): 11 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
    INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(477): ngrams 1=16, 2=14, 3=14
    INFO: ngram_model_arpa.c(135): Reading unigrams
    INFO: ngram_model_arpa.c(516): 16 = #unigrams created
    INFO: ngram_model_arpa.c(195): Reading bigrams
    INFO: ngram_model_arpa.c(533): 14 = #bigrams created
    INFO: ngram_model_arpa.c(534): 2 = #prob2 entries
    INFO: ngram_model_arpa.c(542): 3 = #bo_wt2 entries
    INFO: ngram_model_arpa.c(292): Reading trigrams
    INFO: ngram_model_arpa.c(555): 14 = #trigrams created
    INFO: ngram_model_arpa.c(556): 2 = #prob3 entries
    INFO: ngram_search_fwdtree.c(99): 125 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 15 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 15 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 128
    ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary
    INFO: ngram_search_fwdtree.c(338): after: 0 root, 0 non-root channels, 11 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    INFO: cmn.c(175): CMN: 14.69 7.46 0.27 -1.64 -0.73 -1.39 -0.95 -1.66 -0.28 0.04 -0.31 -1.17 -0.39
    INFO: ngram_search_fwdtree.c(1549): 3018 words recognized (10/fr)
    INFO: ngram_search_fwdtree.c(1551): 8163 senones evaluated (27/fr)
    INFO: ngram_search_fwdtree.c(1553): 3133 channels searched (10/fr), 0 1st, 3133 last
    INFO: ngram_search_fwdtree.c(1557): 3133 words for which last channels evaluated (10/fr)
    INFO: ngram_search_fwdtree.c(1560): 0 candidate words for entering last phone (0/fr)
    INFO: ngram_search_fwdtree.c(1562): fwdtree 0.04 CPU 0.012 xRT
    INFO: ngram_search_fwdtree.c(1565): fwdtree 0.04 wall 0.014 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 1 words
    INFO: ngram_search_fwdflat.c(937): 308 words recognized (1/fr)
    INFO: ngram_search_fwdflat.c(939): 918 senones evaluated (3/fr)
    INFO: ngram_search_fwdflat.c(941): 373 channels searched (1/fr)
    INFO: ngram_search_fwdflat.c(943): 373 words searched (1/fr)
    INFO: ngram_search_fwdflat.c(945): 26 word transitions (0/fr)
    INFO: ngram_search_fwdflat.c(948): fwdflat 0.01 CPU 0.003 xRT
    INFO: ngram_search_fwdflat.c(951): fwdflat 0.02 wall 0.005 xRT
    INFO: ngram_search.c(1214): </s> not found in last frame, using <sil>.305 instead
    INFO: ngram_search.c(1266): lattice start node <s>.0 end node <sil>.2
    INFO: ngram_search.c(1294): Eliminated 1 nodes before end node
    INFO: ngram_search.c(1399): Lattice has 3 nodes, 1 links
    INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(<sil>:2:305) = -2882525
    INFO: ps_lattice.c(1403): Joint P(O,S) = -2882525 P(S|O) = 0
    INFO: ngram_search.c(888): bestpath -0.00 CPU -0.000 xRT
    INFO: ngram_search.c(891): bestpath 0.00 wall 0.000 xRT
    INFO: batch.c(760): test1: 3.06 seconds speech, 0.04 seconds CPU, 0.06 seconds wall
    INFO: batch.c(762): test1: 0.01 xRT (CPU), 0.02 xRT (elapsed)
    INFO: cmn.c(175): CMN: 13.08 6.85 1.08 -1.26 -0.74 -1.25 -0.75 -1.07 -0.12 -0.31 0.13 -0.81 -0.57
    INFO: ngram_search_fwdtree.c(1549): 3035 words recognized (10/fr)
    INFO: ngram_search_fwdtree.c(1551): 8190 senones evaluated (27/fr)
    INFO: ngram_search_fwdtree.c(1553): 3163 channels searched (10/fr), 0 1st, 3163 last
    INFO: ngram_search_fwdtree.c(1557): 3163 words for which last channels evaluated (10/fr)
    INFO: ngram_search_fwdtree.c(1560): 0 candidate words for entering last phone (0/fr)
    INFO: ngram_search_fwdtree.c(1562): fwdtree 0.02 CPU 0.007 xRT
    INFO: ngram_search_fwdtree.c(1565): fwdtree 0.03 wall 0.009 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 1 words
    INFO: ngram_search_fwdflat.c(937): 308 words recognized (1/fr)
    INFO: ngram_search_fwdflat.c(939): 918 senones evaluated (3/fr)
    INFO: ngram_search_fwdflat.c(941): 369 channels searched (1/fr)
    INFO: ngram_search_fwdflat.c(943): 369 words searched (1/fr)
    INFO: ngram_search_fwdflat.c(945): 26 word transitions (0/fr)
    INFO: ngram_search_fwdflat.c(948): fwdflat 0.01 CPU 0.003 xRT
    INFO: ngram_search_fwdflat.c(951): fwdflat 0.01 wall 0.003 xRT
    INFO: ngram_search.c(1214): </s> not found in last frame, using <sil>.305 instead
    INFO: ngram_search.c(1266): lattice start node <s>.0 end node <sil>.242
    INFO: ngram_search.c(1294): Eliminated 0 nodes before end node
    INFO: ngram_search.c(1399): Lattice has 3 nodes, 2 links
    INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(<sil>:242:305) = -2856771
    INFO: ps_lattice.c(1403): Joint P(O,S) = -2856771 P(S|O) = 0
    INFO: ngram_search.c(888): bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(891): bestpath 0.00 wall 0.000 xRT
    INFO: batch.c(760): test2: 3.06 seconds speech, 0.03 seconds CPU, 0.04 seconds wall
    INFO: batch.c(762): test2: 0.01 xRT (CPU), 0.01 xRT (elapsed)
    INFO: batch.c(774): TOTAL 6.12 seconds speech, 0.07 seconds CPU, 0.10 seconds wall
    INFO: batch.c(776): AVERAGE 0.01 xRT (CPU), 0.02 xRT (elapsed)
    INFO: ngram_search_fwdtree.c(430): TOTAL fwdtree 0.06 CPU 0.009 xRT
    INFO: ngram_search_fwdtree.c(433): TOTAL fwdtree 0.07 wall 0.012 xRT
    INFO: ngram_search_fwdflat.c(174): TOTAL fwdflat 0.02 CPU 0.003 xRT
    INFO: ngram_search_fwdflat.c(177): TOTAL fwdflat 0.02 wall 0.004 xRT
    INFO: ngram_search.c(317): TOTAL bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(320): TOTAL bestpath 0.00 wall 0.000 xRT

    And I get errors when running this command:
    word_align.pl adaptation-test.transcription adapation-test.hyp

    Result
    NYALAKAN LAMPU RUANG TAMU SATU (TEST1)
    (TEST1)
    Words: 5 Correct: 0 Errors: 5 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
    Insertions: 0 Deletions: 5 Substitutions: 0
    MATIKAN LAMPU RUANG TAMU SATU (TEST2)
    (TEST2)
    Words: 5 Correct: 0 Errors: 5 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
    Insertions: 0 Deletions: 5 Substitutions: 0
    TOTAL Words: 10 Correct: 0 Errors: 10
    TOTAL Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
    TOTAL Insertions: 0 Deletions: 10 Substitutions: 0

    Am I doing something wrong?
    Is my new language model wrong?
    I tried to adapt the default CMU Sphinx acoustic model to the Indonesian language. Can I do that, or do I have to build a new one?

    Thank you so much.

     
    • Nickolay V. Shmyrev

      You need to provide data files, not just the logs.

       

      Last edit: Nickolay V. Shmyrev 2014-06-04
  • Joko Susanto

    Joko Susanto - 2014-06-04

    I am sending you the adaptation files.

     
  • Joko Susanto

    Joko Susanto - 2014-06-04

    Testing Adaptation and LM
    Thanks

     
    • Nickolay V. Shmyrev

      Your corpus.lm and your dictionary don't match in case: the dictionary is uppercase and the LM is lowercase. This is what the error during decoding is telling you:

       ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary
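
      A quick way to spot this kind of mismatch is to compare the unigrams of the language model with the dictionary headwords. The following is only a minimal sketch, not part of PocketSphinx: the function names are hypothetical, it assumes corpus.lm is a plain-text ARPA file, and the phone strings in the sample dictionary are made up for illustration.

```python
import re


def lm_unigrams(arpa_text):
    """Collect the words listed in the \\1-grams: section of an ARPA LM."""
    words, in_unigrams = set(), False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line.startswith("\\"):
            # Section markers look like \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = (line == "\\1-grams:")
            continue
        if in_unigrams and line:
            fields = line.split()
            if len(fields) >= 2:
                words.add(fields[1])  # fields: logprob, word, [backoff]
    return words


def dict_words(dict_text):
    """Collect dictionary headwords, dropping (2)-style variant markers."""
    words = set()
    for line in dict_text.splitlines():
        fields = line.split()
        if fields:
            words.add(re.sub(r"\(\d+\)$", "", fields[0]))
    return words


def missing_from_dict(arpa_text, dict_text):
    """LM words with no exact (case-sensitive) dictionary entry."""
    sentence_markers = {"<s>", "</s>"}
    return sorted(lm_unigrams(arpa_text) - dict_words(dict_text) - sentence_markers)


def case_only_mismatches(arpa_text, dict_text):
    """Missing LM words that would match if case were ignored."""
    folded = {w.lower() for w in dict_words(dict_text)}
    return [w for w in missing_from_dict(arpa_text, dict_text) if w.lower() in folded]


# Illustrative data only (phones are invented, not from a real dictionary).
SAMPLE_LM = (
    "\\data\\\nngram 1=4\n\n\\1-grams:\n"
    "-0.5 <s> -0.3\n-0.5 </s>\n-0.7 lampu -0.3\n-0.7 satu -0.3\n\n\\end\\\n"
)
SAMPLE_DICT = "LAMPU L AH M P UW\nSATU S AH T UW\n"

if __name__ == "__main__":
    print(case_only_mismatches(SAMPLE_LM, SAMPLE_DICT))
```

      If every missing word shows up as a case-only mismatch, converting either the LM or the dictionary so that both use the same case should clear the error.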
      
       
  • Joko Susanto

    Joko Susanto - 2014-06-05

    It works now. After executing word_align.pl adaptation-test.transcription adapation-test.hyp, I got the following result.

    nyalakan lampu ruang tamu satu (TEST1)
    ANAK nyalakan lampu ruang tamu satu (TEST1)
    Words: 5 Correct: 5 Errors: 1 Percent correct = 100.00% Error = 20.00% Accuracy = 80.00%
    Insertions: 1 Deletions: 0 Substitutions: 0
    matikan lampu RUANG TAMU satu (TEST2)
    matikan lampu *** MATIKAN satu (TEST2)
    Words: 5 Correct: 3 Errors: 2 Percent correct = 60.00% Error = 40.00% Accuracy = 60.00%
    Insertions: 0 Deletions: 1 Substitutions: 1
    TOTAL Words: 10 Correct: 8 Errors: 3
    TOTAL Percent correct = 80.00% Error = 30.00% Accuracy = 70.00%
    TOTAL Insertions: 1 Deletions: 1 Substitutions: 1
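
    For reference, the percentages above are consistent with the usual word-error arithmetic. The sketch below is an assumption about how the printed numbers relate to each other, not the actual source of word_align.pl; the function name is hypothetical.

```python
def score(words, insertions, deletions, substitutions):
    """Derive summary figures from reference length and error counts."""
    correct = words - deletions - substitutions
    errors = insertions + deletions + substitutions
    percent_correct = 100.0 * correct / words
    percent_error = 100.0 * errors / words
    accuracy = 100.0 - percent_error
    return correct, errors, percent_correct, percent_error, accuracy


if __name__ == "__main__":
    # Counts taken from the TEST1, TEST2 and TOTAL lines above.
    for name, counts in [("TEST1", (5, 1, 0, 0)),
                         ("TEST2", (5, 0, 1, 1)),
                         ("TOTAL", (10, 1, 1, 1))]:
        print(name, score(*counts))
```

    Note that an insertion does not lower "Percent correct" but does lower "Accuracy", which is why TEST1 shows 100% correct and 80% accuracy at the same time.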

    Thanks for your help.

     
