Carl Fisher - 2008-06-13

Per Nickolay's recommendation in this thread (http://sourceforge.net/forum/message.php?msg_id=4948563) we are trying Sphinx3 in search of better accuracy.

I downloaded sphinxbase and the latest sphinx3 source and built them, set up my files, and entered the same basic command-line args that Nickolay used, but I'm getting an error complaining about not being able to read the LM file (see below). Does Sphinx3 need a different format of LM file, or am I making a stupid mistake (likely)?

Please show me the error of my ways!

Thanks,
Carl


T:\CARI\Transcribe\Data\Transcriptions\Trial5>sphinx3_decode -adcin yes -cepext .wav -cepdir . -ctl test.ctl -dict quex.dic -fdict fillerdict -remove_dc no -lm quex.lm -hmm C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd
INFO: c:\projects\transcribe\sphinxbase\src\libsphinxutil\info.c(70): sphinx3_decode

Compiled on: Jun 13 2008, AT: 10:12:17

INFO: c:\projects\transcribe\sphinxbase\src\libsphinxutil\cmd_ln.c(430): Parsing command line:
sphinx3_decode \ -adcin yes \ -cepext .wav \ -cepdir . \ -ctl test.ctl \ -dict quex.dic \ -fdict fillerdict \ -remove_dc no \ -lm quex.lm \ -hmm C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd

Current configuration:
[NAME] [DEFLT] [VALUE]
-adchdr 0 0
-adcin no yes
-agc none none
-alpha 0.97 9.700000e-001
-backtrace yes yes
-beam 1.0e-55 1.000000e-055
-bestpath no no
-bestpathlw 0.000000e+000
-bestscoredir
-bestsenscrdir
-bghist no no
-bptbldir
-bptblsize 32768 32768
-cb2mllr .1cls. .1cls.
-cep2spec no no
-cepdir .
-cepext .mfc .wav
-ceplen 13 13
-ci_pbeam 1e-80 1.000000e-080
-cmn current current
-cond_ds no no
-ctl test.ctl
-ctlcount 1000000000 1000000000
-ctloffset 0 0
-ctl_lm
-ctl_mllr
-dagfudge 2 2
-dict quex.dic
-dist_ds no no
-dither no no
-doublebw no no
-ds 1 1
-epl 3 3
-fbtype mel_scale mel_scale
-fdict fillerdict
-feat 1s_c_d_dd 1s_c_d_dd
-fillpen
-fillprob 0.1 1.000000e-001
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-gs
-gs4gs yes yes
-hmm C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd
-hmmdump no no
-hmmdumpef 200000000 200000000
-hmmdumpsf 200000000 200000000
-hmmhistbinsize 5000 5000
-hyp
-hypseg
-hypsegscore_unscale yes yes
-inlatdir
-inlatwin 50 50
-input_endian little little
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latcompress yes yes
-latext lat.gz lat.gz
-lda
-ldadim 29 29
-lextreedump 0 0
-lifter 0 0
-lm quex.lm
-lmctlfn
-lmdumpdir
-lminmemory no no
-lmname
-log3table yes yes
-logbase 1.0003 1.000300e+000
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+002
-lts_mismatch no no
-lw 9.5 9.500000e+000
-maxcdsenpf 100000 100000
-maxedge 2000000 2000000
-maxhistpf 100 100
-maxhmmpf 20000 20000
-maxlmop 100000000 100000000
-maxlpf 40000 40000
-maxppath 1000000 1000000
-maxwpf 20 20
-mdef
-mean
-min_endfr 3 3
-mixw
-mixwfloor 0.0000001 1.000000e-007
-mllr
-mode fwdtree fwdtree
-nbest 200 200
-nbestdir
-nbestext nbest.gz nbest.gz
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-Nlextree 3 3
-Nstalextree 25 25
-op_mode -1 -1
-outlatdir
-outlatfmt s3 s3
-pbeam 1.0e-50 1.000000e-050
-pheurtype 0 0
-phonepen 1.0 1.000000e+000
-pl_beam 1.0e-80 1.000000e-080
-pl_window 1 1
-ppathdebug no no
-ptranskip 0 0
-remove_dc no no
-round_filters yes yes
-samprate 16000.0 1.600000e+004
-seed -1 -1
-senmgau .cont. .cont.
-silprob 0.1 1.000000e-001
-smoothspec no no
-spec2cep no no
-subvq
-subvqbeam 3.0e-3 3.000000e-003
-svq4svq no no
-tighten_factor 0.5 5.000000e-001
-tmat
-tmatfloor 0.0001 1.000000e-004
-topn 4 4
-tracewhmm
-transform legacy legacy
-treeugprob yes yes
-unit_area yes yes
-upperf 6855.4976 6.855498e+003
-utt
-uw 0.7 7.000000e-001
-var
-varfloor 0.0001 1.000000e-004
-varnorm no no
-verbose no no
-vqeval 3 3
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 1.0e-35 1.000000e-035
-wend_beam 1.0e-80 1.000000e-080
-wip 0.7 7.000000e-001
-wlen 0.025625 2.562500e-002
-worddumpef 200000000 200000000
-worddumpsf 200000000 200000000

INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libsearch\kbcore.c(404): Begin Initialization of Core Models:
INFO: c:\projects\transcribe\sphinxbase\src\libsphinxutil\cmd_ln.c(563): Cannot open configuration file C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/feat.params for reading
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libcommon\logs3.c(151): Initializing logbase: 1.000300e+000 (add table: 1)
INFO: Initialization of the log add table
INFO: Log-Add table size = 29350
INFO:
INFO: c:\projects\transcribe\sphinxbase\src\libsphinxfeat\feat.c(835): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: c:\projects\transcribe\sphinxbase\src\libsphinxfeat\cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libsearch\kbcore.c(446): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Cepstral size Used = 13
INFO: Number of stream = 1
INFO: Vector size of stream[0]: 39
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File: C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/mdef
INFO: Mean File: C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/means
INFO: Variance File: C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/variances
INFO: Mixture Weight File: C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/mixture_weights
INFO: Transition Matrices File: C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/transition_matrices
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\mdef.c(679): Reading model definition: C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/mdef
INFO: Initialization of mdef_t, report:
INFO: 48 CI-phone, 133500 CD-phone, 3 emitstate/phone, 144 CI-sen, 6144 Sen, 32639 Sen-Seq
INFO:
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libsearch\kbcore.c(282): Using optimized GMM computation for Continuous HMM, -topn will be ignored
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(161): Reading mixture gaussian file 'C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/means'
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(417): 6144 mixture Gaussians, 8 components, 1 streams, veclen 39
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(161): Reading mixture gaussian file 'C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/variances'
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(417): 6144 mixture Gaussians, 8 components, 1 streams, veclen 39
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(505): Reading mixture weights file 'C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/mixture_weights'
ERROR: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c",line 645: Weight normalization failed for 3 senones
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(657): Read 6144 x 8 mixture weights
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(685): Removing uninitialized Gaussian densities 6 7 8
WARNING: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c", line 760: 24 densities removed (3 mixtures removed entirely)
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(776): Applying variance floor
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(794): 0 variance values floored
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\cont_mgau.c(842): Precomputing Mahalanobis distance invariants
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\tmat.c(167): Reading HMM transition probability matrices: C:\Projects\Transcribe\sphinx3-0.7\model\hmm\hub4_cd_continuous_8gau_1s_c_d_dd/transition_matrices
WARNING: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\tmat.c", line 239: Normalization failed for tmat 2 from state 0
WARNING: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\tmat.c", line 239: Normalization failed for tmat 2 from state 1
WARNING: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libam\tmat.c", line 239: Normalization failed for tmat 2 from state 2
INFO: Initialization of tmat_t, report:
INFO: Read 48 transition matrices of size 3x4
INFO:
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libdict\dict.c(471): Reading main dictionary: quex.dic
ERROR: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libdict\dict.c", line 268: Line 96: dict_add_word (FOR(2)) failed (duplicate?); ignored
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libdict\dict.c(474): 363 words read
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libdict\dict.c(479): Reading filler dictionary: fillerdict
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\libdict\dict.c(482): 3 words read
INFO: Initialization of dict_t, report:
INFO: No of CI phone: 0
INFO: Max word: 4463
INFO: No of word: 366
INFO:
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\liblm\lm.c(593): LM read('quex.lm', lw= 9.50, wip= 0.70, uw= 0.70)
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\liblm\lm.c(595): Reading LM file quex.lm (LM name "default")
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\liblm\lm_3g_dmp.c(469): Bad magic number: 1735287116(676e614c), not an LM dumpfile??
ERROR: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\liblm\lm_3g_dmp.c", line 1267: Error in reading the header of the DUMP file.
INFO: c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\liblm\lm.c(603): In lm_read, LM is not a DMP file. Trying to read it as a txt file
ERROR: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\liblm\lm.c", line 608: LM is not a dump file, so it is assumed to be a text file. However, disk-based LM is not working for -lminmemory=0 at this point (i.e. LM has to be loaded into the memory).
FATAL_ERROR: "c:\projects\transcribe\sphinx3-0.7\src\libs3decoder\liblm\lmset.c", line 292: lm_read_advance(quex.lm, 9.500000e+000, 7.000000e-001, 7.000000e-001 366 [Arbitrary Fmt], Weighted Apply) failed
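
A side note on the "Bad magic number" line in the log: the reported value looks like ASCII text rather than a binary header. Below is a minimal Python sketch (my own check, not part of the Sphinx3 tools; `decode_magic` is a name made up for this example) that unpacks the reported integer back into its four bytes, assuming the decoder read them as a little-endian 32-bit value on this Windows build:

```python
import struct


def decode_magic(value: int) -> bytes:
    """Return the four bytes that a little-endian 32-bit read turned into `value`.

    Assumption: the decoder ran on a little-endian machine (x86 Windows here).
    """
    return struct.pack("<I", value)


# Value reported by lm_3g_dmp.c in the log: 1735287116 (0x676e614c)
magic = 1735287116
raw = decode_magic(magic)
print(hex(magic), raw)  # prints: 0x676e614c b'Lang'
```

The bytes spell `Lang`, which suggests quex.lm begins with readable text and is therefore a plain-text LM rather than a binary DMP file. That agrees with the later log message about trying to read it as a txt file. What the rest of that first line says is a guess without opening the file.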