Hello,
I followed the "Robust Group Tutorial" to get Sphinx 3 working, using the
following:
sphinx 3.0.8
sphinxbase 0.4.1
SphinxTrain 1.0
rm1
I am also using the VoxForge (English v0.1.2) acoustic models, and
"lm_giga_5k_nvp_3gram" language models & dictionaries.
Everything seemed to work fine, until I executed
$ /usr/src/sphinx/sphinx3/src/programs/sphinx3_livedecode
/usr/src/sphinx/cfgfile
to test it out. I received the below output.
Google was no help for this at all. What exactly is the viterbi history,
anyway? How should I fix this?
Thank you,
Daniel
INFO: cmd_ln.c(506): Parsing command line:
\
-samprate 16000 \
-hmm /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000 \
-dict /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic \
-fdict /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler \
-lm /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-backtrace yes yes
-beam 1.0e-55 1.000000e-55
-bestpath no no
-bestpathlw 0.000000e+00
-bestscoredir
-bestsenscrdir
-bghist no no
-bptbldir
-bptblsize 32768 32768
-cb2mllr .1cls. .1cls.
-cep2spec no no
-ceplen 13 13
-ci_pbeam 1e-80 1.000000e-80
-cmn current current
-cmninit 8.0 8.0
-cond_ds no no
-ctl
-ctlcount 1000000000 1000000000
-ctloffset 0 0
-ctl_lm
-ctl_mllr
-dagfudge 2 2
-dict /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-dist_ds no no
-dither no no
-doublebw no no
-ds 1 1
-epl 3 3
-fdict /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillpen
-fillprob 0.1 1.000000e-01
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-gs
-gs4gs yes yes
-hmm /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000
-hmmdump no no
-hmmdumpef 200000000 200000000
-hmmdumpsf 200000000 200000000
-hmmhistbinsize 5000 5000
-hyp
-hypseg
-hypsegscore_unscale yes yes
-inlatdir
-inlatwin 50 50
-input_endian little little
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latcompress yes yes
-latext lat.gz lat.gz
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP
-lmctlfn
-lmdumpdir
-lmname
-log3table yes yes
-logbase 1.0003 1.000300e+00
-logspec no no
-lowerf 133.33334 1.333333e+02
-lts_mismatch no no
-lw 9.5 9.500000e+00
-machine_endian little little
-maxcdsenpf 100000 100000
-maxedge 2000000 2000000
-maxhistpf 100 100
-maxhmmpf 20000 20000
-maxhyplen 1000 1000
-maxlmop 100000000 100000000
-maxlpf 40000 40000
-maxppath 1000000 1000000
-maxwpf 20 20
-mdef
-mean
-min_endfr 3 3
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mode fwdtree fwdtree
-nbest 200 200
-nbestdir
-nbestext nbest.gz nbest.gz
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-Nlextree 3 3
-Nstalextree 25 25
-op_mode -1 -1
-outlatdir
-outlatfmt s3 s3
-pbeam 1.0e-50 1.000000e-50
-pheurtype 0 0
-phonepen 1.0 1.000000e+00
-phypdump yes yes
-pl_beam 1.0e-80 1.000000e-80
-pl_window 1 1
-ppathdebug no no
-ptranskip 0 0
-rawext .raw .raw
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-senmgau .cont. .cont.
-silprob 0.1 1.000000e-01
-smoothspec no no
-spec2cep no no
-subvq
-subvqbeam 3.0e-3 3.000000e-03
-svq4svq no no
-svspec
-tighten_factor 0.5 5.000000e-01
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-tracewhmm
-transform legacy legacy
-treeugprob yes yes
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-uw 0.7 7.000000e-01
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-vqeval 3 3
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 1.0e-35 1.000000e-35
-wend_beam 1.0e-80 1.000000e-80
-wip 0.7 7.000000e-01
-wlen 0.025625 2.562500e-02
-worddumpef 200000000 200000000
-worddumpsf 200000000 200000000
INFO: kbcore.c(433): Begin Initialization of Core Models:
INFO: cmd_ln.c(506): Parsing command line:
\
-alpha 0.97 \
-dither yes \
-doublebw no \
-nfilt 40 \
-ncep 13 \
-lowerf 133.33334 \
-upperf 6855.4976 \
-nfft 512 \
-wlen 0.0256 \
-transform legacy \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-varnorm no
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-cep2spec no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no yes
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.333333e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-spec2cep no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-02
INFO: Initialization of the log add table
INFO: Log-Add table size = 29350 x 2 >> 0
INFO:
INFO: feat.c(849): Initializing feature stream to type: '1s_c_d_dd',
ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: kbcore.c(480): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Number of streams = 1
INFO: Vector size of stream: 39
INFO: Number of subvectors = 0
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading Feature Space Transform from: /usr/src/sphinx/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/feature_transform
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File: /usr/src/sphinx/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/mdef
INFO: Mean File: /usr/src/sphinx/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/means
INFO: Variance File: /usr/src/sphinx/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/variances
INFO: Mixture Weight File: /usr/src/sphinx/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/mixture_weights
INFO: Transition Matrices File: /usr/src/sphinx/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/transition_matrices
INFO: mdef.c(682): Reading model definition: /usr/src/sphinx/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/mdef
INFO: Initialization of mdef_t, report:
INFO: 40 CI-phone, 100516 CD-phone, 3 emitstate/phone, 120 CI-sen, 3120 Sen,
18846 Sen-Seq
INFO:
INFO: kbcore.c(288): Using optimized GMM computation for Continuous HMM, -topn
will be ignored
INFO: cont_mgau.c(163): Reading mixture gaussian file '/usr/src/sphinx
/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/means'
INFO: cont_mgau.c(422): 3120 mixture Gaussians, 16 components, 1 streams,
veclen 29
INFO: cont_mgau.c(163): Reading mixture gaussian file '/usr/src/sphinx
/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/variances'
INFO: cont_mgau.c(422): 3120 mixture Gaussians, 16 components, 1 streams,
veclen 29
INFO: cont_mgau.c(510): Reading mixture weights file '/usr/src/sphinx
/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/mixture_weights'
INFO: cont_mgau.c(665): Read 3120 x 16 mixture weights
INFO: cont_mgau.c(693): Removing uninitialized Gaussian densities
INFO: cont_mgau.c(783): Applying variance floor
INFO: cont_mgau.c(801): 63 variance values floored
INFO: cont_mgau.c(849): Precomputing Mahalanobis distance invariants
INFO: tmat.c(169): Reading HMM transition probability matrices:
/usr/src/sphinx/voxforge-
en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/transition_matrices
INFO: Initialization of tmat_t, report:
INFO: Read 40 transition matrices of size 3x4
INFO:
INFO: dict.c(475): Reading main dictionary:
/usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
INFO: dict.c(478): 5900 words read
INFO: dict.c(483): Reading filler dictionary:
/usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
INFO: dict.c(486): 3 words read
INFO: Initialization of dict_t, report:
INFO: No of CI phone: 0
INFO: Max word: 9999
INFO: No of word: 5903
INFO:
INFO: lm.c(606): LM
read('/usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP', lw=
9.50, wip= 0.70, uw= 0.70)
INFO: lm.c(608): Reading LM file
/usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP (LM name
"default")
INFO: lm_3g_dmp.c(630): Reading LM in 16 bits format
INFO: lm_3g_dmp.c(686): Read 5000 unigrams
INFO: lm_3g_dmp.c(759): 2821547 bigrams
INFO: lm_3g_dmp.c(832): 8095821 bigrams
INFO: lm_3g_dmp.c(902): 45171 bigram prob entries
INFO: lm_3g_dmp.c(936): 16932 trigram bowt entries
INFO: lm_3g_dmp.c(967): 48498 trigram prob entries
INFO: lm_3g_dmp.c(998): 5511 trigram segtable entries (512 segsize)
INFO: lm_3g_dmp.c(1053): 5000 word strings
INFO: lm.c(691): The LM routine is operating at 16 bits mode
ERROR: "wid.c", line 282: <unk> is not a word in dictionary and it is not a
class tag.
INFO: wid.c(292): 1 LM words not in dictionary; ignored
INFO: Initialization of fillpen_t, report:
INFO: Language weight =9.500000
INFO: Word Insertion Penalty =0.700000
INFO: Silence probability =0.100000
INFO: Filler probability =0.100000
INFO:
INFO: dict2pid.c(599): Building PID tables for dictionary
INFO: Initialization of dict2pid_t, report:
INFO: Dict2pid is in composite triphone mode
INFO: 2212 composite states; 1010 composite sseq
INFO:
INFO: kbcore.c(632): Inside kbcore: Verifying models consistency ......
INFO: kbcore.c(654): End of Initialization of Core Models:
INFO: Initialization of beam_t, report:
INFO: Parameters used in Beam Pruning of Viterbi Search:
INFO: Beam=-422133
INFO: PBeam=-383758
INFO: WBeam=-268630 (Skip=0)
INFO: WEndBeam=-614012
INFO: No of CI Phone assumed=40
INFO:
INFO: Initialization of fast_gmm_t, report:
INFO: Parameters used in Fast GMM computation:
INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
Distance-based Down Sampling? 0
INFO: GMM-level: CI phone beam -614012. MAX CD 100000
INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
would be used as Gaussian Score? =0 SubVQ Beam -19363
INFO:
INFO: Initialization of pl_t, report:
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme look-ahead type = 0
INFO: Phoneme look-ahead beam size = 65945
INFO: No of CI Phones assumed=40
INFO:
INFO: Initialization of ascr_t, report:
INFO: No. of CI senone =120
INFO: No. of senone = 3120
INFO: No. of composite senone = 2212
INFO: No. of senone sequence = 18846
INFO: No. of composite senone sequence=1010
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme lookahead window = 1
INFO:
INFO: kb.c(306): SEARCH MODE INDEX 4
INFO: srch.c(373): Search Initialization.
WARNING: "srch_time_switch_tree.c", line 283: -Nstalextree is omitted in TST
search.
INFO: lextree.c(222): Creating Unigram Table for lm (name: default)
INFO: lextree.c(235): Size of word table after unigram + words in class: 4997.
INFO: lextree.c(244): Size of word table after adding alternative prons: 5900.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 34
INFO: Number of node 23494
INFO: Number of links in the tree 281032
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 168
INFO: The size of a gnode_t 16
INFO:
INFO: srch_time_switch_tree.c(343): Lextrees (0) for lm 0, its name is
default, it has 23494 nodes(ug)
INFO: lextree.c(222): Creating Unigram Table for lm (name: default)
INFO: lextree.c(235): Size of word table after unigram + words in class: 4997.
INFO: lextree.c(244): Size of word table after adding alternative prons: 5900.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 34
INFO: Number of node 23494
INFO: Number of links in the tree 281032
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 168
INFO: The size of a gnode_t 16
INFO:
INFO: srch_time_switch_tree.c(343): Lextrees (1) for lm 0, its name is
default, it has 23494 nodes(ug)
INFO: lextree.c(222): Creating Unigram Table for lm (name: default)
INFO: lextree.c(235): Size of word table after unigram + words in class: 4997.
INFO: lextree.c(244): Size of word table after adding alternative prons: 5900.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 34
INFO: Number of node 23494
INFO: Number of links in the tree 281032
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 168
INFO: The size of a gnode_t 16
INFO:
INFO: srch_time_switch_tree.c(343): Lextrees (2) for lm 0, its name is
default, it has 23494 nodes(ug)
INFO: srch_time_switch_tree.c(350): Time for building trees, 0.0600 CPU 0.0609
Clk
INFO: srch_time_switch_tree.c(372): Lextrees(0), 1 nodes(filler)
INFO: srch_time_switch_tree.c(372): Lextrees(1), 1 nodes(filler)
INFO: srch_time_switch_tree.c(372): Lextrees(2), 1 nodes(filler)
INFO: vithist.c(168): Initializing Viterbi-history module
INFO: Initialization of srch_t, report:
INFO: Operation Mode = 4, Operation Name = fwdtree
INFO:
INFO: s3_decode.c(259): Input data will NOT be byte swapped
INFO: s3_decode.c(264): Partial hypothesis WILL be dumped
INFO: fe_interface.c(287): You are using the internal mechanism to generate
the seed.
press ENTER to start recording
press ENTER to finish recording
Warning: Could not find Mic element
WARNING: "srch_time_switch_tree.c", line 1340: Failed to retrieve viterbi
history.
WARNING: "s3_decode.c", line 536: Failed to retrieve viterbi history.
Cannot retrieve hypothesis.
It tells you it can't find the microphone.
It's also recommended to use pocketsphinx instead of sphinx3.
It can. I looked through the source code: it first tries to find the "Mic"
simple mixer element. If it can't, it warns you and then tries to find the
"Record" simple mixer element, which succeeds in my case because my sound card
uses that name. If it failed to find "Record", it would tell you that too, but
it doesn't for me.
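For anyone following along, the lookup order described above can be sketched roughly like this. This is only an illustration of the fallback logic, not the actual sphinxbase code; the function name `pick_capture_element` and its arguments are made up for the example, and only the "Mic" warning text is taken from the log.

```python
def pick_capture_element(available, preferred=("Mic", "Record")):
    """Pick the first preferred mixer element that exists.

    Hypothetical helper: `available` is the set of simple mixer element
    names the sound card exposes. Returns (chosen_name_or_None, warnings).
    """
    warnings = []
    for name in preferred:
        if name in available:
            # Found a usable capture element; earlier misses stay as warnings.
            return name, warnings
        warnings.append("Warning: Could not find %s element" % name)
    # Neither element exists, so there is nothing to record from.
    return None, warnings


if __name__ == "__main__":
    # A card that names its capture control "Record" instead of "Mic".
    chosen, warnings = pick_capture_element({"Master", "PCM", "Record"})
    for line in warnings:
        print(line)
    print("Using element:", chosen)
```

So a single "Could not find Mic element" warning on its own does not mean the lookup failed; it only means the first name was not present.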
Yes, I plan on using pocketsphinx if sphinx3 is too taxing on the processor.
I'm using sphinx3 right now because the Robust Group tutorial recommends it
for this test. I do not think this error is caused by this.
Thanks for replying.
Can anyone help me with the viterbi history error?
It's not just about the mic element. It tells you it can't get sound from the
soundcard, and that's why the viterbi history is empty. Perhaps sound input is
blocked by pulseaudio, perhaps it's your soundcard, perhaps something else.
You can check the out.raw file, which holds the results of audio capture. This
file is created in the working directory of sphinx3_livedecode.
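A quick way to check whether out.raw actually contains audio is a small script like the one below. It is a sketch under assumptions: it treats the file as headerless 16-bit signed little-endian mono PCM at 16000 Hz (consistent with -samprate 16000 and -input_endian little in the log, but the sample width and channel count are assumed), and the function names are just for this example.

```python
import struct
import sys
from pathlib import Path

SAMPLE_RATE = 16000   # Hz, matches -samprate in the log
BYTES_PER_SAMPLE = 2  # assumption: 16-bit signed PCM, mono


def summarize_raw(path, sample_rate=SAMPLE_RATE):
    """Return size, sample count, duration and peak level of a raw PCM file."""
    data = Path(path).read_bytes()
    n_samples = len(data) // BYTES_PER_SAMPLE
    # "<h" = little-endian signed 16-bit; a trailing odd byte is ignored.
    samples = struct.unpack("<%dh" % n_samples,
                            data[:n_samples * BYTES_PER_SAMPLE])
    peak = max((abs(s) for s in samples), default=0)
    return {
        "bytes": len(data),
        "samples": n_samples,
        "seconds": n_samples / sample_rate,
        "peak": peak,
    }


def diagnose(stats):
    """Turn the statistics into a one-line verdict."""
    if stats["samples"] == 0:
        return "empty: nothing was captured"
    if stats["peak"] == 0:
        return "silent: samples were captured, but all are zero"
    return "ok: audio was captured"


if __name__ == "__main__":
    # Usage: python check_raw.py out.raw
    stats = summarize_raw(sys.argv[1] if len(sys.argv) > 1 else "out.raw")
    print(stats)
    print(diagnose(stats))
```

An "empty" result points at the capture path (mixer, ALSA, PulseAudio) rather than at the models or the decoder settings.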
The reason to use pocketsphinx is not speed; it is simply what we recommend.
You can find more information on the version comparison page:
http://cmusphinx.sourceforge.net/wordpress/versions/
Yup, you're right. I was digging deeper in the code and found that having no
samples will cause that. I also noticed that I kept getting an empty out.raw
file, as you mentioned. (I was able to find people whose live decode did work,
even though they got that warning.)
Actually, I've been fussing with my sound for a couple days now.
I would like to strangle whoever made ALSA and PulseAudio!
I'll be using pocket then. :)
Thanks for your replies.