Failed to retrieve viterbi history.

Daniel
2009-12-26
2012-09-22
  • Daniel

    Daniel - 2009-12-26

    Hello,

    I followed the "Robust Group Tutorial" to get Sphinx 3 working, using the
    following:

    sphinx 3.0.8
    sphinxbase 0.4.1
    SphinxTrain 1.0
    rm1

    I am also using the VoxForge (English v0.1.2) acoustic models, and
    "lm_giga_5k_nvp_3gram" language models & dictionaries.

    Everything seemed to work fine, until I executed
    $ /usr/src/sphinx/sphinx3/src/programs/sphinx3_livedecode /usr/src/sphinx/cfgfile
    to test it out. I received the output below.

    Google was no help for this at all. What exactly is the viterbi history,
    anyway? How should I fix this?

    Thank you,
    Daniel

    INFO: cmd_ln.c(506): Parsing command line:
    \
    -samprate 16000 \
    -hmm /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000 \
    -dict /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic \
    -fdict /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler \
    -lm /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP

    Current configuration:

    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -backtrace yes yes
    -beam 1.0e-55 1.000000e-55
    -bestpath no no
    -bestpathlw 0.000000e+00
    -bestscoredir
    -bestsenscrdir
    -bghist no no
    -bptbldir
    -bptblsize 32768 32768
    -cb2mllr .1cls. .1cls.
    -cep2spec no no
    -ceplen 13 13
    -ci_pbeam 1e-80 1.000000e-80
    -cmn current current
    -cmninit 8.0 8.0
    -cond_ds no no
    -ctl
    -ctlcount 1000000000 1000000000
    -ctloffset 0 0
    -ctl_lm
    -ctl_mllr
    -dagfudge 2 2
    -dict /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
    -dist_ds no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -epl 3 3
    -fdict /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillpen
    -fillprob 0.1 1.000000e-01
    -frate 100 100
    -fsg
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -gs
    -gs4gs yes yes
    -hmm /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000
    -hmmdump no no
    -hmmdumpef 200000000 200000000
    -hmmdumpsf 200000000 200000000
    -hmmhistbinsize 5000 5000
    -hyp
    -hypseg
    -hypsegscore_unscale yes yes
    -inlatdir
    -inlatwin 50 50
    -input_endian little little
    -kdmaxbbi -1 -1
    -kdmaxdepth 0 0
    -kdtree
    -latcompress yes yes
    -latext lat.gz lat.gz
    -lda
    -ldadim 0 0
    -lextreedump 0 0
    -lifter 0 0
    -lm /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP
    -lmctlfn
    -lmdumpdir
    -lmname
    -log3table yes yes
    -logbase 1.0003 1.000300e+00
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -lts_mismatch no no
    -lw 9.5 9.500000e+00
    -machine_endian little little
    -maxcdsenpf 100000 100000
    -maxedge 2000000 2000000
    -maxhistpf 100 100
    -maxhmmpf 20000 20000
    -maxhyplen 1000 1000
    -maxlmop 100000000 100000000
    -maxlpf 40000 40000
    -maxppath 1000000 1000000
    -maxwpf 20 20
    -mdef
    -mean
    -min_endfr 3 3
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mode fwdtree fwdtree
    -nbest 200 200
    -nbestdir
    -nbestext nbest.gz nbest.gz
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -Nlextree 3 3
    -Nstalextree 25 25
    -op_mode -1 -1
    -outlatdir
    -outlatfmt s3 s3
    -pbeam 1.0e-50 1.000000e-50
    -pheurtype 0 0
    -phonepen 1.0 1.000000e+00
    -phypdump yes yes
    -pl_beam 1.0e-80 1.000000e-80
    -pl_window 1 1
    -ppathdebug no no
    -ptranskip 0 0
    -rawext .raw .raw
    -remove_dc no no
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -senmgau .cont. .cont.
    -silprob 0.1 1.000000e-01
    -smoothspec no no
    -spec2cep no no
    -subvq
    -subvqbeam 3.0e-3 3.000000e-03
    -svq4svq no no
    -svspec
    -tighten_factor 0.5 5.000000e-01
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -tracewhmm
    -transform legacy legacy
    -treeugprob yes yes
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -uw 0.7 7.000000e-01
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -vqeval 3 3
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 1.0e-35 1.000000e-35
    -wend_beam 1.0e-80 1.000000e-80
    -wip 0.7 7.000000e-01
    -wlen 0.025625 2.562500e-02
    -worddumpef 200000000 200000000
    -worddumpsf 200000000 200000000

    INFO: kbcore.c(433): Begin Initialization of Core Models:
    INFO: cmd_ln.c(506): Parsing command line:
    \
    -alpha 0.97 \
    -dither yes \
    -doublebw no \
    -nfilt 40 \
    -ncep 13 \
    -lowerf 133.33334 \
    -upperf 6855.4976 \
    -nfft 512 \
    -wlen 0.0256 \
    -transform legacy \
    -feat 1s_c_d_dd \
    -agc none \
    -cmn current \
    -varnorm no

    Current configuration:

    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -cep2spec no no
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -dither no yes
    -doublebw no no
    -feat 1s_c_d_dd 1s_c_d_dd
    -frate 100 100
    -input_endian little little
    -lda
    -ldadim 0 0
    -lifter 0 0
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -remove_dc no no
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -smoothspec no no
    -spec2cep no no
    -svspec
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wlen 0.025625 2.560000e-02

    INFO: Initialization of the log add table
    INFO: Log-Add table size = 29350 x 2 >> 0
    INFO:
    INFO: feat.c(849): Initializing feature stream to type: '1s_c_d_dd',
    ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: kbcore.c(480): .cont.
    INFO: Initialization of feat_t, report:
    INFO: Feature type = 1s_c_d_dd
    INFO: Cepstral size = 13
    INFO: Number of streams = 1
    INFO: Vector size of stream: 39
    INFO: Number of subvectors = 0
    INFO: Whether CMN is used = 1
    INFO: Whether AGC is used = 0
    INFO: Whether variance is normalized = 0
    INFO:
    INFO: Reading Feature Space Transform from: /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/feature_transform
    INFO: Reading HMM in Sphinx 3 Model format
    INFO: Model Definition File: /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/mdef
    INFO: Mean File: /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/means
    INFO: Variance File: /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/variances
    INFO: Mixture Weight File: /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/mixture_weights
    INFO: Transition Matrices File: /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/transition_matrices
    INFO: mdef.c(682): Reading model definition: /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/mdef
    INFO: Initialization of mdef_t, report:
    INFO: 40 CI-phone, 100516 CD-phone, 3 emitstate/phone, 120 CI-sen, 3120 Sen,
    18846 Sen-Seq
    INFO:
    INFO: kbcore.c(288): Using optimized GMM computation for Continuous HMM, -topn
    will be ignored
    INFO: cont_mgau.c(163): Reading mixture gaussian file '/usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/means'
    INFO: cont_mgau.c(422): 3120 mixture Gaussians, 16 components, 1 streams, veclen 29
    INFO: cont_mgau.c(163): Reading mixture gaussian file '/usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/variances'
    INFO: cont_mgau.c(422): 3120 mixture Gaussians, 16 components, 1 streams, veclen 29
    INFO: cont_mgau.c(510): Reading mixture weights file '/usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/mixture_weights'
    INFO: cont_mgau.c(665): Read 3120 x 16 mixture weights
    INFO: cont_mgau.c(693): Removing uninitialized Gaussian densities
    INFO: cont_mgau.c(783): Applying variance floor
    INFO: cont_mgau.c(801): 63 variance values floored
    INFO: cont_mgau.c(849): Precomputing Mahalanobis distance invariants
    INFO: tmat.c(169): Reading HMM transition probability matrices: /usr/src/sphinx/voxforge-en/model_parameters/voxforge_en_sphinx.mllt_cd_cont_3000/transition_matrices
    INFO: Initialization of tmat_t, report:
    INFO: Read 40 transition matrices of size 3x4
    INFO:
    INFO: dict.c(475): Reading main dictionary:
    /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
    INFO: dict.c(478): 5900 words read
    INFO: dict.c(483): Reading filler dictionary:
    /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
    INFO: dict.c(486): 3 words read
    INFO: Initialization of dict_t, report:
    INFO: No of CI phone: 0
    INFO: Max word: 9999
    INFO: No of word: 5903
    INFO:
    INFO: lm.c(606): LM
    read('/usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP', lw=
    9.50, wip= 0.70, uw= 0.70)
    INFO: lm.c(608): Reading LM file
    /usr/src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP (LM name
    "default")
    INFO: lm_3g_dmp.c(630): Reading LM in 16 bits format
    INFO: lm_3g_dmp.c(686): Read 5000 unigrams
    INFO: lm_3g_dmp.c(759): 2821547 bigrams
    INFO: lm_3g_dmp.c(832): 8095821 trigrams
    INFO: lm_3g_dmp.c(902): 45171 bigram prob entries
    INFO: lm_3g_dmp.c(936): 16932 trigram bowt entries
    INFO: lm_3g_dmp.c(967): 48498 trigram prob entries
    INFO: lm_3g_dmp.c(998): 5511 trigram segtable entries (512 segsize)
    INFO: lm_3g_dmp.c(1053): 5000 word strings
    INFO: lm.c(691): The LM routine is operating at 16 bits mode
    ERROR: "wid.c", line 282: <unk> is not a word in dictionary and it is not a
    class tag.
    INFO: wid.c(292): 1 LM words not in dictionary; ignored
    INFO: Initialization of fillpen_t, report:
    INFO: Language weight =9.500000
    INFO: Word Insertion Penalty =0.700000
    INFO: Silence probability =0.100000
    INFO: Filler probability =0.100000
    INFO:
    INFO: dict2pid.c(599): Building PID tables for dictionary
    INFO: Initialization of dict2pid_t, report:
    INFO: Dict2pid is in composite triphone mode
    INFO: 2212 composite states; 1010 composite sseq
    INFO:
    INFO: kbcore.c(632): Inside kbcore: Verifying models consistency ......
    INFO: kbcore.c(654): End of Initialization of Core Models:
    INFO: Initialization of beam_t, report:
    INFO: Parameters used in Beam Pruning of Viterbi Search:
    INFO: Beam=-422133
    INFO: PBeam=-383758
    INFO: WBeam=-268630 (Skip=0)
    INFO: WEndBeam=-614012
    INFO: No of CI Phone assumed=40
    INFO:
    INFO: Initialization of fast_gmm_t, report:
    INFO: Parameters used in Fast GMM computation:
    INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
    Distance-based Down Sampling? 0
    INFO: GMM-level: CI phone beam -614012. MAX CD 100000
    INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
    would be used as Gaussian Score? =0 SubVQ Beam -19363
    INFO:
    INFO: Initialization of pl_t, report:
    INFO: Parameters used in phoneme lookahead:
    INFO: Phoneme look-ahead type = 0
    INFO: Phoneme look-ahead beam size = 65945
    INFO: No of CI Phones assumed=40
    INFO:
    INFO: Initialization of ascr_t, report:
    INFO: No. of CI senone =120
    INFO: No. of senone = 3120
    INFO: No. of composite senone = 2212
    INFO: No. of senone sequence = 18846
    INFO: No. of composite senone sequence=1010
    INFO: Parameters used in phoneme lookahead:
    INFO: Phoneme lookahead window = 1
    INFO:
    INFO: kb.c(306): SEARCH MODE INDEX 4
    INFO: srch.c(373): Search Initialization.
    WARNING: "srch_time_switch_tree.c", line 283: -Nstalextree is omitted in TST
    search.
    INFO: lextree.c(222): Creating Unigram Table for lm (name: default)
    INFO: lextree.c(235): Size of word table after unigram + words in class: 4997.
    INFO: lextree.c(244): Size of word table after adding alternative prons: 5900.
    INFO: lextree_t, report:
    INFO: Parameters of the lexical tree.
    INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
    INFO: Number of left contexts 34
    INFO: Number of node 23494
    INFO: Number of links in the tree 281032
    INFO: The previous word for this tree
    INFO: The size of a node of the lexical tree 168
    INFO: The size of a gnode_t 16
    INFO:
    INFO: srch_time_switch_tree.c(343): Lextrees (0) for lm 0, its name is
    default, it has 23494 nodes(ug)
    INFO: lextree.c(222): Creating Unigram Table for lm (name: default)
    INFO: lextree.c(235): Size of word table after unigram + words in class: 4997.
    INFO: lextree.c(244): Size of word table after adding alternative prons: 5900.
    INFO: lextree_t, report:
    INFO: Parameters of the lexical tree.
    INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
    INFO: Number of left contexts 34
    INFO: Number of node 23494
    INFO: Number of links in the tree 281032
    INFO: The previous word for this tree
    INFO: The size of a node of the lexical tree 168
    INFO: The size of a gnode_t 16
    INFO:
    INFO: srch_time_switch_tree.c(343): Lextrees (1) for lm 0, its name is
    default, it has 23494 nodes(ug)
    INFO: lextree.c(222): Creating Unigram Table for lm (name: default)
    INFO: lextree.c(235): Size of word table after unigram + words in class: 4997.
    INFO: lextree.c(244): Size of word table after adding alternative prons: 5900.
    INFO: lextree_t, report:
    INFO: Parameters of the lexical tree.
    INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
    INFO: Number of left contexts 34
    INFO: Number of node 23494
    INFO: Number of links in the tree 281032
    INFO: The previous word for this tree
    INFO: The size of a node of the lexical tree 168
    INFO: The size of a gnode_t 16
    INFO:
    INFO: srch_time_switch_tree.c(343): Lextrees (2) for lm 0, its name is
    default, it has 23494 nodes(ug)
    INFO: srch_time_switch_tree.c(350): Time for building trees, 0.0600 CPU 0.0609
    Clk
    INFO: srch_time_switch_tree.c(372): Lextrees(0), 1 nodes(filler)
    INFO: srch_time_switch_tree.c(372): Lextrees(1), 1 nodes(filler)
    INFO: srch_time_switch_tree.c(372): Lextrees(2), 1 nodes(filler)
    INFO: vithist.c(168): Initializing Viterbi-history module
    INFO: Initialization of srch_t, report:
    INFO: Operation Mode = 4, Operation Name = fwdtree
    INFO:
    INFO: s3_decode.c(259): Input data will NOT be byte swapped
    INFO: s3_decode.c(264): Partial hypothesis WILL be dumped
    INFO: fe_interface.c(287): You are using the internal mechanism to generate
    the seed.
    press ENTER to start recording

    press ENTER to finish recording
    Warning: Could not find Mic element

    WARNING: "srch_time_switch_tree.c", line 1340: Failed to retrieve viterbi
    history.
    WARNING: "s3_decode.c", line 536: Failed to retrieve viterbi history.
    Cannot retrieve hypothesis.

     
  • Nickolay V. Shmyrev

    It tells you it can't find the microphone.

    It's also recommended to use pocketsphinx instead of sphinx3.

     
  • Daniel

    Daniel - 2009-12-27

    It tells you it can't find the microphone.

    It can. I looked through the source code: it first tries to find the "Mic"
    simple mixer element. If it can't, it warns you and then tries to find the
    "Record" simple mixer element, which succeeds in my case because my sound
    card uses that name. If it failed to find "Record", it would tell you that
    too, but it doesn't for me.
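
    The lookup order described above can be sketched roughly as follows. This
    is only an illustration of the fallback logic, not the actual sphinxbase
    code: the function name and the set of mixer element names are
    hypothetical, and the wording of the warning for "Record" is assumed to
    mirror the "Mic" warning shown in the log.

```python
def pick_capture_element(available, preferred=("Mic", "Record")):
    """Return (chosen element name or None, list of warnings).

    `available` is the set of simple mixer element names the sound card
    exposes (hypothetical input for this sketch).
    """
    warnings = []
    for name in preferred:
        if name in available:
            return name, warnings
        # Assumed message format, modelled on the "Mic" warning in the log.
        warnings.append(f"Warning: Could not find {name} element")
    return None, warnings


# A card that has no "Mic" element but does have "Record".
name, warnings = pick_capture_element({"Master", "PCM", "Record"})
print(name)      # Record
print(warnings)  # ['Warning: Could not find Mic element']
```

    With a card like this, the "Mic" warning is printed even though a usable
    capture element is found afterwards, so the warning alone does not prove
    that capture failed.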

    It's also recommended to use
    pocketsphinx instead of sphinx3.

    Yes, I plan on using pocketsphinx if sphinx3 is too taxing on the processor.
    I'm using sphinx3 right now because the Robust Group tutorial recommends it
    for this test. I do not think this error is caused by this.

    Thanks for replying.

    Can anyone help me with the viterbi history error?

     
  • Nickolay V. Shmyrev

    It can. I looked through the source
    code, and it'll try to find the "Mic"
    simple mixer element, if it can't
    it'll warn you then try to find the
    "Record" simple mixer element, which
    succeeds in my case because my sound
    card uses that name. If it failed at
    finding "Record", it would tell you
    that too, but it doesn't for me.

    It's not just about the Mic element. It tells you it can't get sound from
    the sound card, and that's why the viterbi history is empty. The sound input
    may be blocked by PulseAudio, it may be your sound card, or it may be
    something else.

    You can check the out.raw file, which holds the results of the audio
    capture. This file is created in the working directory of
    sphinx3_livedecode.
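
    One way to inspect the capture file is a short script like the one below.
    It is a sketch under assumptions: it treats out.raw as headerless 16-bit
    signed little-endian mono PCM at 16000 Hz (the sample rate and endianness
    match the -samprate and -input_endian values in the log; the sample width
    and channel count are assumed). The function name is hypothetical.

```python
import struct
from pathlib import Path


def summarize_raw(path, sample_rate=16000):
    """Report sample count, duration and peak amplitude of a raw PCM file."""
    data = Path(path).read_bytes()
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    data = data[: len(data) - (len(data) % 2)]
    samples = [s for (s,) in struct.iter_unpack("<h", data)]
    peak = max((abs(s) for s in samples), default=0)
    return {
        "samples": len(samples),
        "seconds": len(samples) / sample_rate,
        "peak": peak,
    }
```

    Calling summarize_raw("out.raw") in the directory where
    sphinx3_livedecode was run gives a quick answer: zero samples means
    nothing was captured at all, while a non-empty file with a peak of 0
    suggests the input is muted or routed elsewhere.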

    Yes, I plan on using pocketsphinx if
    sphinx3 is too taxing on the
    processor. I'm using sphinx3 right now
    because the Robust Group tutorial
    recommends it for this test. I do not
    think this error is caused by this.

    The reason to use pocketsphinx is not its speed; it is simply the version
    we recommend. You can find more information on the version comparison page:

    http://cmusphinx.sourceforge.net/wordpress/versions/

    The tutorial text will be changed soon.

     
  • Daniel

    Daniel - 2009-12-28

    It's not just about mic element, it
    tells you it can't get sound from the
    soundcard, that's why viterbi history
    is empty. Probably sound input is
    blocked by pulseaudio, probably it's
    your soundcard, probably something
    else. You can check out.raw file with
    the results of audio capture. This
    file is created in a working directory
    of sphinx3_livedecode for example.

    Yup, you're right. I was digging deeper in the code and found that having no
    samples will cause that. I also noticed that I kept getting an empty out.raw
    file, as you mentioned. (I was able to find people whose live decode did
    work, even though they got that warning.)
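
    To illustrate why zero samples leaves nothing to backtrace, here is a toy
    Viterbi decoder that keeps one history entry per frame. It is not the
    sphinx3 vithist.c implementation; the two states, the observation labels
    and all probabilities are made up for the example.

```python
def viterbi(frames, states, start_p, trans_p, emit_p):
    """Return the best state path, or None when there are no frames."""
    history = []  # one {state: (score, previous_state)} dict per frame
    for t, obs in enumerate(frames):
        column = {}
        for s in states:
            if t == 0:
                column[s] = (start_p[s] * emit_p[s][obs], None)
            else:
                last = history[-1]
                # Best predecessor for state s at this frame.
                prev = max(states, key=lambda p: last[p][0] * trans_p[p][s])
                score = last[prev][0] * trans_p[prev][s] * emit_p[s][obs]
                column[s] = (score, prev)
        history.append(column)

    if not history:
        # No audio frames were decoded, so the history is empty.
        return None

    # Backtrace from the best final state through the stored predecessors.
    state = max(states, key=lambda s: history[-1][s][0])
    path = [state]
    for column in reversed(history[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


states = ("sil", "speech")
start_p = {"sil": 0.8, "speech": 0.2}
trans_p = {"sil": {"sil": 0.7, "speech": 0.3},
           "speech": {"sil": 0.2, "speech": 0.8}}
emit_p = {"sil": {"quiet": 0.9, "loud": 0.1},
          "speech": {"quiet": 0.2, "loud": 0.8}}

print(viterbi(["quiet", "loud", "loud"], states, start_p, trans_p, emit_p))
print(viterbi([], states, start_p, trans_p, emit_p))  # None: nothing to retrieve
```

    The second call mirrors the situation in this thread: with an empty
    capture there are no frames, the history table stays empty, and no
    hypothesis can be read back from it.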

    Actually, I've been fussing with my sound for a couple of days now.
    I would like to strangle whoever made ALSA and PulseAudio!

    The reason to use pocketsphinx is not
    in the speed but the recommendation we
    are doing for you. You can find more
    information on version comparison
    page:

    http://cmusphinx.sourceforge.net/wordpress/versions/

    The tutorial text will be changed
    soon.

    I'll be using pocket then. :)

    Thanks for your replies.

     
