
Need help with pocketSphinx

2009-07-24
2012-09-22
  • Claude Aynaud

    Claude Aynaud - 2009-07-24

    Hello,
    My problem is the following: I have an acoustic model that works with Sphinx3, but it doesn't work with pocketsphinx. Do you know what the differences are between pocketSphinx and Sphinx3?

    PS: my acoustic model is a French model. I tried the English acoustic models provided on the web, and those work.

     
    • Nickolay V. Shmyrev

      Welcome to the CMU sphinx forums, Claude.

      What exactly do you mean by "doesn't work"? There shouldn't be any difference. If you are talking about

      WARNING: "ngram_search_fwdtree.c", line 115: Filler word 202 = accompagne(2) has more than one phone, ignoring it.

      please open the filler dictionary (the one passed with -fdict) and check the words there; accompagne really shouldn't be among the fillers. Also, please take the time to provide more information: paste the full log here, with the options you are using and the output pocketsphinx gives you.
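A minimal sketch of the check suggested above, assuming the usual Sphinx dictionary layout of one entry per line (a word followed by whitespace-separated phones). The function name and the sample entries, including the phone string for accompagne(2), are hypothetical and only serve as illustration.

```python
def multi_phone_fillers(lines):
    """Return (word, phones) pairs for entries that have more than one phone."""
    suspicious = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, phones = fields[0], fields[1:]
        if len(phones) > 1:
            suspicious.append((word, phones))
    return suspicious


# Hypothetical filler dictionary content, for illustration only.
sample = [
    "<s> SIL",
    "</s> SIL",
    "<sil> SIL",
    "accompagne(2) a k on p a gn",
]

for word, phones in multi_phone_fillers(sample):
    print(f"{word}: {len(phones)} phones, probably not a filler")
```

To run it on a real file, pass the lines read from the file given with -fdict instead of the sample list.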

       
    • Claude Aynaud

      Claude Aynaud - 2009-07-27

      OK, actually there are other errors before the warning. Here is the log:
      pocketsphinx_test
      Run CMU PocketSphinx in Batch mode to decode an example utterance.

      INFO: cmd_ln.c(459): Parsing command line:
      /home/aynaud/pocket-sphinx/pocketsphinx-0.5/bin/pocketsphinx_batch \
      -adcin yes \
      -fdict /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/bref120.filler \
      -dict /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/DESDHIS-2_sphinx.dico \
      -hmm /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci \
      -ctl /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/ca-fr.ctl \
      -cepext .raw \
      -cepdir /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/cepstra \
      -lm /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/DESDHIS-2-Interpole.sphinx.trigram.DMP \
      -wip 0.8 \
      -lw 12 \
      -dictcase yes \
      -fwdflat no \
      -bestpath no \
      -agc none \
      -hyp res.trn

      Current configuration:
      [NAME] [DEFLT] [VALUE]
      -adchdr 0 0
      -adcin no yes
      -agc none none
      -agcthresh 2.0 2.000000e+00
      -alpha 0.97 9.700000e-01
      -ascale 20.0 2.000000e+01
      -backtrace no no
      -beam 1e-48 1.000000e-48
      -bestpath yes no
      -bestpathlw 9.5 9.500000e+00
      -cep2spec no no
      -cepdir /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/cepstra
      -cepext .mfc .raw
      -ceplen 13 13
      -cmn current current
      -cmninit 8.0 8.0
      -compallsen no no
      -ctl /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/ca-fr.ctl
      -ctlcount -1 -1
      -ctlincr 1 1
      -ctloffset 0 0
      -dict /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/DESDHIS-2_sphinx.dico
      -dictcase no yes
      -dither no no
      -doublebw no no
      -ds 1 1
      -fdict /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/bref120.filler
      -feat 1s_c_d_dd 1s_c_d_dd
      -featparams
      -fillprob 1e-8 1.000000e-08
      -frate 100 100
      -fsg
      -fsgusealtpron yes yes
      -fsgusefiller yes yes
      -fwdflat yes no
      -fwdflatbeam 1e-64 1.000000e-64
      -fwdflatefwid 4 4
      -fwdflatlw 8.5 8.500000e+00
      -fwdflatsfwin 25 25
      -fwdflatwbeam 7e-29 7.000000e-29
      -fwdtree yes yes
      -hmm /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci
      -hyp res.trn
      -hypseg
      -input_endian little little
      -jsgf
      -kdmaxbbi -1 -1
      -kdmaxdepth 0 0
      -kdtree
      -latsize 5000 5000
      -lda
      -ldadim 0 0
      -lifter 0 0
      -lm /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/DESDHIS-2-Interpole.sphinx.trigram.DMP
      -lmctl
      -lmname default default
      -logbase 1.0001 1.000100e+00
      -logfn
      -logspec no no
      -lowerf 133.33334 1.333333e+02
      -lpbeam 1e-40 1.000000e-40
      -lponlybeam 7e-29 7.000000e-29
      -lw 6.5 1.200000e+01
      -maxhistpf 100 100
      -maxhmmpf -1 -1
      -maxnewoov 20 20
      -maxwpf -1 -1
      -mdef
      -mean
      -mixw
      -mixwfloor 0.0000001 1.000000e-07
      -mmap yes yes
      -nbest 0 0
      -nbestdir
      -nbestext .hyp .hyp
      -ncep 13 13
      -nfft 512 512
      -nfilt 40 40
      -nwpen 1.0 1.000000e+00
      -outlatdir
      -pbeam 1e-48 1.000000e-48
      -pip 1.0 1.000000e+00
      -remove_dc no no
      -round_filters yes yes
      -samprate 16000 1.600000e+04
      -sdmap
      -seed -1 -1
      -sendump
      -silprob 0.005 5.000000e-03
      -smoothspec no no
      -spec2cep no no
      -svspec
      -tmat
      -tmatfloor 0.0001 1.000000e-04
      -topn 4 4
      -toprule
      -transform legacy legacy
      -unit_area yes yes
      -upperf 6855.4976 6.855498e+03
      -usewdphones no no
      -uw 1.0 1.000000e+00
      -var
      -varfloor 0.0001 1.000000e-04
      -varnorm no no
      -verbose no no
      -warp_params
      -warp_type inverse_linear inverse_linear
      -wbeam 7e-29 7.000000e-29
      -wip 0.65 8.000000e-01
      -wlen 0.025625 2.562500e-02

      INFO: mdef.c(520): Reading model definition: /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/mdef
      INFO: bin_mdef.c(157): cd_tree: nodes 176 wpos start 0 ci start 4 lc start 176 rc start 176
      INFO: tmat.c(204): Reading HMM transition probability matrices: /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/transition_matrices
      INFO: acmod.c(108): Attempting to use SCGMM computation module
      INFO: s2_semi_mgau.c(985): Reading S3 mixture gaussian file '/home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/means'
      ERROR: "s2_semi_mgau.c", line 1015: /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/means: #codebooks (129) != 1
      INFO: acmod.c(121): Falling back to general multi-stream GMM computation
      INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/means
      INFO: ms_gauden.c(292): 129 codebook, 1 feature, size
      8x39
      INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/variances
      INFO: ms_gauden.c(292): 129 codebook, 1 feature, size
      8x39
      INFO: ms_gauden.c(369): 0 variance values floored, 0 zero-variance components removed
      INFO: ms_senone.c(157): Reading senone mixture weights: /home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/mixture_weights
      INFO: ms_senone.c(208): Truncating senone logs3(pdf) values by 10 bits
      INFO: ms_senone.c(272): Read mixture weights for 129 senones: 1 features x 8 codewords
      INFO: ms_mgau.c(124): The value of topn: 4
      INFO: feat.c(849): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
      INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
      INFO: dict.c(232): Allocating 20 placeholders for new OOVs
      INFO: dict.c(494): 24034 = words in file [/home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/DESDHIS-2_sphinx.dico]
      WARNING: "dict.c", line 435: Skipping duplicate definition of $
      WARNING: "dict.c", line 435: Skipping duplicate definition of (
      WARNING: "dict.c", line 435: Skipping duplicate definition of )
      WARNING: "dict.c", line 435: Skipping duplicate definition of </s>
      WARNING: "dict.c", line 435: Skipping duplicate definition of <s>
      WARNING: "dict.c", line 435: Skipping duplicate definition of <sil>
      INFO: dict.c(494): 11 = words in file [/home/aynaud/pocket-sphinx/pocketsphinx-0.5-fr/bref120ci/bref120.filler]
      INFO: dict.c(349): LEFT CONTEXT TABLES
      INFO: dict.c(1013): Entry Context table contains
      625 entries
      INFO: dict.c(1014): 26875 possible cross word triphones.
      INFO: dict.c(1052): 0 triphones
      0 pseudo diphones
      26875 uniphones
      INFO: dict.c(1099): Exit Context table contains
      625 entries
      INFO: dict.c(1100): 26875 possible cross word triphones.
      INFO: dict.c(1166): 0 triphones
      0 pseudo diphones
      26875 uniphones
      INFO: dict.c(1168): 625 right context entries
      INFO: dict.c(1169): 1 ave entries per exit context
      INFO: dict.c(355): RIGHT CONTEXT TABLES
      INFO: dict.c(1013): Entry Context table contains
      658 entries
      INFO: dict.c(1014): 28294 possible cross word triphones.
      INFO: dict.c(1052): 0 triphones
      0 pseudo diphones
      28294 uniphones
      INFO: dict.c(1099): Exit Context table contains
      658 entries
      INFO: dict.c(1100): 28294 possible cross word triphones.
      INFO: dict.c(1166): 0 triphones
      0 pseudo diphones
      28294 uniphones
      INFO: dict.c(1168): 658 right context entries
      INFO: dict.c(1169): 1 ave entries per exit context
      ERROR: "ngram_model_arpa.c", line 155: No \data\ mark in LM file
      INFO: ngram_model_dmp.c(141): Will use memory-mapped I/O for LM file
      INFO: ngram_model_dmp.c(190): ngrams 1=297, 2=764, 3=933
      INFO: ngram_model_dmp.c(236): 297 = LM.unigrams(+trailer) read
      WARNING: "ngram_model_dmp.c", line 247: -mmap specified, but tseg_base is not word-aligned. Will not memory-map.
      INFO: ngram_model_dmp.c(286): 764 = LM.bigrams(+trailer) read
      INFO: ngram_model_dmp.c(313): 933 = LM.trigrams read
      INFO: ngram_model_dmp.c(339): 119 = LM.prob2 entries read
      INFO: ngram_model_dmp.c(359): 62 = LM.bo_wt2 entries read
      INFO: ngram_model_dmp.c(380): 65 = LM.prob3 entries read
      INFO: ngram_model_dmp.c(409): 2 = LM.tseg_base entries read
      INFO: ngram_model_dmp.c(468): 297 = ascii word strings read
      WARNING: "ngram_search_fwdtree.c", line 115: Filler word 26 = à_la_tête has more than one phone, ignoring it.
      WARNING: "ngram_search_fwdtree.c", line 115: Filler word 27 = à_la_tête(2) has more than one phone, ignoring it.
      WARNING: "ngram_search_fwdtree.c", line 115: Filler word 28 = à_la_tête(3) has more than one phone, ignoring it.
      WARNING: "ngram_search_fwdtree.c", line 115: Filler word 29 = à_la_tête(4) has more than one phone, ignoring it.

      ...
      WARNING: "ngram_search_fwdtree.c", line 115: Filler word 24053 = zéro(3) has more than one phone, ignoring it.
      INFO: ngram_search_fwdtree.c(156): 0 root, 0 non-root channels, 131 single-phone words
      INFO: ngram_search_fwdtree.c(195): Creating search tree
      INFO: ngram_search_fwdtree.c(203): 0 root, 0 non-root channels, 131 single-phone words
      INFO: ngram_search_fwdtree.c(325): max nonroot chan increased to 128
      INFO: ngram_search_fwdtree.c(334): 0 root, 0 non-root channels, 49 single-phone words
      INFO: cmn.c(175): CMN: -24.97 -0.46 0.25 0.36 0.38 0.34 0.37 0.35 0.32 0.31 0.34 0.31 0.28
      INFO: ngram_search_fwdtree.c(1471): 462 words recognized (2/fr)
      INFO: ngram_search_fwdtree.c(1473): 745 senones evaluated (3/fr)
      INFO: ngram_search_fwdtree.c(1475): 2911 channels searched (11/fr), 0 1st, 2911 last
      INFO: ngram_search_fwdtree.c(1479): 2911 words for which last channels evaluated (11/fr)
      INFO: ngram_search_fwdtree.c(1482): 0 candidate words for entering last phone (0/fr)
      INFO: batch.c(339): test: 2.44 seconds speech, 0.02 seconds CPU, 0.04 seconds wall
      INFO: batch.c(341): test: 0.01 xRT (CPU), 0.02 xRT (elapsed)
      INFO: batch.c(349): TOTAL 2.44 seconds speech, 0.02 seconds CPU, 0.04 seconds wall
      INFO: batch.c(351): AVERAGE 0.01 xRT (CPU), 0.02 xRT (elapsed)

      TEST FINISHED

      But someone said me that problem is because pocketsphinx uses semi-continuous models unlikely Sphinx. Do you know if it's possible to convert from continuous models to semi-continuous models ?

      PS : I am French and if there are mistakes in my sentences, say me please. I wanna improve my level in English.

       
      • Nickolay V. Shmyrev

        > WARNING: "ngram_search_fwdtree.c", line 115: Filler word 28 = à_la_tête(3) has more than one phone, ignoring it.

        This looks like a bug in pocketsphinx related to the non-ascii character present in your dictionary. What encoding are you using? What pocketsphinx version are you trying?

        > But someone said me that problem is because pocketsphinx uses semi-continuous models unlikely Sphinx.

        That's not true. pocketsphinx can use both semi-continuous and continuous models.

        > Do you know if it's possible to convert from continuous models to semi-continuous models ?

        It's not possible.

         
      • Nickolay V. Shmyrev

        Hi.

        Sorry for the late response. I looked into this, and in my opinion the issue is that the dictionary is not properly sorted. So this function:

        int32
        dict_get_num_main_words(dict_t * dict)
        {
            /* FIXME FIXME: Relies on a particular ordering of the dictionary. */
            return dict_to_id(dict, "</s>");
        }

        returns an index near the beginning of the dictionary instead of a large number. As a result, all words after that index are counted as fillers.
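The effect can be illustrated with a toy model in Python. This is only a sketch under the assumption stated in this reply, namely that every word whose index is at or after the index of `</s>` is treated as a filler; the helper names and word lists are hypothetical and this is not the pocketsphinx code itself.

```python
def num_main_words(words):
    """Toy analogue of dict_get_num_main_words(): the index of "</s>"."""
    return words.index("</s>")


def split_main_and_fillers(words):
    """Words before "</s>" count as main words, the rest count as fillers."""
    n = num_main_words(words)
    return words[:n], words[n:]


# Layout the function relies on: main words first, fillers starting at "</s>".
good = ["abandon", "zéro", "</s>", "<s>", "<sil>"]
# Hypothetical layout where "</s>" ends up at the start of the word list.
bad = ["</s>", "<s>", "<sil>", "abandon", "zéro"]

print(split_main_and_fillers(good))
print(split_main_and_fillers(bad))
```

With the second layout the main-word list is empty, which would match a log where ordinary words are reported as filler words.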

        Could you please sort the dictionary with:

        LANG= LC_ALL= sort your.dict > your.dict.new

        Also make sure that </s> is not in your main dictionary.
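Both steps can also be sketched in Python, assuming a UTF-8 dictionary with one entry per line. Sorting on the encoded bytes is meant to mimic a C-locale sort; the function name and the sample entries (including their phones) are hypothetical.

```python
def clean_and_sort(lines):
    """Drop any "</s>" entry and sort the remaining lines by raw UTF-8 bytes."""
    kept = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "</s>":
            continue  # skip blank lines and the sentence-end marker
        kept.append(line)
    return sorted(kept, key=lambda line: line.encode("utf-8"))


# Hypothetical main dictionary entries, for illustration only.
sample = [
    "zéro z e r o",
    "à_la_tête a l a t e t",
    "</s> SIL",
    "abandon a b an d on",
]

for line in clean_and_sort(sample):
    print(line)
```

In byte order, entries starting with a non-ASCII character such as à sort after all entries that start with an ASCII letter.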

         
    • Claude Aynaud

      Claude Aynaud - 2009-07-27

      Sorry, it's "unlike" and not "unlikely" at the end of my previous post.

       
    • Claude Aynaud

      Claude Aynaud - 2009-07-28

      > What encoding are you using? What pocketsphinx version are you trying ?
      My encoding is UTF-8 and my pocketsphinx version is 0.5; for sphinxbase I use version 0.4.
      But I also get this warning for words made up exclusively of ASCII characters, e.g.:

      WARNING: "ngram_search_fwdtree.c", line 115: Filler word 43 = abandon has more than one phone, ignoring it.

      And the presence or absence of the -fdict option doesn't affect the result. Seemingly, pocketsphinx skips this option and uses the dictionary as the filler dictionary.
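The log earlier in this thread reports skipped duplicate definitions for entries such as `</s>`, `<s>` and `<sil>`. One way to see whether the main and filler dictionaries define the same words is sketched below; that the duplicates come from such an overlap is an assumption, and the helper names and sample entries are hypothetical.

```python
def words_in(lines):
    """Collect the first field (the word) of every non-blank dictionary line."""
    return {line.split()[0] for line in lines if line.split()}


def shared_entries(main_lines, filler_lines):
    """Return the words defined in both the main and the filler dictionary."""
    return sorted(words_in(main_lines) & words_in(filler_lines))


# Hypothetical dictionary contents, for illustration only.
main_dict = ["<s> SIL", "</s> SIL", "<sil> SIL", "abandon a b an d on"]
filler_dict = ["<s> SIL", "</s> SIL", "<sil> SIL"]

print(shared_entries(main_dict, filler_dict))
```

Any word printed here is defined in both files and is a candidate for removal from the main dictionary.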

       
