Problem with PocketSphinx using continuous AM

Help
mmarge
2012-02-15
2012-09-22
  • mmarge

    mmarge - 2012-02-15

    Hi,

    I'm getting the following error when trying to run a (successfully installed)
    pocketsphinx_batch with a continuous acoustic model:
    pocketsphinx_batch: ngram_search.c:707: ngram_compute_seg_score: Assertion
    `start_score > ((int)0xE0000000)' failed.
    Abort

    Do you know what could be going on?
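For reference, the constant in the assertion can be evaluated directly. The sketch below reinterprets 0xE0000000 as a signed 32-bit integer, which is what the C cast `(int)` produces on platforms with a 32-bit two's-complement `int` (an assumption about the build). Reading that value as a "worst possible score" floor is also an assumption that nothing in this thread confirms, and the helper name `as_int32` is made up for illustration.

```python
import ctypes


def as_int32(value: int) -> int:
    """Reinterpret an unsigned 32-bit pattern as a signed 32-bit integer."""
    return ctypes.c_int32(value).value


# The bound used in the failed assertion: start_score > ((int)0xE0000000)
BOUND = as_int32(0xE0000000)
print(BOUND)

# A score reported for the first utterance in the log (alpha = -1595596)
# is far above that bound, so the assertion would hold for it.
print(-1595596 > BOUND)
```

If that reading is right, the assertion fires only when a segment's start score has fallen to the floor value or below, which would point at the second utterance's lattice and not at the first one.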

    I'm running PocketSphinx 0.7 on Fedora 10.

    Here's the relevant set of parameters I used for PocketSphinx:

    -samprate 16000
    -cmn current
    -agc none
    -logfn ./desktop.log

    Here are the parameters in feat.params for the continuous acoustic model:
    -alpha 0.97
    -dither yes
    -doublebw no
    -nfilt 40
    -ncep 13
    -lowerf 200
    -upperf 6700
    -nfft 512
    -wlen 0.0256
    -transform dct
    -feat 1s_c_d_dd
    -agc none
    -cmn current
    -varnorm no
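The log further down prints each parameter with two values after its name. Assuming these are the built-in default followed by the effective value (which matches `-dither`, listed as `no yes` while feat.params sets it to `yes`), a short sketch can pick out which settings the model actually overrides. The function names and the sample rows copied from the log are illustrative only.

```python
def parse_config_dump(text):
    """Parse 'name default value' rows; rows without both values are skipped."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0].startswith("-"):
            rows[parts[0]] = (parts[1], parts[2])
    return rows


def same_setting(default, value):
    """Compare numerically when both parse as floats, otherwise as strings."""
    try:
        return float(default) == float(value)
    except ValueError:
        return default == value


def overrides(rows):
    """Return only the rows whose effective value differs from the default."""
    return {name: dv for name, dv in rows.items() if not same_setting(*dv)}


# A few rows copied from the "Current configuration" listing in the log.
DUMP = """\
-alpha 0.97 9.700000e-01
-dither no yes
-lowerf 133.33334 2.000000e+02
-nfilt 40 40
-samprate 16000 1.600000e+04
-transform legacy dct
-upperf 6855.4976 6.700000e+03
-wlen 0.025625 2.560000e-02
-lda
"""

for name, (default, value) in overrides(parse_config_dump(DUMP)).items():
    print(f"{name}: default {default}, in effect {value}")
```

Comparing numerically avoids false alarms from formatting alone, since `16000` and `1.600000e+04` are the same setting.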

    Here's the output of the log file:

    INFO: cmd_ln.c(559): Parsing command line:
    \
    -alpha 0.97 \
    -dither yes \
    -doublebw no \
    -nfilt 40 \
    -ncep 13 \
    -lowerf 200 \
    -upperf 6700 \
    -nfft 512 \
    -wlen 0.0256 \
    -transform dct \
    -feat 1s_c_d_dd \
    -agc none \
    -cmn current \
    -varnorm no

    Current configuration:

    [NAME]          [DEFLT]         [VALUE]
    -agc            none            none
    -agcthresh      2.0             2.000000e+00
    -alpha          0.97            9.700000e-01
    -ceplen         13              13
    -cmn            current         current
    -cmninit        8.0             8.0
    -dither         no              yes
    -doublebw       no              no
    -feat           1s_c_d_dd       1s_c_d_dd
    -frate          100             100
    -input_endian   little          little
    -lda
    -ldadim         0               0
    -lifter         0               0
    -logspec        no              no
    -lowerf         133.33334       2.000000e+02
    -ncep           13              13
    -nfft           512             512
    -nfilt          40              40
    -remove_dc      no              no
    -round_filters  yes             yes
    -samprate       16000           1.600000e+04
    -seed           -1              -1
    -smoothspec     no              no
    -svspec
    -transform      legacy          dct
    -unit_area      yes             yes
    -upperf         6855.4976       6.700000e+03
    -varnorm        no              no
    -verbose        no              no
    -warp_params
    -warp_type      inverse_linear  inverse_linear
    -wlen           0.025625        2.560000e-02

    INFO: acmod.c(242): Parsed model-specific feature parameters from
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/feat.params
    INFO: fe_interface.c(289): You are using the internal mechanism to generate
    the seed.
    INFO: feat.c(697): Initializing feature stream to type: '1s_c_d_dd',
    ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean= 12.00, mean= 0.0
    INFO: mdef.c(520): Reading model definition:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/mdef
    INFO: bin_mdef.c(173): Allocating 70048 * 8 bytes (547 KiB) for CD tree
    INFO: tmat.c(205): Reading HMM transition probability matrices:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/means
    INFO: ms_gauden.c(292): 4132 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 32x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/variances
    INFO: ms_gauden.c(292): 4132 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 32x39
    INFO: ms_gauden.c(354): 52 variance values floored
    INFO: acmod.c(119): Attempting to use PTHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/means
    INFO: ms_gauden.c(292): 4132 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 32x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/variances
    INFO: ms_gauden.c(292): 4132 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 32x39
    INFO: ms_gauden.c(354): 52 variance values floored
    INFO: ptm_mgau.c(800): Number of codebooks exceeds 256: 4132
    INFO: acmod.c(121): Falling back to general multi-stream GMM computation
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/means
    INFO: ms_gauden.c(292): 4132 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 32x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/variances
    INFO: ms_gauden.c(292): 4132 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 32x39
    INFO: ms_gauden.c(354): 52 variance values floored
    INFO: ms_senone.c(160): Reading senone mixture weights:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/mixture_weights
    INFO: ms_senone.c(211): Truncating senone logs3(pdf) values by 10 bits
    INFO: ms_senone.c(218): Not transposing mixture weights in memory
    INFO: ms_senone.c(277): Read mixture weights for 4132 senones: 1 features x 32
    codewords
    INFO: ms_senone.c(331): Mapping senones to individual codebooks
    INFO: ms_mgau.c(122): The value of topn: 4
    INFO: dict.c(306): Allocating 10371 * 32 bytes (324 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary:
    /net/ginkgo/usr0/lqin/wsj_si284/etc/tcb05cnp.dic
    INFO: dict.c(212): Allocated 47 KiB for strings, 74 KiB for phones
    INFO: dict.c(324): 6232 words read
    INFO: dict.c(330): Reading filler dictionary:
    /usr1/mrmarge/thesiswork/asr/testdata/200_6700.cd_cont_4000/noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 43 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 44^3 * 2 bytes (166 KiB) for word-initial
    triphones
    INFO: dict2pid.c(131): Allocated 46816 bytes (45 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 46816 bytes (45 KiB) for single-phone word
    triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=4989, 2=1639687, 3=2684151
    INFO: ngram_model_dmp.c(242): 4989 = LM.unigrams(+trailer) read
    WARNING: "ngram_model_dmp.c", line 253: -mmap specified, but tseg_base is not
    word-aligned. Will not memory-map.
    INFO: ngram_model_dmp.c(291): 1639687 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317): 2684151 = LM.trigrams read
    INFO: ngram_model_dmp.c(342): 25453 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362): 14719 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382): 26030 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410): 3203 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466): 4989 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 440 unique initial diphones
    WARNING: "ngram_search_fwdtree.c", line 111: Filler word 6273 = ++UM++ has
    more than one phone, ignoring it.
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 53
    single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 53
    single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 14037
    INFO: ngram_search_fwdtree.c(338): after: 440 root, 13909 non-root channels,
    52 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    INFO: cmn.c(175): CMN: 6.84 0.13 -0.08 0.22 -0.03 -0.10 -0.22 -0.28 -0.14
    -0.08 -0.20 -0.12 -0.15
    INFO: ngram_search.c(466): Resized backpointer table to 10000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 20000 entries
    INFO: ngram_search_fwdtree.c(1549): 14385 words recognized (21/fr)
    INFO: ngram_search_fwdtree.c(1551): 1062811 senones evaluated (1527/fr)
    INFO: ngram_search_fwdtree.c(1553): 994948 channels searched (1429/fr), 242177
    1st, 95260 last
    INFO: ngram_search_fwdtree.c(1557): 26005 words for which last channels
    evaluated (37/fr)
    INFO: ngram_search_fwdtree.c(1560): 48649 candidate words for entering last
    phone (69/fr)
    INFO: ngram_search_fwdtree.c(1562): fwdtree 4.37 CPU 0.628 xRT
    INFO: ngram_search_fwdtree.c(1565): fwdtree 4.37 wall 0.628 xRT
    INFO: ngram_search_fwdflat.c(305): Utterance vocabulary contains 145 words
    INFO: ngram_search_fwdflat.c(940): 13902 words recognized (20/fr)
    INFO: ngram_search_fwdflat.c(942): 103646 senones evaluated (149/fr)
    INFO: ngram_search_fwdflat.c(944): 100589 channels searched (144/fr)
    INFO: ngram_search_fwdflat.c(946): 30637 words searched (44/fr)
    INFO: ngram_search_fwdflat.c(948): 8413 word transitions (12/fr)
    INFO: ngram_search_fwdflat.c(951): fwdflat 0.35 CPU 0.050 xRT
    INFO: ngram_search_fwdflat.c(954): fwdflat 0.35 wall 0.050 xRT
    INFO: ngram_search.c(1201): </s> not found in last frame, using <sil>.694
    instead
    INFO: ngram_search.c(1253): lattice start node <s>.0 end node <sil>.516
    INFO: ngram_search.c(1281): Eliminated 3030 nodes before end node
    INFO: ngram_search.c(1386): Lattice has 4677 nodes, 30154 links
    INFO: ps_lattice.c(1352): Normalizer P(O) = alpha(<sil>:516:694) = -1595596
    INFO: ps_lattice.c(1390): Joint P(O,S) = -1612563 P(S|O) = -16967
    INFO: ngram_search.c(875): bestpath 0.14 CPU 0.020 xRT
    INFO: ngram_search.c(878): bestpath 0.14 wall 0.020 xRT
    INFO: batch.c(760): wsj-5k-feats/Matt_Kinect_1m_200_6700/0_kinect: 6.95
    seconds speech, 4.86 seconds CPU, 4.86 seconds wall
    INFO: batch.c(762): wsj-5k-feats/Matt_Kinect_1m_200_6700/0_kinect: 0.70 xRT
    (CPU), 0.70 xRT (elapsed)
    INFO: cmn.c(175): CMN: 6.63 0.12 -0.05 0.13 -0.03 -0.05 -0.28 -0.27 -0.14
    -0.08 -0.16 -0.10 -0.15
    INFO: ngram_search_fwdtree.c(1549): 12206 words recognized (23/fr)
    INFO: ngram_search_fwdtree.c(1551): 856007 senones evaluated (1630/fr)
    INFO: ngram_search_fwdtree.c(1553): 806907 channels searched (1536/fr), 190707
    1st, 61226 last
    INFO: ngram_search_fwdtree.c(1557): 21307 words for which last channels
    evaluated (40/fr)
    INFO: ngram_search_fwdtree.c(1560): 41912 candidate words for entering last
    phone (79/fr)
    INFO: ngram_search_fwdtree.c(1562): fwdtree 3.59 CPU 0.684 xRT
    INFO: ngram_search_fwdtree.c(1565): fwdtree 3.59 wall 0.684 xRT
    INFO: ngram_search_fwdflat.c(305): Utterance vocabulary contains 68 words
    INFO: ngram_search_fwdflat.c(940): 11700 words recognized (22/fr)
    INFO: ngram_search_fwdflat.c(942): 69932 senones evaluated (133/fr)
    INFO: ngram_search_fwdflat.c(944): 57811 channels searched (110/fr)
    INFO: ngram_search_fwdflat.c(946): 20648 words searched (39/fr)
    INFO: ngram_search_fwdflat.c(948): 3988 word transitions (7/fr)
    INFO: ngram_search_fwdflat.c(951): fwdflat 0.21 CPU 0.039 xRT
    INFO: ngram_search_fwdflat.c(954): fwdflat 0.21 wall 0.039 xRT
    INFO: ngram_search.c(1201): </s> not found in last frame, using ++BEEP++.523
    instead
    INFO: ngram_search.c(1253): lattice start node <s>.0 end node ++BEEP++.383
    INFO: ngram_search.c(1281): Eliminated 1013 nodes before end node

    I get only one hypothesis, when I should have received 25 hypotheses.

    Thanks for your help!
    Matt
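The timing figures in the log above can be cross-checked with simple arithmetic. This sketch assumes that xRT means processing time divided by speech duration, and that the frame count is the speech duration times the `-frate` value of 100 frames per second. Both readings agree with the first utterance's numbers, but they are inferences from the log and not documented behaviour.

```python
def real_time_factor(processing_seconds, speech_seconds):
    """Processing time as a fraction of the audio duration (xRT)."""
    return processing_seconds / speech_seconds


def frame_count(speech_seconds, frames_per_second=100):
    """Number of feature frames for an utterance at the given frame rate."""
    return round(speech_seconds * frames_per_second)


SPEECH = 6.95  # seconds of speech reported by batch.c for the first utterance

# CPU seconds per stage, copied from the log for the first utterance.
STAGES = [("fwdtree", 4.37), ("fwdflat", 0.35), ("bestpath", 0.14), ("total", 4.86)]

for stage, cpu in STAGES:
    print(f"{stage}: {real_time_factor(cpu, SPEECH):.3f} xRT")

print(frame_count(SPEECH), "frames")
```

The results land within rounding of the logged values (0.628, 0.050, 0.020 and 0.70 xRT), and 695 frames is consistent with the log's reference to a last frame of 694 for that utterance.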

     
  • Nickolay V. Shmyrev

    Sorry, it's not quite clear where the assertion failed; it is missing from
    the log. Can you please share the full log?

    To make it easier to diagnose your problem and to get help, you can always
    share the data that would allow others to reproduce it.

     
  • mmarge

    mmarge - 2012-02-16

    Thanks for replying so quickly. I am pretty sure I pasted the full log: that
    was the entire contents of desktop.log (the file the log is saved to, as
    specified in PocketSphinx's command-line parameters).

    Do you have any ideas what could be the problem? Thanks!
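One possible explanation for the assertion being absent from desktop.log, offered as an assumption and not something confirmed in this thread: C's `assert()` reports to standard error, so its message may reach the terminal without ever being written to the file named by `-logfn`. A minimal sketch of capturing both output streams of a command into one file follows. The helper name `run_and_capture` is hypothetical, and the child process here is a stand-in, not pocketsphinx_batch.

```python
import os
import subprocess
import sys
import tempfile


def run_and_capture(cmd, out_path):
    """Run cmd, writing stdout and stderr interleaved into out_path."""
    with open(out_path, "w") as out:
        proc = subprocess.run(cmd, stdout=out, stderr=subprocess.STDOUT)
    return proc.returncode


# Stand-in child process: prints a message on stderr and exits non-zero.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('Assertion failed\\n'); sys.exit(3)",
]
demo_log = os.path.join(tempfile.mkdtemp(), "full_output.log")

exit_code = run_and_capture(demo_cmd, demo_log)
with open(demo_log) as handle:
    print(exit_code, handle.read().strip())
```

Running the real decoder this way, with its usual arguments in place of `demo_cmd`, would put the assertion text and the surrounding log lines in one file that can be shared.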

     
  • Nickolay V. Shmyrev

    > Do you have any ideas what could be the problem? Thanks!

    No, without data it's hard to say anything.

     
