
pocketsphinx_continuous decoding error

Help
mehmet
2011-09-20
2012-09-22
  • mehmet

    mehmet - 2011-09-20

    Hi, I use pocketsphinx_continuous to decode long audio files. I ran the
    following command to decode a 16-minute audio file with a 16 kHz sampling
    rate:

    pocketsphinx_continuous -hmm /mnt/asr/ar-model/am/ar-ipsos -dict /mnt/asr/ar-model/lm/ar-ipsos/arabic.dic -lm /mnt/asr/ar-model/lm/ar-ipsos/arabic.ug.lm.DMP -adcin yes -samprate 16000 -hypseg aligned -infile out.tmp.ready.wav

    Then, it aborted with the following error:
    ...
    INFO: ms_gauden.c(292): 6105 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 16x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /mnt/asr/ar-model/am/ar-ipsos/variances
    INFO: ms_gauden.c(292): 6105 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 16x39
    INFO: ms_gauden.c(354): 92 variance values floored
    INFO: ms_senone.c(160): Reading senone mixture weights: /mnt/asr/ar-model/am/ar-ipsos/mixture_weights
    INFO: ms_senone.c(211): Truncating senone logs3(pdf) values by 10 bits
    INFO: ms_senone.c(218): Not transposing mixture weights in memory
    INFO: ms_senone.c(277): Read mixture weights for 6105 senones: 1 features x 16 codewords
    INFO: ms_senone.c(331): Mapping senones to individual codebooks
    INFO: ms_mgau.c(122): The value of topn: 4
    INFO: dict.c(306): Allocating 111223 * 32 bytes (3475 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary: /mnt/asr/ar-model/lm/ar-ipsos/arabic.dic
    INFO: dict.c(212): Allocated 983 KiB for strings, 2022 KiB for phones
    INFO: dict.c(324): 107124 words read
    INFO: dict.c(330): Reading filler dictionary: /mnt/asr/ar-model/am/ar-ipsos/noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 3 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 35^3 * 2 bytes (83 KiB) for word-initial triphones
    INFO: dict2pid.c(131): Allocated 29680 bytes (28 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 29680 bytes (28 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=65530, 2=564123, 3=867799
    INFO: ngram_model_dmp.c(242): 65530 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291): 564123 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317): 867799 = LM.trigrams read
    INFO: ngram_model_dmp.c(342): 8396 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362): 8781 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382): 3170 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410): 1102 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466): 65530 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 543 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 14 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 14 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 235750
    INFO: ngram_search_fwdtree.c(338): after: 454 root, 235622 non-root channels, 12 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    INFO: continuous.c(367): pocketsphinx_continuous COMPILED ON: Sep 20 2011, AT: 11:20:36

    INFO: ngram_search.c(466): Resized backpointer table to 10000 entries
    INFO: ngram_search.c(474): Resized score stack to 200000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 20000 entries
    INFO: ngram_search.c(474): Resized score stack to 400000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 40000 entries
    INFO: ngram_search.c(474): Resized score stack to 800000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 80000 entries
    INFO: ngram_search_fwdtree.c(951): cand_sf increased to 64 entries
    INFO: ngram_search.c(474): Resized score stack to 1600000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 160000 entries
    INFO: ngram_search.c(474): Resized score stack to 3200000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 320000 entries
    INFO: ngram_search.c(474): Resized score stack to 6400000 entries
    pocketsphinx_continuous: feat.c:362: feat_array_alloc: Assertion `nfr > 0' failed.
    Aborted

    How can I solve this problem?

     
  • Nickolay V. Shmyrev

    Please share the file

     
  • mehmet

    mehmet - 2011-09-20

    The model files are too large to upload.

     
  • Nickolay V. Shmyrev

    Hello

    I only need the audio file; I do not need the models.
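
The `nfr > 0` assertion in `feat_array_alloc` suggests that feature extraction was asked to allocate zero frames, so a quick look at the WAV header can help rule out the audio file before sharing it. The following is only a minimal sketch using Python's standard `wave` module. The function name `check_wav` and the expectation of mono 16-bit PCM are assumptions made for illustration; only the 16 kHz rate comes from the command in this thread.

```python
import wave


def check_wav(path, expected_rate=16000):
    """Return a list of problems found in a WAV header (empty if none).

    Assumption for this sketch: the decoder expects mono, 16-bit PCM at
    `expected_rate` Hz. Adjust the checks if your setup differs.
    """
    try:
        with wave.open(path, "rb") as wav:
            problems = []
            if wav.getnframes() == 0:
                problems.append("file contains no audio frames")
            if wav.getframerate() != expected_rate:
                problems.append(
                    f"sample rate is {wav.getframerate()} Hz, "
                    f"expected {expected_rate} Hz"
                )
            if wav.getnchannels() != 1:
                problems.append(f"{wav.getnchannels()} channels, expected mono")
            if wav.getsampwidth() != 2:
                problems.append(
                    f"{8 * wav.getsampwidth()}-bit samples, expected 16-bit"
                )
            return problems
    except (wave.Error, EOFError) as exc:
        # Raised for truncated files, missing RIFF headers, or non-PCM data.
        return [f"not a readable PCM WAV file: {exc}"]


# Example call (the file name is the one used in this thread):
# print(check_wav("out.tmp.ready.wav"))
```

An empty list only means the header looks plausible; it does not prove that the decoder will accept the file.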

     
  • mehmet

    mehmet - 2011-09-21

    The problem was caused by the audio file. Now it runs and terminates, but
    I have another problem. I was using the -hypseg parameter with
    pocketsphinx_batch to obtain the start frames of the words. Even when I
    pass this argument to pocketsphinx_continuous, it does not produce any
    output. The -hyp parameter does not work either. It prints the result to
    standard output without start frames, which is useless for me. How can I
    get the start frames? Thanks for your kind support.
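
Once start frames are available, they usually have to be turned into times. A small sketch of that conversion follows, assuming the default feature frame rate of 100 frames per second (the `-frate` setting); if the model was trained with a different rate, pass that value instead. The helper name `frame_to_seconds` and the `offset_seconds` parameter are made up for this example.

```python
def frame_to_seconds(frame, frame_rate=100, offset_seconds=0.0):
    """Convert a frame index to seconds from the start of the recording.

    frame_rate: feature frames per second (assumed to be 100 here).
    offset_seconds: where the decoded segment starts in the full recording,
    useful when a long file was cut into pieces before decoding.
    """
    return offset_seconds + frame / frame_rate


# A word starting at frame 250 of a segment that begins 60 s into the file:
print(frame_to_seconds(250, offset_seconds=60.0))  # 62.5
```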

     
  • Nickolay V. Shmyrev

    This feature is not supported in pocketsphinx_continuous.

     
  • mehmet

    mehmet - 2011-09-21

    Since I need the timings, I guess I have to use pocketsphinx_batch. What
    is the maximum length of audio file that pocketsphinx_batch can decode? I
    will split the audio files if their length exceeds this upper bound.
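
The splitting step mentioned above could be sketched as follows with Python's standard `wave` module. The thread does not state the actual upper bound, so the 60-second default is an arbitrary placeholder, and `split_wav` and the `prefix` naming scheme are hypothetical. Cutting at fixed times can also split a word in half, so cutting at silences would likely give better results.

```python
import wave


def split_wav(path, chunk_seconds=60, prefix="chunk"):
    """Split a WAV file into consecutive fixed-length pieces.

    Returns a list of (file_name, start_offset_in_seconds) pairs so that
    timings from each piece can be mapped back to the original recording.
    chunk_seconds is a placeholder, not a known pocketsphinx_batch limit.
    """
    outputs = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(chunk_seconds * src.getframerate())
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of input reached
            name = f"{prefix}_{index:03d}.wav"
            with wave.open(name, "wb") as dst:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected automatically when the file closes.
                dst.setparams(params)
                dst.writeframes(frames)
            outputs.append((name, float(index * chunk_seconds)))
            index += 1
    return outputs


# Example call (the file name is the one used in this thread):
# for name, start in split_wav("out.tmp.ready.wav", chunk_seconds=60):
#     print(name, start)
```

Keeping the start offset of each piece makes it possible to add it back to the per-word timings after decoding.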

     
  • mehmet

    mehmet - 2011-09-21

    Sorry for taking up your time; I should have searched more :)

     
