pocketsphinx gst plugin not loading hmm?

Help
luree
2012-03-27
2012-09-22
  • luree

    luree - 2012-03-27

    Good day!

    I'm trying to use pocketsphinx to decode audio from video files in another
    language using the gstreamer pocketsphinx plugin.

    However, it seems pocketsphinx does not use the files I specified, except for
    the dictionary. As a result, I get many errors saying that a phone is missing
    in the acoustic model, and many words in the dictionary are left out.

    The program works fine except for this. The decoder is able to produce partial
    and final results per utterance, as expected, and I am able to access and
    output these strings.

    I will be posting my code and the log.

    Any thoughts? Thanks! ^-^

     
  • luree

    luree - 2012-03-27

    The relevant part:

    int main (int argc, char *argv[])
    {
        GMainLoop *loop;
        GstElement *pipeline, *source, *vad, *asr, *captions;
        GstBus *bus;
        GError *error = NULL;
    
    /* Initialization */
        gst_init (&argc, &argv);
        loop = g_main_loop_new (NULL, FALSE);
    
    /* Create the pipeline */
    /* Will display decoding results on top of video */
        pipeline = gst_parse_launch("filesrc name=source ! decodebin2 name=mux ! tee name=t ! queue ! autoaudiosink t. ! queue ! audioconvert ! audioresample ! vader name=vad ! pocketsphinx name=asr mux. ! queue ! ffmpegcolorspace ! textoverlay name=captions ! autovideosink", &error);
    
        if (!pipeline) {
            g_error("Pipeline could not be created: %s", error->message);
            return -1;
        }
    
    /* Get element pointers from pipeline */
        source = gst_bin_get_by_name(GST_BIN(pipeline), "source");
        vad = gst_bin_get_by_name(GST_BIN(pipeline), "vad");
        asr = gst_bin_get_by_name(GST_BIN(pipeline), "asr");
        captions = gst_bin_get_by_name(GST_BIN(pipeline), "captions");
    
    /* Open the source file */
        g_object_set(G_OBJECT (source), "location", argv[1], NULL);
    
    /* Configure vader element */
        g_object_set(G_OBJECT (vad), "silent", TRUE, NULL);
    
    /* Force ASR to be initialized */
        g_object_set(G_OBJECT (asr), "configured", TRUE, NULL);
    
    /* Load Acoustic model, Language model, and dictionary */
        g_object_set(G_OBJECT (asr), "hmm", PS_MODEL "/hmm/news_SMS_models/mobileasr.cd_semi_8000", NULL);
        g_object_set(G_OBJECT (asr), "lm", PS_MODEL "/lm/news_SMS_models/mobileasr_NewsSMS.lm.DMP", NULL);
        g_object_set(G_OBJECT (asr), "dict", PS_MODEL "/lm/news_SMS_models/mobileasr_test.dic", NULL);
    
    /* Add a message handler */
        bus = gst_pipeline_get_bus (GST_PIPELINE (pipeline));
        gst_bus_add_watch (bus, bus_call, loop);
        gst_object_unref (bus);
    
    /* Set the pipeline to "playing" state */
        g_print ("Now playing: %s\n", argv[1]);
        gst_element_set_state (pipeline, GST_STATE_PLAYING);
        g_print ("Running...\n");
        g_main_loop_run (loop);
        g_print ("Returned, stopping playback\n");
        gst_element_set_state (pipeline, GST_STATE_NULL);
        g_print ("Deleting pipeline\n");
        gst_object_unref (GST_OBJECT (pipeline));
        return 0;
    }
    
     
  • luree

    luree - 2012-03-27

    The log file:

    INFO: cmd_ln.c(691): Parsing command line:
    gst-pocketsphinx \
        -samprate 8000 \
        -cmn prior \
        -nfft 256 \
        -fwdflat no \
        -bestpath no \
        -maxhmmpf 1000 \
        -maxwpf 10
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -alpha      0.97        9.700000e-01
    -ascale     20.0        2.000000e+01
    -aw     1       1
    -backtrace  no      no
    -beam       1e-48       1.000000e-48
    -bestpath   yes     no
    -bestpathlw 9.5     9.500000e+00
    -bghist     no      no
    -ceplen     13      13
    -cmn        current     prior
    -cmninit    8.0     8.0
    -compallsen no      no
    -debug              0
    -dict               
    -dictcase   no      no
    -dither     no      no
    -doublebw   no      no
    -ds     1       1
    -fdict              
    -feat       1s_c_d_dd   1s_c_d_dd
    -featparams         
    -fillprob   1e-8        1.000000e-08
    -frate      100     100
    -fsg                
    -fsgusealtpron  yes     yes
    -fsgusefiller   yes     yes
    -fwdflat    yes     no
    -fwdflatbeam    1e-64       1.000000e-64
    -fwdflatefwid   4       4
    -fwdflatlw  8.5     8.500000e+00
    -fwdflatsfwin   25      25
    -fwdflatwbeam   7e-29       7.000000e-29
    -fwdtree    yes     yes
    -hmm                
    -input_endian   little      little
    -jsgf               
    -kdmaxbbi   -1      -1
    -kdmaxdepth 0       0
    -kdtree             
    -latsize    5000        5000
    -lda                
    -ldadim     0       0
    -lextreedump    0       0
    -lifter     0       0
    -lm             
    -lmctl              
    -lmname     default     default
    -logbase    1.0001      1.000100e+00
    -logfn              
    -logspec    no      no
    -lowerf     133.33334   1.333333e+02
    -lpbeam     1e-40       1.000000e-40
    -lponlybeam 7e-29       7.000000e-29
    -lw     6.5     6.500000e+00
    -maxhmmpf   -1      1000
    -maxnewoov  20      20
    -maxwpf     -1      10
    -mdef               
    -mean               
    -mfclogdir          
    -min_endfr  0       0
    -mixw               
    -mixwfloor  0.0000001   1.000000e-07
    -mllr               
    -mmap       yes     yes
    -ncep       13      13
    -nfft       512     256
    -nfilt      40      40
    -nwpen      1.0     1.000000e+00
    -pbeam      1e-48       1.000000e-48
    -pip        1.0     1.000000e+00
    -pl_beam    1e-10       1.000000e-10
    -pl_pbeam   1e-5        1.000000e-05
    -pl_window  0       0
    -rawlogdir          
    -remove_dc  no      no
    -round_filters  yes     yes
    -samprate   16000       8.000000e+03
    -seed       -1      -1
    -sendump            
    -senlogdir          
    -senmgau            
    -silprob    0.005       5.000000e-03
    -smoothspec no      no
    -svspec             
    -tmat               
    -tmatfloor  0.0001      1.000000e-04
    -topn       4       4
    -topn_beam  0       0
    -toprule            
    -transform  legacy      legacy
    -unit_area  yes     yes
    -upperf     6855.4976   6.855498e+03
    -usewdphones    no      no
    -uw     1.0     1.000000e+00
    -var                
    -varfloor   0.0001      1.000000e-04
    -varnorm    no      no
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -wbeam      7e-29       7.000000e-29
    -wip        0.65        6.500000e-01
    -wlen       0.025625    2.562500e-02
    
    INFO: cmd_ln.c(691): Parsing command line:
    \
        -nfilt 20 \
        -lowerf 1 \
        -upperf 4000 \
        -wlen 0.025 \
        -transform dct \
        -round_filters no \
        -remove_dc yes \
        -svspec 0-12/13-25/26-38 \
        -feat 1s_c_d_dd \
        -agc none \
        -cmn current \
        -cmninit 56,-3,1 \
        -varnorm no
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -alpha      0.97        9.700000e-01
    -ceplen     13      13
    -cmn        current     current
    -cmninit    8.0     56,-3,1
    -dither     no      no
    -doublebw   no      no
    -feat       1s_c_d_dd   1s_c_d_dd
    -frate      100     100
    -input_endian   little      little
    -lda                
    -ldadim     0       0
    -lifter     0       0
    -logspec    no      no
    -lowerf     133.33334   1.000000e+00
    -ncep       13      13
    -nfft       512     256
    -nfilt      40      20
    -remove_dc  no      yes
    -round_filters  yes     no
    -samprate   16000       8.000000e+03
    -seed       -1      -1
    -smoothspec no      no
    -svspec             0-12/13-25/26-38
    -transform  legacy      dct
    -unit_area  yes     yes
    -upperf     6855.4976   4.000000e+03
    -varnorm    no      no
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -wlen       0.025625    2.500000e-02
    
    INFO: acmod.c(242): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
    INFO: feat.c(684): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: acmod.c(163): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(520): Reading model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(330): Reading binary model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
    INFO: bin_mdef.c(507): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
    INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(908): Loading senones from dump file /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
    INFO: s2_semi_mgau.c(932): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(1027): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1304): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: dict.c(306): Allocating 137542 * 32 bytes (4298 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary: /usr/local/share/pocketsphinx/model/lm/en_US/cmu07a.dic
    INFO: dict.c(212): Allocated 1010 KiB for strings, 1664 KiB for phones
    INFO: dict.c(324): 133436 words read
    INFO: dict.c(330): Reading filler dictionary: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 11 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
    INFO: dict2pid.c(131): Allocated 60400 bytes (58 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 60400 bytes (58 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
    INFO: ngram_model_dmp.c(242):     5001 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291):   436879 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317):   418286 = LM.trigrams read
    INFO: ngram_model_dmp.c(342):    37293 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362):    14370 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382):    36094 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410):      854 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466):     5001 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13428
    INFO: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels, 26 single-phone words
    INFO: ngram_search_fwdtree.c(430): TOTAL fwdtree 0.00 CPU -nan xRT
    INFO: ngram_search_fwdtree.c(433): TOTAL fwdtree 0.00 wall -nan xRT
    INFO: cmd_ln.c(691): Parsing command line:
    \
        -nfilt 20 \
        -lowerf 1 \
        -upperf 4000 \
        -wlen 0.025 \
        -transform dct \
        -round_filters no \
        -remove_dc yes \
        -svspec 0-12/13-25/26-38 \
        -feat 1s_c_d_dd \
        -agc none \
        -cmn current \
        -cmninit 56,-3,1 \
        -varnorm no
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -alpha      0.97        9.700000e-01
    -ceplen     13      13
    -cmn        current     current
    -cmninit    8.0     56,-3,1
    -dither     no      no
    -doublebw   no      no
    -feat       1s_c_d_dd   1s_c_d_dd
    -frate      100     100
    -input_endian   little      little
    -lda                
    -ldadim     0       0
    -lifter     0       0
    -logspec    no      no
    -lowerf     133.33334   1.000000e+00
    -ncep       13      13
    -nfft       512     256
    -nfilt      40      20
    -remove_dc  no      yes
    -round_filters  yes     no
    -samprate   16000       8.000000e+03
    -seed       -1      -1
    -smoothspec no      no
    -svspec             0-12/13-25/26-38
    -transform  legacy      dct
    -unit_area  yes     yes
    -upperf     6855.4976   4.000000e+03
    -varnorm    no      no
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -wlen       0.025625    2.500000e-02
    
    INFO: acmod.c(242): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
    INFO: feat.c(684): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: acmod.c(163): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(520): Reading model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(330): Reading binary model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
    INFO: bin_mdef.c(507): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
    INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(908): Loading senones from dump file /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
    INFO: s2_semi_mgau.c(932): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(1027): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1304): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: dict.c(306): Allocating 137542 * 32 bytes (4298 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary: /usr/local/share/pocketsphinx/model/lm/en_US/cmu07a.dic
    INFO: dict.c(212): Allocated 1010 KiB for strings, 1664 KiB for phones
    INFO: dict.c(324): 133436 words read
    INFO: dict.c(330): Reading filler dictionary: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 11 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
    INFO: dict2pid.c(131): Allocated 60400 bytes (58 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 60400 bytes (58 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
    INFO: ngram_model_dmp.c(242):     5001 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291):   436879 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317):   418286 = LM.trigrams read
    INFO: ngram_model_dmp.c(342):    37293 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362):    14370 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382):    36094 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410):      854 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466):     5001 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13428
    INFO: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels, 26 single-phone words
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=6685, 2=10803, 3=11727
    INFO: ngram_model_dmp.c(242):     6685 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291):    10803 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317):    11727 = LM.trigrams read
    INFO: ngram_model_dmp.c(342):     2265 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362):     1943 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382):      492 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410):       22 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466):     6685 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60 single-phone words
    INFO: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels, 26 single-phone words
    INFO: ngram_search_fwdtree.c(430): TOTAL fwdtree 0.00 CPU -nan xRT
    INFO: ngram_search_fwdtree.c(433): TOTAL fwdtree 0.00 wall -nan xRT
    INFO: cmd_ln.c(691): Parsing command line:
    \
        -nfilt 20 \
        -lowerf 1 \
        -upperf 4000 \
        -wlen 0.025 \
        -transform dct \
        -round_filters no \
        -remove_dc yes \
        -svspec 0-12/13-25/26-38 \
        -feat 1s_c_d_dd \
        -agc none \
        -cmn current \
        -cmninit 56,-3,1 \
        -varnorm no
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -alpha      0.97        9.700000e-01
    -ceplen     13      13
    -cmn        current     current
    -cmninit    8.0     56,-3,1
    -dither     no      no
    -doublebw   no      no
    -feat       1s_c_d_dd   1s_c_d_dd
    -frate      100     100
    -input_endian   little      little
    -lda                
    -ldadim     0       0
    -lifter     0       0
    -logspec    no      no
    -lowerf     133.33334   1.000000e+00
    -ncep       13      13
    -nfft       512     256
    -nfilt      40      20
    -remove_dc  no      yes
    -round_filters  yes     no
    -samprate   16000       8.000000e+03
    -seed       -1      -1
    -smoothspec no      no
    -svspec             0-12/13-25/26-38
    -transform  legacy      dct
    -unit_area  yes     yes
    -upperf     6855.4976   4.000000e+03
    -varnorm    no      no
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -wlen       0.025625    2.500000e-02
    
    INFO: acmod.c(242): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
    INFO: feat.c(684): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: acmod.c(163): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(520): Reading model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(330): Reading binary model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
    INFO: bin_mdef.c(507): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
    INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(908): Loading senones from dump file /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
    INFO: s2_semi_mgau.c(932): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(1027): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1304): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: dict.c(306): Allocating 20902 * 32 bytes (653 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary: /usr/local/share/pocketsphinx/model/lm/news_SMS_models/mobileasr_test.dic
    ERROR: "dict.c", line 194: Line 1: Phone 'IX' is mising in the acoustic model; word ''di' ignored
    ERROR: "dict.c", line 194: Line 3: Phone 'OX' is mising in the acoustic model; word ''ko' ignored
    ERROR: "dict.c", line 194: Line 4: Phone 'OX' is mising in the acoustic model; word ''kong' ignored
    ERROR: "dict.c", line 194: Line 6: Phone 'OX' is mising in the acoustic model; word ''no' ignored
    ERROR: "dict.c", line 194: Line 7: Phone 'AX' is mising in the acoustic model; word ''pag' ignored
    ERROR: "dict.c", line 194: Line 8: Phone 'IX' is mising in the acoustic model; word ''pinas' ignored
    ERROR: "dict.c", line 194: Line 16794: Phone 'H' is mising in the acoustic model; word 'zucchini' ignored
    INFO: dict.c(212): Allocated 12 KiB for strings, 20 KiB for phones
    INFO: dict.c(324): 2327 words read
    INFO: dict.c(330): Reading filler dictionary: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 11 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
    INFO: dict2pid.c(131): Allocated 60400 bytes (58 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 60400 bytes (58 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=6685, 2=10803, 3=11727
    INFO: ngram_model_dmp.c(242):     6685 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291):    10803 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317):    11727 = LM.trigrams read
    INFO: ngram_model_dmp.c(342):     2265 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362):     1943 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382):      492 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410):       22 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466):     6685 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 374 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 26 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 26 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 2738
    INFO: ngram_search_fwdtree.c(338): after: 336 root, 2610 non-root channels, 23 single-phone words
    Now playing: 05-111101-001.flv
    Running...
    
     
  • Nickolay V. Shmyrev

    You need to set the "configured" property as the last step. You need to set
    the "hmm" property before the "configured" property.
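
    A minimal sketch of the ordering described above, using only the property
    names that appear in the code posted earlier in this thread. The helper name
    configure_asr and its parameters are hypothetical, and the fragment assumes
    the same GStreamer pocketsphinx plugin is installed. It is a configuration
    fragment, not a complete program.

    ```c
    #include <gst/gst.h>

    /* Hypothetical helper (not part of the plugin): point the pocketsphinx
     * element at the model files first, then set "configured" last. */
    static void
    configure_asr (GstElement *asr, const gchar *hmm, const gchar *lm,
                   const gchar *dict)
    {
        /* Acoustic model, language model, and dictionary come first */
        g_object_set (G_OBJECT (asr), "hmm", hmm, NULL);
        g_object_set (G_OBJECT (asr), "lm", lm, NULL);
        g_object_set (G_OBJECT (asr), "dict", dict, NULL);

        /* Only now ask the element to initialize the decoder */
        g_object_set (G_OBJECT (asr), "configured", TRUE, NULL);
    }
    ```

    In the program above, a call such as configure_asr (asr, hmm_path, lm_path,
    dict_path) would take the place of the four separate g_object_set calls on
    asr, where the three path variables stand for the PS_MODEL strings already
    used there.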

     
  • luree

    luree - 2012-03-27

    I see, thank you very much!

    I'm not sure if I can ask this in the same thread, but..

    Now I'm encountering a problem I do not understand.
    Sometimes, when I run the program, the decoder initialization stops at this
    error:

    INFO: acmod.c(242): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/hmm/news_SMS_models/mobileasr.cd_semi_8000/feat.params
    FATAL_ERROR: "fe_sigproc.c", line 399: WTF, 5078.125000 < -15.625000 > 5734.375000
    

    Sometimes this happens; other times, everything works fine. I do not see a
    pattern in when it fails, and I have not seen this error when using the
    default models.

    Is this a problem with the models I'm using?

    Again, thanks for the help! ^-^

     
  • Nickolay V. Shmyrev

    > Sometimes, when I run the program, the decoder initialization stops at this
    > error:

    This means that the sample rate is sometimes not properly set or not properly
    negotiated in the pipeline. It might be that the other model doesn't properly
    configure the frontend through its feat.params options, or something else.
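
    As a rough illustration of what a sample rate mismatch looks like in numbers,
    the sketch below checks whether a filterbank range fits under the Nyquist
    frequency (half the sample rate) and computes the spacing between DFT points.
    Both function names are hypothetical and are not part of sphinxbase; the
    inputs are the -samprate, -nfft, -lowerf and -upperf values shown in the log.
    In the error above, both bounds (5078.125 and 5734.375) lie above the 4000 Hz
    Nyquist limit of 8 kHz audio, and 15.625 equals, for example, 8000 / 512.
    That is consistent with a filterbank that does not fit the sample rate, but
    it is only one possible reading of the message.

    ```c
    #include <stdio.h>

    /* Hypothetical check: returns 1 when [lowerf, upperf] is a valid range
     * that fits below the Nyquist frequency of the given sample rate. */
    static int
    filterbank_range_ok (double samprate, double lowerf, double upperf)
    {
        double nyquist = samprate / 2.0;
        return lowerf >= 0.0 && lowerf < upperf && upperf <= nyquist;
    }

    /* Spacing in Hz between adjacent DFT points for a given FFT size. */
    static double
    dft_bin_spacing (double samprate, int nfft)
    {
        return samprate / (double) nfft;
    }

    /* Print a one-line summary of a front-end configuration. */
    static void
    report_frontend (double samprate, int nfft, double lowerf, double upperf)
    {
        printf ("samprate %.1f, nfft %d, spacing %.3f Hz, range %s\n",
                samprate, nfft, dft_bin_spacing (samprate, nfft),
                filterbank_range_ok (samprate, lowerf, upperf)
                    ? "fits" : "does not fit");
    }
    ```

    For instance, report_frontend (8000.0, 256, 1.0, 4000.0) describes the
    settings from the log and reports a range that fits, while the default
    -upperf of 6855.4976 combined with an 8000 Hz sample rate does not.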

    I recommend that you use the latest version; its messages are better.

                    E_FATAL("Failed to create filterbank, frequency range does not match. "
                            "Sample rate %f, FFT size %d, lowerf %f < freq %f > upperf %f.\n", mel_fb->s
    
     
