pocketsphinx program listens and then crashes

Help
Shane
2011-05-09
2012-09-22
  • Shane

    Shane - 2011-05-09

    Hello,

    I have written a small program using the pocketsphinx library. It basically
    combines code from the HelloWorld tutorial and continuous.c. When I run it, it
    waits for the user to talk, processes the speech, and prints an initial
    result. It then prints READY... again and crashes with "Access violation
    writing location 0xfeeeff12". Visual Studio then opens _file.c and shows the
    call stack. Is there anything else I should provide to help solve the
    problem?

    Thanks.

     
  • Nickolay V. Shmyrev

    You need to provide at least the call stack and your application source code.

     
  • Shane

    Shane - 2011-05-17

    Sorry it's taken me so long to get back to you. Ok, here is what is displayed
    on the console:

    INFO: cmd_ln.c(559): Parsing command line:
    \
    -hmm /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k \
    -lm /pocketsphinx-0.7/model/lm/en/turtle.DMP \
    -dict /pocketsphinx-0.7/model/lm/en/turtle.dic

    Current configuration:

    -agc none none
    -agcthresh 2.0 2.000000e+000
    -alpha 0.97 9.700000e-001
    -ascale 20.0 2.000000e+001
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-048
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+000
    -bghist no no
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -compallsen no no
    -debug 0
    -dict /pocketsphinx-0.7/model/lm/en/turtle.dic
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-008
    -frate 100 100
    -fsg
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-064
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+000
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-029
    -fwdtree yes yes
    -hmm /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k
    -input_endian little little
    -jsgf
    -kdmaxbbi -1 -1
    -kdmaxdepth 0 0
    -kdtree
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lextreedump 0 0
    -lifter 0 0
    -lm /pocketsphinx-0.7/model/lm/en/turtle.DMP
    -lmctl
    -lmname default default
    -logbase 1.0001 1.000100e+000
    -logfn
    -logspec no no
    -lowerf 133.33334 1.333333e+002
    -lpbeam 1e-40 1.000000e-040
    -lponlybeam 7e-29 7.000000e-029
    -lw 6.5 6.500000e+000
    -maxhmmpf -1 -1
    -maxnewoov 20 20
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-007
    -mllr
    -mmap yes yes
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -nwpen 1.0 1.000000e+000
    -pbeam 1e-48 1.000000e-048
    -pip 1.0 1.000000e+000
    -pl_beam 1e-10 1.000000e-010
    -pl_pbeam 1e-5 1.000000e-005
    -pl_window 0 0
    -rawlogdir
    -remove_dc no no
    -round_filters yes yes
    -samprate 16000 1.600000e+004
    -seed -1 -1
    -sendump
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-003
    -smoothspec no no
    -svspec
    -tmat
    -tmatfloor 0.0001 1.000000e-004
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+003
    -usewdphones no no
    -uw 1.0 1.000000e+000
    -var
    -varfloor 0.0001 1.000000e-004
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-029
    -wip 0.65 6.500000e-001
    -wlen 0.025625 2.562500e-002

    INFO: cmd_ln.c(559): Parsing command line:
    \
    -nfilt 20 \
    -lowerf 1 \
    -upperf 4000 \
    -wlen 0.025 \
    -transform dct \
    -round_filters no \
    -remove_dc yes \
    -svspec 0-12/13-25/26-38 \
    -feat 1s_c_d_dd \
    -agc none \
    -cmn current \
    -cmninit 56,-3,1 \
    -varnorm no

    Current configuration:

    -agc none none
    -agcthresh 2.0 2.000000e+000
    -alpha 0.97 9.700000e-001
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 56,-3,1
    -dither no no
    -doublebw no no
    -feat 1s_c_d_dd 1s_c_d_dd
    -frate 100 100
    -input_endian little little
    -lda
    -ldadim 0 0
    -lifter 0 0
    -logspec no no
    -lowerf 133.33334 1.000000e+000
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 20
    -remove_dc no yes
    -round_filters yes no
    -samprate 16000 1.600000e+004
    -seed -1 -1
    -smoothspec no no
    -svspec 0-12/13-25/26-38
    -transform legacy dct
    -unit_area yes yes
    -upperf 6855.4976 4.000000e+003
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wlen 0.025625 2.500000e-002

    INFO: acmod.c(242): Parsed model-specific feature parameters from /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k/feat.params
    INFO: feat.c(697): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean= 12.00, mean= 0.0
    INFO: acmod.c(163): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(520): Reading model definition: /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k/mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(330): Reading binary model definition: /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k/mdef
    INFO: bin_mdef.c(507): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
    INFO: tmat.c(205): Reading HMM transition probability matrices: /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(908): Loading senones from dump file /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k/sendump
    INFO: s2_semi_mgau.c(932): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(1027): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1304): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: dict.c(306): Allocating 4217 * 20 bytes (82 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary: /pocketsphinx-0.7/model/lm/en/turtle.dic
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(324): 110 words read
    INFO: dict.c(330): Reading filler dictionary: /pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k/noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 11 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
    INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=91, 2=212, 3=177
    INFO: ngram_model_dmp.c(242): 91 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291): 212 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317): 177 = LM.trigrams read
    INFO: ngram_model_dmp.c(342): 20 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362): 12 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382): 12 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410): 1 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466): 91 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 67 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 15 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 15 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 328
    INFO: ngram_search_fwdtree.c(338): after: 67 root, 200 non-root channels, 14 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    Allocating 32 buffers of 2500 samples each
    READY...
    Listening...
    Stopped listenting, please wait...
    INFO: cmn_prior.c(121): cmn_prior_update: from < 56.00 -3.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
    INFO: cmn_prior.c(139): cmn_prior_update: to < 44.35 -0.33 1.71 0.31 0.53 0.21 0.60 0.46 0.21 -0.00 0.53 -0.16 -0.04 >
    INFO: ngram_search_fwdtree.c(1549): 1497 words recognized (18/fr)
    INFO: ngram_search_fwdtree.c(1551): 64134 senones evaluated (782/fr)
    INFO: ngram_search_fwdtree.c(1553): 37868 channels searched (461/fr), 5226 1st, 21443 last
    INFO: ngram_search_fwdtree.c(1557): 2009 words for which last channels evaluated (24/fr)
    INFO: ngram_search_fwdtree.c(1560): 3163 candidate words for entering last phone (38/fr)
    INFO: ngram_search_fwdtree.c(1562): fwdtree 0.11 CPU 0.133 xRT
    INFO: ngram_search_fwdtree.c(1565): fwdtree 1.54 wall 1.879 xRT
    INFO: ngram_search_fwdflat.c(305): Utterance vocabulary contains 30 words
    INFO: ngram_search_fwdflat.c(940): 1040 words recognized (13/fr)
    INFO: ngram_search_fwdflat.c(942): 53132 senones evaluated (648/fr)
    INFO: ngram_search_fwdflat.c(944): 49757 channels searched (606/fr)
    INFO: ngram_search_fwdflat.c(946): 2311 words searched (28/fr)
    INFO: ngram_search_fwdflat.c(948): 1596 word transitions (19/fr)
    INFO: ngram_search_fwdflat.c(951): fwdflat 0.06 CPU 0.076 xRT
    INFO: ngram_search_fwdflat.c(954): fwdflat 0.06 wall 0.077 xRT
    INFO: ngram_search.c(1201): not found in last frame, using hundred.80 instead
    INFO: ngram_search.c(1253): lattice start node .0 end node hundred.45
    INFO: ngram_search.c(1281): Eliminated 8 nodes before end node
    INFO: ngram_search.c(1386): Lattice has 92 nodes, 167 links
    INFO: ps_lattice.c(1352): Normalizer P(O) = alpha(hundred:45:80) = -647604
    INFO: ps_lattice.c(1390): Joint P(O,S) = -655953 P(S|O) = -8349
    INFO: ngram_search.c(875): bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(878): bestpath 0.00 wall 0.005 xRT
    000000000: you hundred
    READY...

    My Code:

    #define DEFAULT_DEVICE NULL
    
    int _tmain(int argc, _TCHAR* argv[])
    {
        ps_decoder_t *ps;
        cmd_ln_t *config;
        FILE *fh;
        char const *hyp, *uttid;
        int16 buf[512];
        int rv;
        int32 score;
    
        config = cmd_ln_init(NULL, ps_args(), TRUE,
                            "-hmm", "../pocketsphinx-0.7/model/hmm/en_US/hub4wsj_sc_8k",
                            "-lm", "../pocketsphinx-0.7/model/lm/en/turtle.DMP",
                            "-dict", "../pocketsphinx-0.7/model/lm/en/turtle.dic",
                            NULL);
        if(config == NULL)
        {
            printf("Config Initialization failed");
            system("Pause");
            return 1;
        }
    
        //PS Initializing
        ps = ps_init(config);
        if (ps == NULL)
                return 1;
    
        ad_rec_t* ad;
        int16 adbuf[4096];
        int32 k, ts, rem;
        cont_ad_t *cont;
        char word[256];
    
        if((ad = ad_open_dev(DEFAULT_DEVICE, DEFAULT_SAMPLES_PER_SEC)) == NULL)
        {
            printf("Falied to open audio device\n");
            system("pause");
            return 1;
        }
    
        if((cont = cont_ad_init(ad, ad_read)) == NULL)
        {
            printf("Failed to initialize voice activity detection\n");
            system("pause");
            return 1;
        }
    
        if (ad_start_rec(ad) < 0) {
            printf("Failed to start recording");
            return 1;
        }
    
        if(cont_ad_calib(cont) < 0)
        {
            printf("Failed to calibrate voice activity detection\n");
            printf_s("Failed with code %i\n",cont_ad_calib(cont));
            system("pause");
            return 1;
        }
    
        for(;;)
        { 
            printf("READY...\n");
            fflush(stdout);
            fflush(stderr);
    
            //Wait for next utterance
            while((k = cont_ad_read(cont, adbuf, 4096)) == 0)
                Sleep((DWORD)100);
    
            if(k < 0)
            {
                printf("Failed to read audio\n");
                return 1;
            }
    
            if(ps_start_utt(ps,NULL) < 0)
            {
                printf("Failed to start utterance\n");
                return 1;
            }
            ps_process_raw(ps,adbuf,k,FALSE,FALSE);
            printf("Listening...\n");
            fflush(stdout);
    
            //Note the timestamp for the first block of data
            ts = cont->read_ts;
    
            for(;;)
            { 
                //Read non-silence audio data, if any, from continuous listening module
                if((k = cont_ad_read(cont,adbuf,4096)) < 0)
                {
                    printf("Failed to read audio\n");
                    return 1;
                }
                if(k == 0)
                {
                    /*
                     * No speech data available; check current timestamp with the most recent
                     * speech to see if more than 1 sec elapsed. If so, end utterance.
                     */ 
                    if((cont->read_ts - ts) > DEFAULT_SAMPLES_PER_SEC)
                        break;
                }
                else
                {  //New speech data received; note current timestamp
                    ts = cont->read_ts;
                }
    
                //Decode whatever was read above
                rem = ps_process_raw(ps,adbuf,k,FALSE,FALSE);
    
                //If no work to be done, sleep a bit
                if((rem == 0) && (k == 0))
                    Sleep((DWORD)20);
            }
    
            /*
             * Utterance ended; flush any accumulated, unprocesses A/D data and stop
             * listening until current utterance is completely decoded
             */
            ad_stop_rec(ad);
            while(ad_read(ad, adbuf, 4096) >= 0);
            cont_ad_reset(cont);
    
            printf("Stopped listenting, please wait...\n");
            fflush(stdout);
    
            //Finish decoding, obtain and print result
            ps_end_utt(ps);
            hyp = ps_get_hyp(ps, NULL, &uttid);
            printf("%s: %s\n", uttid, hyp);
            fflush(stdout);
    
            //Exit if the first word spoken was Goodbye
            if(hyp)
            {
                sscanf(hyp,"%s",word);
                if(strcmp(word,"goodbye") == 0)
                    break;
            }
    
            //resume a/d recording for next utterance
            if(ad_start_rec(ad) < 0)
            {
                printf("Failed to start recording");
                return 1;
            }
    
            cont_ad_close(cont);
            ad_close(ad);
        }
    
        ps_free(ps);
    
        system("Pause");
    
        return 0;
    }
    

    And finally the Call Stack:

    ntdll.dll!771422c2()

    msvcr100d.dll!_lock_file(_iobuf * pf) Line 237 C
    msvcr100d.dll!fprintf(_iobuf * str, const char * format, ...) Line 63 + 0x9 bytes C
    sphinxbase.dll!cont_ad_read(cont_ad_t * r, short * buf, int max) Line 879 + 0x1b bytes C
    sphinxTest.exe!wmain(int argc, wchar_t * * argv) Line 82 + 0x1b bytes C++
    sphinxTest.exe!__tmainCRTStartup() Line 552 + 0x19 bytes C
    sphinxTest.exe!wmainCRTStartup() Line 371 C
    kernel32.dll!763433ca()
    ntdll.dll!77159ed2()
    ntdll.dll!77159ea5()

     
  • Shane

    Shane - 2011-05-22

    Any thoughts?

     
  • Nickolay V. Shmyrev

    Hello

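    Sorry for the delay. With the information provided it's actually clear why it
    crashes. You close the audio device (ad) at the end of the for loop body and
    then continue to read from it on the next iteration. You need to move the
    cont_ad_close and ad_close calls shown below out of the for loop, so they run
    only once after the loop exits.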

            cont_ad_close(cont);
            ad_close(ad);
        }
    
        ps_free(ps);
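
    The lifetime rule behind this fix can be sketched without the audio
    library. The following is only an illustration, not sphinxbase code: the
    type mock_dev_t and the functions mock_open, mock_read, mock_close and
    run_utterances are hypothetical stand-ins for the device handle and for
    ad_open_dev, cont_ad_read and ad_close. The handle is opened once before the
    loop, used on every iteration, and released once after the loop.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an audio device handle (NOT sphinxbase's ad_rec_t). */
typedef struct {
    int reads; /* number of successful reads so far */
} mock_dev_t;

/* Hypothetical open: allocate the handle once, before the loop. */
mock_dev_t *mock_open(void)
{
    mock_dev_t *d = malloc(sizeof *d);
    if (d != NULL)
        d->reads = 0;
    return d;
}

/* Hypothetical read: only valid while the handle has not been closed. */
int mock_read(mock_dev_t *d)
{
    assert(d != NULL);
    d->reads++;
    return 4096; /* pretend one full buffer of samples was read */
}

/* Hypothetical close: frees the handle, which must not be used afterwards. */
void mock_close(mock_dev_t *d)
{
    free(d);
}

/* Handle n utterances; returns the number of successful reads, or -1 on error. */
int run_utterances(int n)
{
    mock_dev_t *dev = mock_open();
    int done = 0;

    if (dev == NULL)
        return -1;

    for (int i = 0; i < n; i++) {
        if (mock_read(dev) < 0) {
            done = -1;
            break;
        }
        done++;
        /* No close here: the next iteration still needs the handle. */
    }

    mock_close(dev); /* released exactly once, after the loop */
    return done;
}
```

    In the program above, the equivalent change would be to place
    cont_ad_close(cont) and ad_close(ad) after the closing brace of the outer
    for(;;) loop, next to ps_free(ps).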
    
     
  • Shane

    Shane - 2011-05-23

    Wow, that really is obvious. Thanks for the help, I appreciate it.

     
