Grammar and PocketSphinx Android demo

2012-07-06
2012-09-22
  • Sinisa Suzic

    Sinisa Suzic - 2012-07-06

    Hi,
    I managed to build and run the PocketSphinx Android demo project. Now I want
    it to use a simple grammar that I wrote.
    What do I have to change in my project?

    When I tried to convert my .gram file to .fsg using sphinx_jsgf2fsg, I got the
    following output in the terminal:

    INFO: jsgf.c(546): Defined rule: PUBLIC <pizza.startPizza>
    INFO: jsgf.c(546): Defined rule: <pizza.size>
    INFO: jsgf.c(546): Defined rule: <pizza.topping>
    Segmentation fault (core dumped)
    

    Is there another way to convert .gram to .fsg files?

     
  • Nickolay V. Shmyrev

    What do I have to change in my project?

    Put the JSGF grammar on the file system. Change decoder initialization call
    from:

                    c.setString("-lm",
                                    "/sdcard/Android/data/edu.cmu.pocketsphinx/lm/en_US/hub4.5000.DMP");
    

    to

                    c.setString("-jsgf",
                                    "/sdcard/Android/data/edu.cmu.pocketsphinx/your.jsgf");
    

    When I tried to convert my .gram file to .fsg using sphinx_jsgf2fsg, I got the
    following output in the terminal:

    There is no need to convert to FSG; you can use the JSGF grammar directly. To
    solve the specific conversion issue, you need to invoke the tool with the
    proper command-line arguments. Since you didn't provide the exact command line
    you were using, it's hard to say what was wrong there.
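
The change described above (put the grammar on the file system, then pass its path with "-jsgf" instead of passing a language model with "-lm") can be sketched in plain Java. This is only an illustration under stated assumptions: the class GrammarSetup and its methods are hypothetical, the Map stands in for the demo's config object, and the words inside the grammar are invented. Only the grammar and rule names (pizza, startPizza, size, topping) and the "-jsgf" option come from this thread. Every word used in a grammar would also need an entry in the dictionary the decoder loads.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;

class GrammarSetup {
    // Hypothetical grammar. Only the grammar and rule names come from the
    // thread; the words inside the rules are invented for illustration.
    static final String PIZZA_JSGF =
            "#JSGF V1.0;\n"
            + "grammar pizza;\n"
            + "public <startPizza> = i want a <size> pizza with <topping>;\n"
            + "<size> = small | medium | large;\n"
            + "<topping> = cheese | mushrooms | olives;\n";

    /** Writes the grammar text to dir/fileName and returns the file. */
    static File writeGrammar(File dir, String fileName) {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IllegalStateException("Cannot create " + dir);
        }
        File file = new File(dir, fileName);
        try (Writer out = new OutputStreamWriter(new FileOutputStream(file), "UTF-8")) {
            out.write(PIZZA_JSGF);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file;
    }

    /** Builds the option that replaces "-lm"; the Map is a stand-in for the config object. */
    static Map<String, String> searchOptions(File grammarFile) {
        // Fail early in Java instead of handing the decoder a missing file.
        if (!grammarFile.canRead()) {
            throw new IllegalArgumentException("Grammar not readable: " + grammarFile);
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("-jsgf", grammarFile.getPath());
        return options;
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "pocketsphinx-demo");
        File grammar = writeGrammar(dir, "your.jsgf");
        System.out.println(searchOptions(grammar));
    }
}
```

In the demo, each entry of the returned map would be passed with c.setString(key, value), as in the call shown above. Checking that the file is readable before the decoder is created makes a wrong path show up as a Java exception.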

     
  • Sinisa Suzic

    Sinisa Suzic - 2012-07-09

    Hi,

    I put the my_grammar.jsgf file in the /sdcard/Android/data/edu.cmu.pocketsphinx
    folder. I also changed the decoder initialization call, but the app crashed
    (closed with no message). The following messages were in the log file:

    signalling START
    signalled START
    gotSTART
    START
    

    All of them are from the RecognizerTask class.
    While debugging the application, I found that the problem appears at the line

    this.ps.startUtt();
    

    It is in the run function of the RecognizerTask class.
    What could be the problem?

    Since you didn't provide the exact command line you were using it's hard to
    say what was wrong there.

    I used the following command in Cygwin:

    sphinx_jsgf2fsg input.jsgf  output.fsg
    
     
  • Nickolay V. Shmyrev

    The following messages were in the log file:

    This is not the right log. The log is created in the file named

    /sdcard/Android/data/edu.cmu.pocketsphinx/pocketsphinx.log
    

    on the device filesystem. You can check it for details.

    I used the following command in Cygwin:

    Thanks. It should work now; it looks like you are not using the latest
    sphinxbase version.

     
  • Sinisa Suzic

    Sinisa Suzic - 2012-07-09

    This is not the right log.

    That was console output :)

    I've looked at the pocketsphinx.log file, and this part was interesting:

    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
    INFO: ngram_model_dmp.c(242):     5001 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291):   436879 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317):   418286 = LM.trigrams read
    INFO: ngram_model_dmp.c(342):    37293 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362):    14370 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382):    36094 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410):      854 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466):     5001 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 457 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 27 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 27 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13439
    INFO: ngram_search_fwdtree.c(338): after: 457 root, 13311 non-root channels, 26 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    

    Does this mean that the decoder is still trying to use a statistical language
    model instead of a grammar?

     
  • Nickolay V. Shmyrev

    Does this mean that the decoder is still trying to use a statistical language
    model instead of a grammar?

    Yes, it seems you didn't update the decoder initialization call properly as
    described above.
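
The check discussed above can be automated as a rough sketch: if the log contains lines from the n-gram source files seen in the excerpt (ngram_model_*.c, ngram_search_*.c), the decoder loaded a statistical language model. The class and method names below are hypothetical, and the test relies only on the file names that appear in the log quoted in this thread. The absence of such lines does not by itself prove that a grammar was loaded.

```java
import java.util.Arrays;
import java.util.List;

class SearchModeCheck {
    /** True if any log line comes from the n-gram (statistical LM) code. */
    static boolean loadedStatisticalLm(List<String> logLines) {
        for (String line : logLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("INFO: ngram_model_")
                    || trimmed.startsWith("INFO: ngram_search_")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Three lines copied from the log excerpt in this thread.
        List<String> excerpt = Arrays.asList(
                "INFO: dict.c(333): 11 words read",
                "INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286",
                "INFO: ngram_search_fwdtree.c(186): Creating search tree");
        System.out.println(loadedStatisticalLm(excerpt)); // prints "true"
    }
}
```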

     
  • Sinisa Suzic

    Sinisa Suzic - 2012-07-12

    I'm sure I updated everything properly. In Android LogCat I get a fatal signal
    11 (SIGSEGV), which occurs in the Android native libraries, usually when the
    Android internal storage system tries to access an unknown memory area.
    Could it be a problem in the startUtt method?

     
  • Nickolay V. Shmyrev

    I'm sure I updated everything properly

    I'm not

    In Android LogCat I get fatal signal 11 SIGSEGV which occurs in Android
    Native libraries

    The crash is often caused by earlier errors, which are reported in the
    pocketsphinx.log file. You need to provide the full contents of that file
    as it appears on your filesystem. That will help you solve the problem
    faster. Sorry, I do not have remote access to your phone to check
    that file for you.
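
Since the advice above is to look for earlier errors in pocketsphinx.log, a small helper can pull them out of a long log. This is a minimal sketch with hypothetical names (LogErrorScan, errorLines). It assumes that error lines contain the word ERROR, in the same way that the informational lines quoted in this thread start with INFO:, and that the log has first been copied from the device to the machine running the code (for example with adb pull).

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class LogErrorScan {
    /** Returns the lines that contain "ERROR" (an assumed marker). */
    static List<String> errorLines(Iterable<String> lines) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (line.contains("ERROR")) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    /** Reads a text file into a list of lines. */
    static List<String> readLines(String path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LogErrorScan <path to pocketsphinx.log>");
            return;
        }
        for (String line : errorLines(readLines(args[0]))) {
            System.out.println(line);
        }
    }
}
```

An empty result would only mean that no line matched the assumed marker, so the full log is still worth posting when asking for help.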

    Could it be a problem in startUtt method?

    No

     
  • Sinisa Suzic

    Sinisa Suzic - 2012-07-13

    Here is the content of my pocketsphinx.log file:

    INFO: cmd_ln.c(512): Parsing command line:
    
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -alpha      0.97        9.700000e-01
    -ascale     20.0        2.000000e+01
    -aw     1       1
    -backtrace  no      no
    -beam       1e-48       1.000000e-48
    -bestpath   yes     yes
    -bestpathlw 9.5     9.500000e+00
    -bghist     no      no
    -ceplen     13      13
    -cmn        current     current
    -cmninit    8.0     8.0
    -compallsen no      no
    -debug              0
    -dict               
    -dictcase   no      no
    -dither     no      no
    -doublebw   no      no
    -ds     1       1
    -fdict              
    -feat       1s_c_d_dd   1s_c_d_dd
    -featparams         
    -fillprob   1e-8        1.000000e-08
    -frate      100     100
    -fsg                
    -fsgusealtpron  yes     yes
    -fsgusefiller   yes     yes
    -fwdflat    yes     yes
    -fwdflatbeam    1e-64       1.000000e-64
    -fwdflatefwid   4       4
    -fwdflatlw  8.5     8.500000e+00
    -fwdflatsfwin   25      25
    -fwdflatwbeam   7e-29       7.000000e-29
    -fwdtree    yes     yes
    -hmm                
    -input_endian   little      little
    -jsgf               
    -kdmaxbbi   -1      -1
    -kdmaxdepth 0       0
    -kdtree             
    -latsize    5000        5000
    -lda                
    -ldadim     0       0
    -lextreedump    0       0
    -lifter     0       0
    -lm             
    -lmctl              
    -lmname     default     default
    -logbase    1.0001      1.000100e+00
    -logfn              
    -logspec    no      no
    -lowerf     133.33334   1.333333e+02
    -lpbeam     1e-40       1.000000e-40
    -lponlybeam 7e-29       7.000000e-29
    -lw     6.5     6.500000e+00
    -maxhmmpf   -1      -1
    -maxnewoov  20      20
    -maxwpf     -1      -1
    -mdef               
    -mean               
    -mfclogdir          
    -min_endfr  0       0
    -mixw               
    -mixwfloor  0.0000001   1.000000e-07
    -mllr               
    -mmap       yes     yes
    -ncep       13      13
    -nfft       512     512
    -nfilt      40      40
    -nwpen      1.0     1.000000e+00
    -pbeam      1e-48       1.000000e-48
    -pip        1.0     1.000000e+00
    -pl_beam    1e-10       1.000000e-10
    -pl_pbeam   1e-5        1.000000e-05
    -pl_window  0       0
    -rawlogdir          
    -remove_dc  no      no
    -round_filters  yes     yes
    -samprate   16000       1.600000e+04
    -seed       -1      -1
    -sendump            
    -senlogdir          
    -senmgau            
    -silprob    0.005       5.000000e-03
    -smoothspec no      no
    -svspec             
    -tmat               
    -tmatfloor  0.0001      1.000000e-04
    -topn       4       4
    -topn_beam  0       0
    -toprule            
    -transform  legacy      legacy
    -unit_area  yes     yes
    -upperf     6855.4976   6.855498e+03
    -usewdphones    no      no
    -uw     1.0     1.000000e+00
    -var                
    -varfloor   0.0001      1.000000e-04
    -varnorm    no      no
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -wbeam      7e-29       7.000000e-29
    -wip        0.65        6.500000e-01
    -wlen       0.025625    2.562500e-02
    
    INFO: cmd_ln.c(512): Parsing command line:
    \
        -nfilt 20 \
        -lowerf 1 \
        -upperf 4000 \
        -wlen 0.025 \
        -transform dct \
        -round_filters no \
        -remove_dc yes \
        -svspec 0-12/13-25/26-38 \
        -feat 1s_c_d_dd \
        -agc none \
        -cmn current \
        -cmninit 56,-3,1 \
        -varnorm no
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -alpha      0.97        9.700000e-01
    -ceplen     13      13
    -cmn        current     current
    -cmninit    8.0     56,-3,1
    -dither     no      no
    -doublebw   no      no
    -feat       1s_c_d_dd   1s_c_d_dd
    -frate      100     100
    -input_endian   little      little
    -lda                
    -ldadim     0       0
    -lifter     0       0
    -logspec    no      no
    -lowerf     133.33334   1.000000e+00
    -ncep       13      13
    -nfft       512     512
    -nfilt      40      20
    -remove_dc  no      yes
    -round_filters  yes     no
    -samprate   16000       8.000000e+03
    -seed       -1      -1
    -smoothspec no      no
    -svspec             0-12/13-25/26-38
    -transform  legacy      dct
    -unit_area  yes     yes
    -upperf     6855.4976   4.000000e+03
    -varnorm    no      no
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -wlen       0.025625    2.500000e-02
    
    INFO: acmod.c(242): Parsed model-specific feature parameters from /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/hmm/hub4wsj_sc_8k/feat.params
    INFO: feat.c(860): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: acmod.c(163): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(520): Reading model definition: /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/hmm/hub4wsj_sc_8k/mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(330): Reading binary model definition: /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/hmm/hub4wsj_sc_8k/mdef
    INFO: bin_mdef.c(507): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
    INFO: tmat.c(205): Reading HMM transition probability matrices: /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/hmm/hub4wsj_sc_8k/transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/hmm/hub4wsj_sc_8k/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/hmm/hub4wsj_sc_8k/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(294):  256x13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(908): Loading senones from dump file /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/hmm/hub4wsj_sc_8k/sendump
    INFO: s2_semi_mgau.c(932): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(1027): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1304): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: phone_loop_search.c(105): State beam -230231 Phone exit beam -115115 Insertion penalty 0
    INFO: dict.c(306): Allocating 10319 * 20 bytes (201 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary: /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/lm/hub4.5000.dic
    INFO: dict.c(212): Allocated 44 KiB for strings, 69 KiB for phones
    INFO: dict.c(324): 6212 words read
    INFO: dict.c(330): Reading filler dictionary: /sdcard/Android/data/ca.ilanguage.labs.pocketsphinx/hmm/hub4wsj_sc_8k/noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 11 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
    INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
    INFO: ngram_model_dmp.c(242):     5001 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291):   436879 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317):   418286 = LM.trigrams read
    INFO: ngram_model_dmp.c(342):    37293 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362):    14370 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382):    36094 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410):      854 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466):     5001 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 457 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 27 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 27 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13439
    INFO: ngram_search_fwdtree.c(338): after: 457 root, 13311 non-root channels, 26 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    INFO: cmd_ln.c(512): Parsing command line:
    
     
