
Segmentation fault with fsg grammar

Help
2011-09-12
2012-09-22
  • NGUYEN dang-khoa

    I am trying to use pocketsphinx_continuous with an FSG grammar, but it
    crashes with a segmentation fault.
    Here is my mdef:
    http://dl.dropbox.com/u/5137777/download/mdef
    Filler dictionary:
    http://dl.dropbox.com/u/5137777/download/noisedict
    Dictionary:
    http://dl.dropbox.com/u/5137777/download/word.dic
    FSG file:
    http://dl.dropbox.com/u/5137777/download/word.fsg

    Output log:

    INFO: cmd_ln.c(559): Parsing command line:
    pocketsphinx_continuous \
    -hmm /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/model_parameters/30_td_alldata_dither.cd_cont_5000 \
    -fsg word.fsg \
    -dict word.dic

    Current configuration:

    -adcdev
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -argfile
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-48
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+00
    -bghist no no
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -compallsen no no
    -debug 0
    -dict word.dic
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg word.fsg
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/model_parameters/30_td_alldata_dither.cd_cont_5000
    -infile
    -input_endian little little
    -jsgf
    -kdmaxbbi -1 -1
    -kdmaxdepth 0 0
    -kdtree
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lextreedump 0 0
    -lifter 0 0
    -lm
    -lmctl
    -lmname default default
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -lpbeam 1e-40 1.000000e-40
    -lponlybeam 7e-29 7.000000e-29
    -lw 6.5 6.500000e+00
    -maxhmmpf -1 -1
    -maxnewoov 20 20
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mmap yes yes
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -nwpen 1.0 1.000000e+00
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-5 1.000000e-05
    -pl_window 0 0
    -rawlogdir
    -remove_dc no no
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec
    -time no no
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -usewdphones no no
    -uw 1.0 1.000000e+00
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-29
    -wip 0.65 6.500000e-01
    -wlen 0.025625 2.562500e-02

    INFO: cmd_ln.c(559): Parsing command line:
    \
    -alpha 0.97 \
    -doublebw no \
    -nfilt 40 \
    -ncep 13 \
    -lowerf 1.3333334 \
    -upperf 6855.4976 \
    -nfft 512 \
    -wlen 0.0256 \
    -transform legacy \
    -samprate 16000 \
    -feat 1s_c_d_dd \
    -agc none \
    -cmn current \
    -varnorm no

    Current configuration:

    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -dither no no
    -doublebw no no
    -feat 1s_c_d_dd 1s_c_d_dd
    -frate 100 100
    -input_endian little little
    -lda
    -ldadim 0 0
    -lifter 0 0
    -logspec no no
    -lowerf 133.33334 1.333333e+00
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -remove_dc no no
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -smoothspec no no
    -svspec
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wlen 0.025625 2.560000e-02

    INFO: acmod.c(238): Parsed model-specific feature parameters from
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/feat.params
    INFO: feat.c(697): Initializing feature stream to type: '1s_c_d_dd',
    ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean= 12.00, mean= 0.0
    INFO: mdef.c(520): Reading model definition: /media/KHOAND/workspaces
    /acousticmodel-building/Aug-2011/30_td_alldata_dither/model_parameters/30_td_a
    lldata_dither.cd_cont_5000/mdef
    INFO: bin_mdef.c(173): Allocating 423280 * 8 bytes (3306 KiB) for CD tree
    INFO: tmat.c(205): Reading HMM transition probability matrices:
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/means
    INFO: ms_gauden.c(292): 5384 codebook, 1 feature, size
    8x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/variances
    INFO: ms_gauden.c(292): 5384 codebook, 1 feature, size
    8x39
    INFO: ms_gauden.c(356): 121967 variance values floored
    INFO: acmod.c(119): Attempting to use PTHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/means
    INFO: ms_gauden.c(292): 5384 codebook, 1 feature, size
    8x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/variances
    INFO: ms_gauden.c(292): 5384 codebook, 1 feature, size
    8x39
    INFO: ms_gauden.c(356): 121967 variance values floored
    ERROR: "ptm_mgau.c", line 801: Number of codebooks exceeds 256: 5384
    INFO: acmod.c(121): Falling back to general multi-stream GMM computation
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/means
    INFO: ms_gauden.c(292): 5384 codebook, 1 feature, size
    8x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/variances
    INFO: ms_gauden.c(292): 5384 codebook, 1 feature, size
    8x39
    INFO: ms_gauden.c(356): 121967 variance values floored
    INFO: ms_senone.c(160): Reading senone mixture weights:
    /media/KHOAND/workspaces/acousticmodel-building/Aug-2011/30_td_alldata_dither/
    model_parameters/30_td_alldata_dither.cd_cont_5000/mixture_weights
    INFO: ms_senone.c(211): Truncating senone logs3(pdf) values by 10 bits
    INFO: ms_senone.c(218): Not transposing mixture weights in memory
    ERROR: "ms_senone.c", line 265: Weight normalization failed for 6 senones
    INFO: ms_senone.c(277): Read mixture weights for 5384 senones: 1 features x 8
    codewords
    INFO: ms_senone.c(331): Mapping senones to individual codebooks
    INFO: ms_mgau.c(123): The value of topn: 4
    INFO: dict.c(294): Allocating 4103 * 20 bytes (80 KiB) for word entries
    INFO: dict.c(306): Reading main dictionary: word.dic
    INFO: dict.c(206): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(309): 4 words read
    INFO: dict.c(314): Reading filler dictionary: /media/KHOAND/workspaces
    /acousticmodel-building/Aug-2011/30_td_alldata_dither/model_parameters/30_td_a
    lldata_dither.cd_cont_5000/noisedict
    INFO: dict.c(206): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(317): 3 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(405): Allocating 128^3 * 2 bytes (4096 KiB) for word-initial
    triphones
    INFO: dict2pid.c(131): Allocated 197120 bytes (192 KiB) for word-final
    triphones
    INFO: dict2pid.c(195): Allocated 197120 bytes (192 KiB) for single-phone word
    triphones
    INFO: fsg_search.c(139): FSG(beam: -1105112, pbeam: -1105112, wbeam: -648215;
    wip: -25842, pip: 0)
    INFO: fsg_model.c(678): FSG: 19 states, 5 unique words, 6 transitions (18
    null)
    INFO: fsg_model.c(213): Computing transitive closure for null transitions
    INFO: fsg_model.c(264): 88 null transitions added
    INFO: fsg_model.c(411): Adding silence transitions for <sil> to FSG
    INFO: fsg_model.c(431): Added 19 silence word transitions
    INFO: fsg_model.c(411): Adding silence transitions for sil to FSG
    INFO: fsg_model.c(431): Added 19 silence word transitions
    INFO: fsg_lextree.c(110): Allocated 4902 bytes (4 KiB) for left and right
    context phones
    Segmentation fault

     
  • Joseph S. Wisniewski

    My first guess was encoding, but you look OK there.

    I can see 3 things wrong with that fsg.

    The sequence 0-2-4-3-6-8-7-15-17-16-1 constitutes a short-circuit through the
    grammar.
    The exit probabilities for node 10 total 2: 10-6 at 1 and 10-7 at 1.
    The sequence 6-9-11-10-6 constitutes a short-circuited loop.

    I don't know if any of these things will bother pocketsphinx. Where did you
    get this FSG? I assume it's not hand-made.

    You also shouldn't need the optional beginning and ending sil nodes. Sphinx
    adds those by itself, so the ugly structures at 2,4,5,3 and 15,16,17,18 can
    go away.
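
The exit-probability check described above can be automated. This is a minimal sketch, not part of pocketsphinx: it assumes the plain-text Sphinx FSG layout in which each arc is a line of the form `TRANSITION <from> <to> <prob> [word]` (or the short keyword `T`), and that `#` starts a comment. The function names and the sample grammar are made up for illustration; the sample is not the word.fsg from this thread.

```python
from collections import defaultdict


def exit_probabilities(fsg_text):
    """Sum the outgoing transition probabilities of every source state."""
    totals = defaultdict(float)
    for line in fsg_text.splitlines():
        # Assumption: '#' introduces a comment in the FSG file.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 4 and fields[0] in ("TRANSITION", "T"):
            src = int(fields[1])
            prob = float(fields[3])
            totals[src] += prob
    return dict(totals)


def overweight_states(fsg_text, tolerance=1e-6):
    """Return the states whose exit probabilities add up to more than 1."""
    return {
        state: total
        for state, total in exit_probabilities(fsg_text).items()
        if total > 1.0 + tolerance
    }


# Hypothetical grammar: state 1 has two exits, each with probability 1.
SAMPLE = """\
FSG_BEGIN sample
NUM_STATES 3
START_STATE 0
FINAL_STATE 2
TRANSITION 0 1 1.0 hello
TRANSITION 1 0 1.0
TRANSITION 1 2 1.0 world
FSG_END
"""

if __name__ == "__main__":
    for state, total in sorted(overweight_states(SAMPLE).items()):
        print(f"state {state}: exit probabilities sum to {total}")
```

Running the same check on a real grammar file would flag states like node 10 above. Whether pocketsphinx actually rejects such a grammar is a separate question that this sketch does not answer.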

     
  • Nickolay V. Shmyrev

    Sorry, without the model files it's very hard to reproduce and fix this
    problem. You have only provided the mdef so far.

     
  • Nickolay V. Shmyrev

    A recent pocketsphinx_continuous doesn't produce a segmentation fault; it
    stops with a fatal error instead:

    FATAL_ERROR: "fsg_lextree.c", line 710: #phones > 64; increase FSG_PNODE_CTXT_BVSZ and recompile
    

    When FSG_PNODE_CTXT_BVSZ is increased, it just works. I suggest you try a
    newer version.
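
As a rough guide to how far FSG_PNODE_CTXT_BVSZ has to be raised, the arithmetic can be sketched as follows. The figure of 32 phones per unit is inferred from this thread only (a value of 2 corresponds to the limit of 64 in the error message), and the phone count of 128 is an assumption read from the "128^3" in the dict2pid line of the log above, so treat both as unverified. The helper names are hypothetical.

```python
# Assumption inferred from the thread: FSG_PNODE_CTXT_BVSZ = 2 allows 64 phones,
# so each unit of the constant covers 32 phones.
PHONES_PER_UNIT = 32


def phone_limit(bvsz, phones_per_unit=PHONES_PER_UNIT):
    """Largest phone count a given FSG_PNODE_CTXT_BVSZ would accommodate."""
    return bvsz * phones_per_unit


def min_bvsz(n_phones, phones_per_unit=PHONES_PER_UNIT):
    """Smallest FSG_PNODE_CTXT_BVSZ that covers n_phones (ceiling division)."""
    return -(-n_phones // phones_per_unit)


if __name__ == "__main__":
    # 128 is an assumed phone count taken from the log, not a confirmed value.
    n_phones = 128
    print(f"{n_phones} phones need FSG_PNODE_CTXT_BVSZ >= {min_bvsz(n_phones)}")
```

Picking the smallest value that covers the phoneset, rather than a generous one, keeps the per-node context bookkeeping as small as possible.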

     
  • NGUYEN dang-khoa

    By "newer version", do you mean "pocketsphinx-0.7"?
    I already used pocketsphinx 0.7 for this test.

     
  • Nickolay V. Shmyrev

    "newer version" do you mean "pocketsphinx-0.7" ?

    By newer version I mean snapshot/subversion trunk. See

    http://cmusphinx.sourceforge.net/wiki/download

     
  • NGUYEN dang-khoa

    I tried the pocketsphinx and sphinxbase snapshot versions, but I got the
    same error.

     
  • NGUYEN dang-khoa

    I changed #define FSG_PNODE_CTXT_BVSZ 2 to #define FSG_PNODE_CTXT_BVSZ 4 in
    fsg_lextree.h. The recognizer now runs successfully, but recognition takes
    too long.

     
  • Nickolay V. Shmyrev

    4 in fsg_lextree.h. The recognizer now runs successfully, but recognition
    takes too long.

    Well, you have to trade off a large phoneset against recognizer speed.
    The best point is probably somewhere in the middle.

     
