
pocketsphinx_batch and nbest

Help
floboc
2012-10-15
2012-10-18
  • floboc

    floboc - 2012-10-15

    Hello,
    I am trying to improve my accuracy by using the n-best list instead of just the best hypothesis.
    When using pocketsphinx_batch I get strange results:
    - If I use it without nbest, I get a coherent decoding result like:

    zéro (0 -3923)
    un (1 -2162)
    seize (10 -2677)
    onze (11 -1854)
    douze (12 -2560)
    treize (13 -2408)
    quatorze (14 -2386)
    seize (15 -2590)
    seize (16 -1999)
    dix-sept (17 -2736)
    six (18 -3464)
    dix-neuf (19 -2991)
    deux (2 -2500)

    - If I add "-nbest 10" and "-nbestdir my_nbest_folder" to the arguments of pocketsphinx_batch, the result is strange. I get all the files with my 10 best hypotheses, but they don't match: the best result I got without these options is not present in the n-best list. For instance, in the 17.hyp file I got:

    dix -799
    dix -799
    dix -799
    deux -865
    deux -865
    deux -865
    onze -993
    onze -993
    onze -993
    neuf -1175

    You can see that my best hypothesis "dix-sept" is not present in this list.
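The check described above can be automated. The following is a minimal sketch, assuming each line of an n-best file holds a hypothesis followed by its score, as in the 17.hyp excerpt quoted above. The function names and the inline sample are illustrative only and are not part of pocketsphinx.

```python
# Sketch: check whether a given hypothesis appears in an n-best file.
# Assumes the "<hypothesis> <score>" line format seen in the 17.hyp excerpt.

def parse_nbest(text):
    """Return a list of (hypothesis, score) pairs from n-best file content."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The score is the last whitespace-separated token on the line;
        # everything before it is the (possibly multi-word) hypothesis.
        hyp, _, score = line.rpartition(" ")
        entries.append((hyp, int(score)))
    return entries


def contains_hypothesis(entries, best):
    """True if `best` is one of the hypotheses in the parsed n-best list."""
    return any(hyp == best for hyp, _ in entries)


# Content of 17.hyp as quoted in the post.
SAMPLE_17_HYP = """\
dix -799
dix -799
dix -799
deux -865
deux -865
deux -865
onze -993
onze -993
onze -993
neuf -1175
"""

entries = parse_nbest(SAMPLE_17_HYP)
print(contains_hypothesis(entries, "dix-sept"))  # prints False
```

Run over every file in the n-best folder, a check like this would flag each utterance whose single-best result is missing from its list.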

    I want to stress that I didn't change anything except the two nbest arguments.

    Do you have any idea what could be the source of my problem?
    For your information, I am using a JSGF grammar.
    There is no error or warning in the log.

    My settings are (option name, default value, current value):

    -adchdr 0 0
    -adcin no yes
    -agc none max
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -argfile
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 0.000000e+00
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+00
    -bghist no no
    -build_outdirs yes yes
    -cepdir wav
    -cepext .mfc .wav
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -compallsen no no
    -ctl etc/test.fileids
    -ctlcount -1 -1
    -ctlincr 1 1
    -ctloffset 0 0
    -ctm
    -debug 0
    -dict etc/test.dic
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg
    -fsgctl
    -fsgdir
    -fsgext
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm hub4wsj_sc_8k
    -hyp result/hyp.txt
    -hypseg
    -input_endian little little
    -jsgf etc/test.gram
    -kdmaxbbi -1 -1
    -kdmaxdepth 0 0
    -kdtree
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lextreedump 0 0
    -lifter 0 0
    -lm
    -lmctl
    -lmname default default
    -lmnamectl
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -lpbeam 1e-40 0.000000e+00
    -lponlybeam 7e-29 0.000000e+00
    -lw 6.5 0.000000e+00
    -maxhmmpf -1 3000
    -maxnewoov 20 20
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mllrctl
    -mllrdir
    -mllrext
    -mmap yes yes
    -nbest 0 10
    -nbestdir result
    -nbestext .hyp .hyp
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -nwpen 1.0 1.000000e+00
    -outlatbeam 1e-5 1.000000e-05
    -outlatdir
    -outlatext .lat .lat
    -outlatfmt s3 s3
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-5 1.000000e-05
    -pl_window 0 0
    -rawlogdir
    -remove_dc no no
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump
    -senin no no
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -usewdphones no no
    -uw 1.0 1.000000e+00
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 0.000000e+00
    -wip 0.65 0.000000e+00
    -wlen 0.025625 2.562500e-02

    Do you know what could be the source of the problem ?
    Thank you

     

    Last edit: floboc 2012-10-15
  • Nickolay V. Shmyrev

    Do you know what could be the source of the problem ?

    Hello

    In order to enable us to analyze the problem you need to provide the data files you are using.

     
  • floboc

    floboc - 2012-10-17

    Here is my project folder: http://www.sendspace.com/file/l2tybt
    I use the script "test.sh" to run some decoding with custom parameters (like no beam pruning, etc.)

    Thank you for your time

     
  • Nickolay V. Shmyrev

    Hello

    N-best lists are created from the lattice. To construct the lattice properly, -bestpath needs to be enabled. Without it, nbest will not work, as you demonstrated.

     
  • floboc

    floboc - 2012-10-18

    Of course, you are right!
    I still have a problem.
    If I set bestpath to yes, I obtain in 0.hyp the transcription "zéro huit" or "deux onze" in 12.hyp.
    This shouldn't be possible because these sequences don't exist in my JSGF grammar (the files are the same as before, only the bestpath option changed).
    Do you know why?

    My grammar is (without special characters, since I don't know how to display them on the forum):


    JSGF V1.0;
    grammar samplegrammar;
    "unitsWO01" = deux | trois | quatre | cinq | six | sept | huit | neuf;
    "unitsWO0" = un | "unitsWO01";
    "units" = zéro | "unitsWO0";
    "elevens" = dix | onze | douze | treize | quatorze | quinze | seize | dix sept | dix huit | dix neuf;
    "tensWO10" = vingt | trente | quarante | cinquante | soixante | quatre vingt;
    "specialTens" = soixante | quatre vingt;
    public "max2digits" = "units" | "elevens" | "tensWO10" ["unitsWO0"] | "specialTens" "elevens";
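To confirm that "zéro huit" and "deux onze" really fall outside this grammar, the word sequences it accepts can be enumerated. The sketch below is a hand translation of the rules above into Python lists, not a JSGF parser. It assumes the last rule reads as "tensWO10" followed by an optional "unitsWO0", and all variable names are illustrative.

```python
# Sketch: enumerate every phrase accepted by the grammar above and test
# whether the unexpected transcriptions belong to it.
from itertools import product

UNITS_WO_01 = ["deux", "trois", "quatre", "cinq", "six", "sept", "huit", "neuf"]
UNITS_WO_0 = ["un"] + UNITS_WO_01
UNITS = ["zéro"] + UNITS_WO_0
ELEVENS = ["dix", "onze", "douze", "treize", "quatorze", "quinze", "seize",
           "dix sept", "dix huit", "dix neuf"]
TENS_WO_10 = ["vingt", "trente", "quarante", "cinquante", "soixante",
              "quatre vingt"]
SPECIAL_TENS = ["soixante", "quatre vingt"]


def accepted_phrases():
    """Expand the public rule max2digits into the set of accepted phrases."""
    # units | elevens | tensWO10 (with the optional unit left out)
    phrases = set(UNITS) | set(ELEVENS) | set(TENS_WO_10)
    # tensWO10 followed by unitsWO0 (assumed reading of the optional part)
    phrases |= {f"{t} {u}" for t, u in product(TENS_WO_10, UNITS_WO_0)}
    # specialTens followed by elevens
    phrases |= {f"{t} {e}" for t, e in product(SPECIAL_TENS, ELEVENS)}
    return phrases


ACCEPTED = accepted_phrases()
for phrase in ("zéro huit", "deux onze", "dix sept"):
    print(phrase, "->", phrase in ACCEPTED)
```

Under these assumptions, neither "zéro huit" nor "deux onze" is in the accepted set, while "dix sept" is.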

    And by the way, all the transcription given by nbest (with nbest = 10) are identical with the same score, it's strange to me. Do you think it's normal ?

    Thank you

    EDIT:
    Another problem I had when running pocketsphinx_batch with the following parameters (still the same project, nothing else changed):


    pocketsphinx_batch \
    -adcin 'yes' \
    -cepdir 'wav' \
    -cepext '.wav' \
    -ctl 'etc/test.fileids' \
    -jsgf 'etc/test.gram' \
    -dict 'etc/test.dic' \
    -bestpath 'yes' \
    -fwdflat 'no' \
    -samprate 16000 \
    -nbest 10 \
    -nbestdir "result" \
    -hmm 'hub4wsj_sc_8k' \
    -hyp 'result/hyp.txt'

    I got a segmentation fault when decoding the fourth file:

    INFO: fsg_search.c(1407): Start node un.0:2:6
    INFO: fsg_search.c(1407): Start node <sil>.0:2:49
    INFO: fsg_search.c(1446): End node <sil>.79:81:152 (-388)
    INFO: fsg_search.c(1662): lattice start node .0 end node <sil>.79
    INFO: ps_lattice.c(1352): Normalizer P(O) = alpha(<sil>:79:152) = -69683
    INFO: ps_lattice.c(1390): Joint P(O,S) = -69683 P(S|O) = 0
    pocketsphinx_batch(897) malloc: *** error for object 0x500007fc12053f5e: pointer being freed was not allocated
    *** set a breakpoint in malloc_error_break to debug
    ./test.sh: line 22: 897 Abort trap: 6 pocketsphinx_batch -adcin 'yes' -cepdir 'wav' -cepext '.wav' -ctl 'etc/test.fileids' -jsgf 'etc/test.gram' -dict 'etc/test.dic' -bestpath 'yes' -fwdflat 'no' -samprate 16000 -nbest 10 -nbestdir "result" -hmm 'hub4wsj_sc_8k' -hyp 'result/hyp.txt'

    This error disappears if I set the beam and the word beam to 0.

     

    Last edit: floboc 2012-10-18
    • Nickolay V. Shmyrev

      If I set bestpath to yes, I obtain in 0.hyp the transcription "zéro huit" or "deux onze" in 12.hyp.

      Try a version from subversion trunk, this issue should be fixed.

      And by the way, all the transcription given by nbest (with nbest = 10) are identical with the same score, it's strange to me. Do you think it's normal ?

      There is no support for filtering duplicates yet; it just returns all possible decoding variants.
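Until duplicate filtering is available in the decoder, the list can be filtered after decoding. Here is a minimal sketch, assuming the n-best entries have already been read into (hypothesis, score) pairs in the order the decoder wrote them. The function name and sample data are illustrative and not part of pocketsphinx.

```python
# Sketch: drop repeated hypotheses from an n-best list, keeping the first
# (best-ranked) occurrence of each and preserving the original order.

def unique_hypotheses(entries):
    """Return entries with duplicate hypothesis strings removed."""
    seen = set()
    unique = []
    for hyp, score in entries:
        if hyp not in seen:
            seen.add(hyp)
            unique.append((hyp, score))
    return unique


# Sample data: the 10 entries from the 17.hyp file quoted earlier in the thread.
nbest = [
    ("dix", -799), ("dix", -799), ("dix", -799),
    ("deux", -865), ("deux", -865), ("deux", -865),
    ("onze", -993), ("onze", -993), ("onze", -993),
    ("neuf", -1175),
]

for hyp, score in unique_hypotheses(nbest):
    print(hyp, score)
```

Since filtering shrinks the list, asking for a larger -nbest value may be needed to end up with 10 distinct hypotheses.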

       
