Hello,
I am trying to improve my accuracy by using the n-best list instead of just the best hypothesis.
When using pocketsphinx_batch, I get weird results:
- if I use it without nbest, I get a coherent decoding result like:
zéro (0 -3923)
un (1 -2162)
seize (10 -2677)
onze (11 -1854)
douze (12 -2560)
treize (13 -2408)
quatorze (14 -2386)
seize (15 -2590)
seize (16 -1999)
dix-sept (17 -2736)
six (18 -3464)
dix-neuf (19 -2991)
deux (2 -2500)
- if I add "-nbest 10" and "-nbestdir my_nbest_folder" to the arguments of pocketsphinx_batch, the result is weird. I get all the files with my 10 best hypotheses, but they don't match: the best result I got without n-best is not present in the n-best list. For instance, in the 17.hyp file I got:
dix -799
dix -799
dix -799
deux -865
deux -865
deux -865
onze -993
onze -993
onze -993
neuf -1175
You can see that my best hypothesis "dix-sept" is not present in this list.
To be clear, I didn't change anything except the two nbest arguments.
Do you have any idea what could be the source of my problem?
For your information, I am using a JSGF grammar.
There is no error or warning in the log.
My settings are (option name, default, current value; an empty default or value is simply missing from the line):
-adchdr 0 0
-adcin no yes
-agc none max
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 0.000000e+00
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-build_outdirs yes yes
-cepdir wav
-cepext .mfc .wav
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-ctl etc/test.fileids
-ctlcount -1 -1
-ctlincr 1 1
-ctloffset 0 0
-ctm
-debug 0
-dict etc/test.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgctl
-fsgdir
-fsgext
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm hub4wsj_sc_8k
-hyp result/hyp.txt
-hypseg
-input_endian little little
-jsgf etc/test.gram
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-lmnamectl
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 0.000000e+00
-lponlybeam 7e-29 0.000000e+00
-lw 6.5 0.000000e+00
-maxhmmpf -1 3000
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mllrctl
-mllrdir
-mllrext
-mmap yes yes
-nbest 0 10
-nbestdir result
-nbestext .hyp .hyp
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-outlatbeam 1e-5 1.000000e-05
-outlatdir
-outlatext .lat .lat
-outlatfmt s3 s3
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senin no no
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 0.000000e+00
-wip 0.65 0.000000e+00
-wlen 0.025625 2.562500e-02
Do you know what could be the source of the problem?
Thank you
Last edit: floboc 2012-10-15
Hello
In order to enable us to analyze the problem you need to provide the data files you are using.
Here is my project folder: http://www.sendspace.com/file/l2tybt
I use the script "test.sh" to run some decoding with custom parameters (like no beam pruning, etc.)
Thank you for your time
Hello
N-best lists are created from the lattice. To construct the lattice properly, you need -bestpath to be enabled. Without it, n-best will not work, as you demonstrated.
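To check this symptom across many utterances instead of by eye, a small Python sketch along these lines could be used. It assumes the two output formats shown in the question: hyp.txt lines of the form `text (utterance-id score)` and n-best lines of the form `text score`. The function names are illustrative and not part of PocketSphinx.

```python
import re

# Assumed hyp.txt line format, as in the question: "dix-sept (17 -2736)"
HYP_LINE = re.compile(r"^(?P<text>.*) \((?P<utt>\S+) (?P<score>-?\d+)\)$")

def parse_hyp_line(line):
    """Split one hyp.txt line into (utterance id, text, score)."""
    match = HYP_LINE.match(line.strip())
    if match is None:
        raise ValueError("unexpected hyp line: %r" % line)
    return match["utt"], match["text"], int(match["score"])

def best_in_nbest(best_text, nbest_lines):
    """Return True if the 1-best text appears among the n-best entries."""
    texts = set()
    for line in nbest_lines:
        line = line.strip()
        # Assumed n-best line format: "<words> <integer score>"
        text, sep, _score = line.rpartition(" ")
        if sep:
            texts.add(text)
    return best_text in texts

utt, text, score = parse_hyp_line("dix-sept (17 -2736)")
nbest_17 = ["dix -799", "deux -865", "onze -993", "neuf -1175"]
print(utt, text, best_in_nbest(text, nbest_17))
```

With the 17.hyp contents from the question, the check reports that the 1-best hypothesis is missing from the n-best list.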
Of course, you are right!
I still have a problem.
If I set bestpath to yes, I obtain the transcription "zéro huit" in 0.hyp, or "deux onze" in 12.hyp.
This shouldn't be possible, because these word sequences don't exist in my JSGF grammar (the files are the same as before; only the bestpath option changed).
Do you know why?
My grammar is:
#JSGF V1.0;
grammar samplegrammar;
<unitsWO01> = deux | trois | quatre | cinq | six | sept | huit | neuf;
<unitsWO0> = un | <unitsWO01>;
<units> = zéro | <unitsWO0>;
<elevens> = dix | onze | douze | treize | quatorze | quinze | seize | dix sept | dix huit | dix neuf;
<tensWO10> = vingt | trente | quarante | cinquante | soixante | quatre vingt;
<specialTens> = soixante | quatre vingt;
public <max2digits> = <units> | <elevens> | <tensWO10> [<unitsWO0>] | <specialTens> <elevens>;
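As a sanity check on the claim that these hypotheses are outside the grammar, the rules can be expanded by hand and every accepted word sequence enumerated. The Python sketch below does that. The rule contents are transcribed manually from the grammar as posted, and it does not use a JSGF parser, so it only illustrates the reasoning.

```python
from itertools import product

# Each alternative is a tuple of words, transcribed by hand from the posted grammar.
UNITS_WO01 = [(w,) for w in "deux trois quatre cinq six sept huit neuf".split()]
UNITS_WO0 = [("un",)] + UNITS_WO01
UNITS = [("zéro",)] + UNITS_WO0
ELEVENS = [(w,) for w in "dix onze douze treize quatorze quinze seize".split()]
ELEVENS += [("dix", "sept"), ("dix", "huit"), ("dix", "neuf")]
TENS_WO10 = [(w,) for w in "vingt trente quarante cinquante soixante".split()]
TENS_WO10 += [("quatre", "vingt")]
SPECIAL_TENS = [("soixante",), ("quatre", "vingt")]

def concat(left, right):
    """Every sequence formed by one alternative of `left` followed by one of `right`."""
    return [a + b for a, b in product(left, right)]

def accepted_sentences():
    """Expand the public rule: units | elevens | tensWO10 [unitsWO0] | specialTens elevens."""
    optional_units = [()] + UNITS_WO0  # the [ ... ] group may be empty
    alternatives = (
        UNITS
        + ELEVENS
        + concat(TENS_WO10, optional_units)
        + concat(SPECIAL_TENS, ELEVENS)
    )
    return {" ".join(words) for words in alternatives}

ACCEPTED = accepted_sentences()
for hyp in ("zéro huit", "deux onze", "dix sept"):
    print(hyp, "->", "in grammar" if hyp in ACCEPTED else "NOT in grammar")
```

Under this expansion, neither "zéro huit" nor "deux onze" is accepted, which supports the view that the decoder output is outside the grammar.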
By the way, all the transcriptions given by n-best (with nbest = 10) are identical, with the same score, which seems strange to me. Do you think that's normal?
Thank you
EDIT:
I had another problem when running pocketsphinx_batch with the following parameters (still the same project, nothing else changed):
pocketsphinx_batch \
-adcin 'yes' \
-cepdir 'wav' \
-cepext '.wav' \
-ctl 'etc/test.fileids' \
-jsgf 'etc/test.gram' \
-dict 'etc/test.dic' \
-bestpath 'yes' \
-fwdflat 'no' \
-samprate 16000 \
-nbest 10 \
-nbestdir "result" \
-hmm 'hub4wsj_sc_8k' \
-hyp 'result/hyp.txt'
I got a segmentation fault when decoding the fourth file:
INFO: fsg_search.c(1407): Start node un.0:2:6
INFO: fsg_search.c(1407): Start node <sil>.0:2:49
INFO: fsg_search.c(1446): End node <sil>.79:81:152 (-388)
INFO: fsg_search.c(1662): lattice start node .0 end node <sil>.79
INFO: ps_lattice.c(1352): Normalizer P(O) = alpha(<sil>:79:152) = -69683
INFO: ps_lattice.c(1390): Joint P(O,S) = -69683 P(S|O) = 0
pocketsphinx_batch(897) malloc: *** error for object 0x500007fc12053f5e: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
./test.sh: line 22: 897 Abort trap: 6 pocketsphinx_batch -adcin 'yes' -cepdir 'wav' -cepext '.wav' -ctl 'etc/test.fileids' -jsgf 'etc/test.gram' -dict 'etc/test.dic' -bestpath 'yes' -fwdflat 'no' -samprate 16000 -nbest 10 -nbestdir "result" -hmm 'hub4wsj_sc_8k' -hyp 'result/hyp.txt'
This error disappears if I set the beam and the word beam to 0.
Last edit: floboc 2012-10-18
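> If I set bestpath to yes, I obtain in 0.hyp the transcription "zéro huit" or "deux onze" in 12.hyp.
Try a version from the Subversion trunk; this issue should be fixed.
> And by the way, all the transcription given by nbest (with nbest = 10) are identical with the same score, it's strange to me. Do you think it's normal ?
There is no support for filtering out duplicates yet; it just returns all possible decoding variants.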
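Until the decoder filters duplicates itself, they can be removed in a post-processing step. Here is a minimal Python sketch, assuming each n-best line has the form `words score` as in the 17.hyp example above. The function name `dedupe_nbest` is illustrative and not part of PocketSphinx.

```python
def dedupe_nbest(lines):
    """Keep the first occurrence of each hypothesis text, preserving order."""
    seen = set()
    unique = []
    for line in lines:
        line = line.strip()
        # Assumed line format: "<hypothesis words> <integer score>"
        text, sep, score = line.rpartition(" ")
        if not sep:
            continue  # skip blank or malformed lines
        if text not in seen:
            seen.add(text)
            unique.append((text, int(score)))
    return unique

# Contents of 17.hyp as posted in the question.
sample = """dix -799
dix -799
dix -799
deux -865
deux -865
deux -865
onze -993
onze -993
onze -993
neuf -1175
"""

for text, score in dedupe_nbest(sample.splitlines()):
    print(text, score)
```

Because duplicates take up slots in the list, it may help to request a larger -nbest value than the number of distinct hypotheses actually needed.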