I have tried to run sphinx3_livepretend with an FSG file rather than an LM. From
what I understood by reading this forum, I have to use these options in the
configuration file: -op_mode 2 -fsg gram.fsg.
But then the program exits with the following output:
INFO: kb.c(306): SEARCH MODE INDEX 2
INFO: srch.c(373): Search Initialization.
INFO: fsg_search.c(270): FSG(beam: -422133, pbeam: -383758, wbeam: -268630;
wip: 0, pip: 0)
INFO: word_fsg.c(893): Reading FSG file
'/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/gramTest.fsg'
(altpron=1, filler=1, lw=9.50, silprob=1.00e-01, fillprob=1.00e-01)
INFO: word_fsg.c(355): Adding filler words to FSG
INFO: word_fsg.c(649): FSG: 25 states, 285 transitions (13 null, 7 alt, 250
filler, 0 unknown)
INFO: word_fsg.c(296): Computing transitive closure for null transitions
INFO: word_fsg.c(336): 9 null transitions added
INFO: word_fsg.c(418): Value of silcipid 9
INFO: word_fsg.c(420): No of CI phones 45
INFO: ctxt_table.c(255): Building within-word triphones
INFO: ctxt_table.c(287): 0 within-word triphone instances mapped to CI-phones
INFO: ctxt_table.c(309): Building cross-word triphones
INFO: ctxt_table.c(370): 0 cross-word triphones mapped to CI-phones
sphinx3_livedecode: fsg_psubtree.c:320: psubtree_add_trans: Assertion
`!dict_filler_word(dict, wid)' failed.
How can I use my grammar file ?
Thanks.
Simon.
I have two general pieces of advice for you on this problem:
When you ask for help, it's important to give us a way to reproduce your problem. We can't guess what is happening from the limited information you have given, and our chances of helping you are lower. At a minimum, you need to provide the grammar so that we can check what happens. You also need to provide the exact command-line options you are using and the full log of the decoder output, not just part of it.
Sphinx3 is obsolete software which we don't recommend to anyone unless they really know what they are doing. That's not because we want to cause people problems or because we are joking. We suggest you use pocketsphinx because
a) it's easier to use
b) it's easier to fix issues
c) it's faster in the default configuration
d) it's more accurate in the default configuration
The reason for this issue is the French fillers dictionary
/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/fillers_dict . You
need to remove the following line from it:
[euh]eeee
Such doubled fillers aren't supported.
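The removal described above can be done by hand in any editor. As a minimal sketch, the Python below filters one entry out of a filler dictionary's text. The function name `drop_entry` and the sample content are hypothetical: only the `[euh]eeee` line comes from this thread, and the `SIL` entries are placeholders rather than the real contents of the LIUM fillers_dict.

```python
def drop_entry(dict_text, entry):
    """Return dict_text without any line matching entry.

    Lines are compared with all whitespace removed, so the match does not
    depend on how the entry was spaced in the file or in the forum post.
    """
    target = "".join(entry.split())
    kept = [
        line
        for line in dict_text.splitlines()
        if "".join(line.split()) != target
    ]
    return "\n".join(kept) + "\n"


# Hypothetical sample: only the last line is taken from the thread.
SAMPLE_FILLERS = "<s> SIL\n</s> SIL\n<sil> SIL\n[euh]eeee\n"

cleaned = drop_entry(SAMPLE_FILLERS, "[euh]eeee")
print(cleaned, end="")
```

The function works on text rather than on the file itself, so the original dictionary stays untouched until you decide to write the result back, ideally after making a backup copy.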
Regarding pocketsphinx, I have tried it and have not been able to make it
work. Is a sphinx3 acoustic model fully compatible with it?
Yes, the models are compatible, but again you need to be more specific. We are
interested in helping you set up the decoder properly, but you need to
provide details.
Well, the command was: sphinx3_livepretend ./myCtlFile . Config0.cfgfile
Here is the config file:
-samprate 16000
-hmm /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0
-dict /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/FEWwords_dict
-fdict /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/fillers_dict
-op_mode 2
-fsg /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/gramTest.fsg
-mdef /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/architecture/F0.5500.mdef
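Before running the decoder, a configuration like the one above can be sanity-checked offline. The following is a minimal sketch in Python, not part of Sphinx: `parse_config` and `check_fsg_setup` are hypothetical helper names, and the list of flags treated as required is an assumption based only on the options used in this thread.

```python
import shlex

# The configuration file exactly as posted in this thread.
CONFIG_TEXT = """\
-samprate 16000
-hmm /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0
-dict /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/FEWwords_dict
-fdict /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/fillers_dict
-op_mode 2
-fsg /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/gramTest.fsg
-mdef /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/architecture/F0.5500.mdef
"""


def parse_config(text):
    """Parse '-flag value' lines into a dict; a flag without a value maps to None."""
    options = {}
    for line in text.splitlines():
        tokens = shlex.split(line)
        if not tokens:
            continue  # skip blank lines
        options[tokens[0]] = tokens[1] if len(tokens) > 1 else None
    return options


def check_fsg_setup(options):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if options.get("-op_mode") != "2":
        problems.append("-op_mode is not 2 (the FSG search mode used in this thread)")
    # Assumption: these are the flags this particular setup relies on.
    for flag in ("-fsg", "-dict", "-fdict", "-hmm", "-mdef"):
        if not options.get(flag):
            problems.append(f"{flag} is missing or has no value")
    return problems


options = parse_config(CONFIG_TEXT)
for problem in check_fsg_setup(options):
    print(problem)
```

For the configuration posted here the check prints nothing, which matches the log: the decoder accepts the options and fails later, while building the FSG search.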
And the grammar (this is the JSGF file; I converted it with
sphinx_jsgf2fsg):
#JSGF V1.0;
grammar temps;
public <temps> = (((la pluie | la neige) tombait (a verse | doucement | a gros flocon)) | (les carotte sont cuites));
The dictionary only contains the words used in that grammar.
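The claim that the dictionary covers the grammar can be verified mechanically. The sketch below, in Python, extracts the words from the rule bodies of a simple JSGF grammar and compares them with the head words of a Sphinx-style dictionary, where alternate pronunciations are written as `word(2)`. The helper names are hypothetical, the tokenizer handles only plain words, groups and alternatives (no weights or tags), and the dictionary sample uses made-up placeholder phones, not the real FEWwords_dict.

```python
import re

GRAMMAR = """\
#JSGF V1.0;
grammar temps;
public <temps> = (((la pluie | la neige) tombait (a verse | doucement | a gros flocon)) | (les carotte sont cuites));
"""


def grammar_words(jsgf_text):
    """Collect the terminal words used in every rule body ('= ... ;')."""
    words = set()
    for match in re.finditer(r"=(.*?);", jsgf_text, flags=re.DOTALL):
        body = re.sub(r"<[^>]*>", " ", match.group(1))  # drop rule references
        words.update(re.findall(r"[^\s()\[\]|*+]+", body))
    return words


def dictionary_words(dict_text):
    """Collect head words, folding alternates such as 'pluie(2)' into 'pluie'."""
    words = set()
    for line in dict_text.splitlines():
        parts = line.split()
        if parts:
            words.add(re.sub(r"\(\d+\)$", "", parts[0]))
    return words


# Hypothetical dictionary excerpt; P1, P2, ... are placeholder phones.
SAMPLE_DICT = "la P1 P2\npluie P3 P4 P5\npluie(2) P3 P4\n"

missing = sorted(grammar_words(GRAMMAR) - dictionary_words(SAMPLE_DICT))
print(len(grammar_words(GRAMMAR)), "grammar words; missing from sample:", missing)
```

Run against the real dictionary, an empty `missing` list would confirm that every grammar word has an entry, which rules out one common cause of FSG loading failures.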
The full log is :
INFO: info.c(65): Host: 'devsyn107'
INFO: info.c(69): Directory:
'/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models'
INFO: info.c(73): sphinx3_livepretend Compiled on: Jul 27 2010, AT: 15:50:45
INFO: cmd_ln.c(512): Parsing command line:
\
-samprate 16000 \
-hmm /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0 \
-dict /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/FEWwords_dict \
-fdict /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/fillers_dict \
-op_mode 2 \
-fsg /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/gramTest.fsg \
-mdef /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/architecture/F0.5500.mdef
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-backtrace yes yes
-beam 1.0e-55 1.000000e-55
-bestpath no no
-bestpathlw 0.000000e+00
-bestscoredir
-bestsenscrdir
-bghist no no
-bptbldir
-bptblsize 32768 32768
-cb2mllr .1cls. .1cls.
-ceplen 13 13
-ci_pbeam 1e-80 1.000000e-80
-cmn current current
-cmninit 8.0 8.0
-cond_ds no no
-ctl
-ctlcount 1000000000 1000000000
-ctloffset 0 0
-ctl_lm
-ctl_mllr
-dagfudge 2 2
-dict /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/FEWwords_dict
-dist_ds no no
-dither no no
-doublebw no no
-ds 1 1
-epl 3 3
-fdict /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/fillers_dict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillpen
-fillprob 0.1 1.000000e-01
-frate 100 100
-fsg /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/gramTest.fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-gs
-gs4gs yes yes
-hmm /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0
-hmmdump no no
-hmmdumpef 200000000 200000000
-hmmdumpsf 200000000 200000000
-hmmhistbinsize 5000 5000
-hyp
-hypseg
-hypsegscore_unscale yes yes
-inlatdir
-inlatwin 50 50
-input_endian little little
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latcompress yes yes
-latext lat.gz lat.gz
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctlfn
-lmdumpdir
-lmname
-log3table yes yes
-logbase 1.0003 1.000300e+00
-logspec no no
-lowerf 133.33334 1.333333e+02
-lts_mismatch no no
-lw 9.5 9.500000e+00
-machine_endian little little
-maxcdsenpf 100000 100000
-maxedge 2000000 2000000
-maxhistpf 100 100
-maxhmmpf 20000 20000
-maxhyplen 1000 1000
-maxlmop 100000000 100000000
-maxlpf 40000 40000
-maxppath 1000000 1000000
-maxwpf 20 20
-mdef /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/architecture/F0.5500.mdef
-mean
-min_endfr 3 3
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mode fwdtree fwdtree
-nbest 200 200
-nbestdir
-nbestext nbest.gz nbest.gz
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-Nlextree 3 3
-Nstalextree 25 25
-op_mode -1 2
-outlatdir
-outlatfmt s3 s3
-pbeam 1.0e-50 1.000000e-50
-pheurtype 0 0
-phonepen 1.0 1.000000e+00
-phypdump yes yes
-pl_beam 1.0e-80 1.000000e-80
-pl_window 1 1
-ppathdebug no no
-ptranskip 0 0
-rawext .raw .raw
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-senmgau .cont. .cont.
-silprob 0.1 1.000000e-01
-smoothspec no no
-subvq
-subvqbeam 3.0e-3 3.000000e-03
-svq4svq no no
-svspec
-tighten_factor 0.5 5.000000e-01
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-tracewhmm
-transform legacy legacy
-treeugprob yes yes
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-uw 0.7 7.000000e-01
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-vqeval 3 3
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 1.0e-35 1.000000e-35
-wend_beam 1.0e-80 1.000000e-80
-wip 0.7 7.000000e-01
-wlen 0.025625 2.562500e-02
-worddumpef 200000000 200000000
-worddumpsf 200000000 200000000
INFO: kbcore.c(433): Begin Initialization of Core Models:
ERROR: "cmd_ln.c", line 724: Cannot open configuration file /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/feat.params for reading
INFO: kbcore.c(453): Parsed model-specific feature parameters from /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/feat.params
INFO: Initialization of the log add table
INFO: Log-Add table size = 29350 x 2 >> 0
INFO:
INFO: feat.c(848): Initializing feature stream to type: '1s_c_d_dd',
ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: kbcore.c(480): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Number of streams = 1
INFO: Vector size of stream: 39
INFO: Number of subvectors = 0
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File: /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/architecture/F0.5500.mdef
INFO: Mean File: /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/means
INFO: Variance File: /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/variances
INFO: Mixture Weight File: /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/mixture_weights
INFO: Transition Matrices File: /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/transition_matrices
INFO: mdef.c(682): Reading model definition: /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/architecture/F0.5500.mdef
INFO: Initialization of mdef_t, report:
INFO: 45 CI-phone, 82089 CD-phone, 5 emitstate/phone, 225 CI-sen, 5725 Sen,
12416 Sen-Seq
INFO:
INFO: kbcore.c(288): Using optimized GMM computation for Continuous HMM, -topn
will be ignored
INFO: cont_mgau.c(163): Reading mixture gaussian file '/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/means'
INFO: cont_mgau.c(422): 5725 mixture Gaussians, 22 components, 1 streams, veclen 39
INFO: cont_mgau.c(163): Reading mixture gaussian file '/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/variances'
INFO: cont_mgau.c(422): 5725 mixture Gaussians, 22 components, 1 streams, veclen 39
INFO: cont_mgau.c(510): Reading mixture weights file '/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/mixture_weights'
INFO: cont_mgau.c(665): Read 5725 x 22 mixture weights
INFO: cont_mgau.c(693): Removing uninitialized Gaussian densities
0 2 5339
WARNING: "cont_mgau.c", line 767: 85 densities removed (3 mixtures removed
entirely)
INFO: cont_mgau.c(783): Applying variance floor
INFO: cont_mgau.c(801): 187 variance values floored
INFO: cont_mgau.c(849): Precomputing Mahalanobis distance invariants
INFO: tmat.c(169): Reading HMM transition probability matrices: /home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/parameters/F0/transition_matrices
INFO: Initialization of tmat_t, report:
INFO: Read 45 transition matrices of size 5x6
INFO:
INFO: dict.c(475): Reading main dictionary:
/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/FEWwords_dict
INFO: dict.c(478): 20 words read
INFO: dict.c(483): Reading filler dictionary:
/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/fillers_dict
INFO: dict.c(486): 12 words read
INFO: Initialization of dict_t, report:
INFO: No of CI phone: 0
INFO: Max word: 4128
INFO: No of word: 32
INFO:
INFO: Initialization of fillpen_t, report:
INFO: Language weight =9.500000
INFO: Word Insertion Penalty =0.700000
INFO: Silence probability =0.100000
INFO: Filler probability =0.100000
INFO:
INFO: dict2pid.c(599): Building PID tables for dictionary
INFO: Initialization of dict2pid_t, report:
INFO: Dict2pid is in composite triphone mode
INFO: 319 composite states; 67 composite sseq
INFO:
INFO: kbcore.c(632): Inside kbcore: Verifying models consistency ......
INFO: kbcore.c(654): End of Initialization of Core Models:
INFO: Initialization of beam_t, report:
INFO: Parameters used in Beam Pruning of Viterbi Search:
INFO: Beam=-422133
INFO: PBeam=-383758
INFO: WBeam=-268630 (Skip=0)
INFO: WEndBeam=-614012
INFO: No of CI Phone assumed=45
INFO:
INFO: Initialization of fast_gmm_t, report:
INFO: Parameters used in Fast GMM computation:
INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
Distance-based Down Sampling? 0
INFO: GMM-level: CI phone beam -614012. MAX CD 100000
INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
would be used as Gaussian Score? =0 SubVQ Beam -19363
INFO:
INFO: Initialization of pl_t, report:
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme look-ahead type = 0
INFO: Phoneme look-ahead beam size = 65945
INFO: No of CI Phones assumed=45
INFO:
INFO: Initialization of ascr_t, report:
INFO: No. of CI senone =225
INFO: No. of senone = 5725
INFO: No. of composite senone = 319
INFO: No. of senone sequence = 12416
INFO: No. of composite senone sequence=67
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme lookahead window = 1
INFO:
INFO: kb.c(306): SEARCH MODE INDEX 2
INFO: srch.c(373): Search Initialization.
INFO: fsg_search.c(270): FSG(beam: -422133, pbeam: -383758, wbeam: -268630;
wip: 0, pip: 0)
INFO: word_fsg.c(893): Reading FSG file
'/home/sibour/Speech2Text/sphinx3-0.8/lium_acoustic_models/gramTest.fsg'
(altpron=1, filler=1, lw=9.50, silprob=1.00e-01, fillprob=1.00e-01)
INFO: word_fsg.c(355): Adding filler words to FSG
INFO: word_fsg.c(649): FSG: 27 states, 307 transitions (15 null, 7 alt, 270
filler, 0 unknown)
INFO: word_fsg.c(296): Computing transitive closure for null transitions
INFO: word_fsg.c(336): 19 null transitions added
INFO: word_fsg.c(418): Value of silcipid 9
INFO: word_fsg.c(420): No of CI phones 45
INFO: ctxt_table.c(255): Building within-word triphones
INFO: ctxt_table.c(287): 0 within-word triphone instances mapped to CI-phones
INFO: ctxt_table.c(309): Building cross-word triphones
INFO: ctxt_table.c(370): 0 cross-word triphones mapped to CI-phones
sphinx3_livepretend: fsg_psubtree.c:320: psubtree_add_trans: Assertion
`!dict_filler_word(dict, wid)' failed.
Abandon
Thanks.
You forgot "-agc emax". It's better to download the French model from a reliable
location (SourceForge) and follow the documentation provided (README file).