I'm having a problem with decoding that I can't solve. I'm using sphinx3 on
Kubuntu 9.10.
victor@victor-kubuntu:~/fdm$ sudo perl scripts_pl/decode/slave.pl
password for victor:
MODULE: DECODE Decoding using models previously trained
Decoding 2 segments starting at 0 (part 1 of 1)
0%
This step had 2 ERROR messages and 1 WARNING messages. Please check the log
file for details.
Aligning results to find error rate
word_align.pl failed with error code 65280 at scripts_pl/decode/slave.pl line
173.
victor@victor-kubuntu:~/fdm$
fdm.html:
MODULE: DECODE Decoding using models previously trained (2010-03-01 01:34)
Decoding 2 segments starting at 0 (part 1 of 1)
sphinx3_decode Log File
This step had 2 ERROR messages and 1 WARNING messages. Please check the log
file for details.
completed
Aligning results to find error rate
/home/victor/fdm/logdir/decode/fdm-1-1.log
INFO: info.c(65): Host: 'victor-kubuntu'
INFO: info.c(66): Directory: '/home/victor/fdm'
INFO: info.c(70): /home/victor/fdm/bin/.libs/lt-sphinx3_decode Compiled on:
Feb 9 2010, AT: 02:23:44
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-cep2spec no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no yes
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.333333e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-spec2cep no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-02
INFO: Initialization of the log add table
INFO: Log-Add table size = 29356 x 2 >> 0
INFO:
INFO: feat.c(848): Initializing feature stream to type: '1s_c_d_dd',
ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: kbcore.c(488): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Number of streams = 1
INFO: Vector size of stream: 39
INFO: Number of subvectors = 0
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/mdef
INFO: Mean File: /home/victor/fdm/model_parameters/fdm.cd_cont_1000/means
INFO: Variance File:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/variances
INFO: Mixture Weight File:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/mixture_weights
INFO: Transition Matrices File:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/transition_matrices
INFO: mdef.c(682): Reading model definition:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/mdef
INFO: Initialization of mdef_t, report:
INFO: 17 CI-phone, 244 CD-phone, 3 emitstate/phone, 51 CI-sen, 234 Sen, 116
Sen-Seq
INFO:
INFO: kbcore.c(298): Using optimized GMM computation for Continuous HMM, -topn
will be ignored
INFO: cont_mgau.c(163): Reading mixture gaussian file
'/home/victor/fdm/model_parameters/fdm.cd_cont_1000/means'
INFO: cont_mgau.c(422): 234 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(163): Reading mixture gaussian file
'/home/victor/fdm/model_parameters/fdm.cd_cont_1000/variances'
INFO: cont_mgau.c(422): 234 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(523): Reading mixture weights file
'/home/victor/fdm/model_parameters/fdm.cd_cont_1000/mixture_weights'
INFO: cont_mgau.c(678): Read 234 x 8 mixture weights
INFO: cont_mgau.c(706): Removing uninitialized Gaussian densities
30 31 60 67 68 69 70 71 74 75 76 77 81 82 84 86 88 91 93 101 110 113 116 117
118 123 124 125 126 127 128 129 134 135 136 138 139 141 142 145 148 155 158
164 172 174 176 177 178 203 209 210 215 218 220 222 224 226 231
WARNING: "cont_mgau.c", line 780: 766 densities removed (59 mixtures removed
entirely)
INFO: cont_mgau.c(796): Applying variance floor
INFO: cont_mgau.c(814): 1089 variance values floored
INFO: cont_mgau.c(862): Precomputing Mahalanobis distance invariants
INFO: tmat.c(169): Reading HMM transition probability matrices:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/transition_matrices
INFO: Initialization of tmat_t, report:
INFO: Read 17 transition matrices of size 3x4
INFO:
ERROR: "cmd_ln.c", line 877: Unknown argument: -mdef_fillers
ERROR: "cmd_ln.c", line 877: Unknown argument: -mdef_fillers
INFO: dict.c(384): Reading main dictionary: /home/victor/fdm/etc/fdm.dic
INFO: dict.c(387): 14 words read
INFO: dict.c(392): Reading filler dictionary: /home/victor/fdm/etc/fdm.filler
INFO: dict.c(395): 3 words read
INFO: dict.c(428): Added 0 fillers from mdef file
INFO: Initialization of dict_t, report:
INFO: No of CI phone: 0
INFO: Max word: 4113
INFO: No of word: 17
INFO:
INFO: lm.c(607): LM read('/home/victor/fdm/etc/fdm.ug.lm.DMP', lw= 23.00, wip=
0.20, uw= 0.70)
INFO: lm.c(609): Reading LM file /home/victor/fdm/etc/fdm.ug.lm.DMP (LM name
"default")
INFO: lm_3g_dmp.c(630): Reading LM in 16 bits format
INFO: lm_3g_dmp.c(686): Read 16 unigrams
INFO: lm_3g_dmp.c(759): 45 bigrams
INFO: lm_3g_dmp.c(832): 122 bigrams
INFO: lm_3g_dmp.c(902): 9 bigram prob entries
INFO: lm_3g_dmp.c(936): 4 trigram bowt entries
INFO: lm_3g_dmp.c(967): 5 trigram prob entries
INFO: lm_3g_dmp.c(998): 1 trigram segtable entries (512 segsize)
INFO: lm_3g_dmp.c(1053): 16 word strings
INFO: lm.c(691): The LM routine is operating at 16 bits mode
INFO: Initialization of fillpen_t, report:
INFO: Language weight =23.000000
INFO: Word Insertion Penalty =0.200000
INFO: Silence probability =0.100000
INFO: Filler probability =0.100000
INFO:
INFO: dict2pid.c(599): Building PID tables for dictionary
INFO: Initialization of dict2pid_t, report:
INFO: Dict2pid is in composite triphone mode
INFO: 73 composite states; 25 composite sseq
INFO:
INFO: kbcore.c(633): Inside kbcore: Verifying models consistency ......
INFO: kbcore.c(655): End of Initialization of Core Models:
INFO: Initialization of beam_t, report:
INFO: Parameters used in Beam Pruning of Viterbi Search:
INFO: Beam=-921172
INFO: PBeam=-383821
INFO: WBeam=-614114 (Skip=0)
INFO: WEndBeam=-614114
INFO: No of CI Phone assumed=17
INFO:
INFO: Initialization of fast_gmm_t, report:
INFO: Parameters used in Fast GMM computation:
INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
Distance-based Down Sampling? 0
INFO: GMM-level: CI phone beam -614114. MAX CD 100000
INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
would be used as Gaussian Score? =0 SubVQ Beam -19366
INFO:
INFO: Initialization of pl_t, report:
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme look-ahead type = 0
INFO: Phoneme look-ahead beam size = -614114
INFO: No of CI Phones assumed=17
INFO:
INFO: Initialization of ascr_t, report:
INFO: No. of CI senone =51
INFO: No. of senone = 234
INFO: No. of composite senone = 73
INFO: No. of senone sequence = 116
INFO: No. of composite senone sequence=25
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme lookahead window = 1
INFO:
INFO: kb.c(308): SEARCH MODE INDEX 4
INFO: srch.c(374): Search Initialization.
INFO: srch_time_switch_tree.c(284): -Nstalextree is omitted in TST search.
INFO: lextree.c(223): Creating Unigram Table for lm (name: default)
INFO: lextree.c(236): Size of word table after unigram + words in class: 14.
INFO: lextree.c(245): Size of word table after adding alternative prons: 14.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 7
INFO: Number of nodes 64
INFO: Number of links in the tree 116
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 96
INFO: The size of a gnode_t 12
INFO:
INFO: srch_time_switch_tree.c(344): Lextrees (0) for lm 0, its name is
default, it has 64 nodes(ug)
INFO: lextree.c(223): Creating Unigram Table for lm (name: default)
INFO: lextree.c(236): Size of word table after unigram + words in class: 14.
INFO: lextree.c(245): Size of word table after adding alternative prons: 14.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 7
INFO: Number of nodes 64
INFO: Number of links in the tree 116
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 96
INFO: The size of a gnode_t 12
INFO:
INFO: srch_time_switch_tree.c(344): Lextrees (1) for lm 0, its name is
default, it has 64 nodes(ug)
INFO: lextree.c(223): Creating Unigram Table for lm (name: default)
INFO: lextree.c(236): Size of word table after unigram + words in class: 14.
INFO: lextree.c(245): Size of word table after adding alternative prons: 14.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 7
INFO: Number of nodes 64
INFO: Number of links in the tree 116
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 96
INFO: The size of a gnode_t 12
INFO:
INFO: srch_time_switch_tree.c(344): Lextrees (2) for lm 0, its name is
default, it has 64 nodes(ug)
INFO: srch_time_switch_tree.c(351): Time for building trees, 0.0000 CPU 0.0002
Clk
INFO: srch_time_switch_tree.c(373): Lextrees(0), 1 nodes(filler)
INFO: srch_time_switch_tree.c(373): Lextrees(1), 1 nodes(filler)
INFO: srch_time_switch_tree.c(373): Lextrees(2), 1 nodes(filler)
INFO: vithist.c(169): Initializing Viterbi-history module
INFO: Initialization of srch_t, report:
INFO: Operation Mode = 4, Operation Name = fwdtree
INFO:
INFO: stat.c(225): SUMMARY: 0 fr , No report
root 4079 0.0 0.0 3676 1436 pts/1 S+ 01:53 0:00 /home/victor/fdm/bin/.libs/lt-
sphinx3_decode -senmgau .cont. -hmm
/home/victor/fdm/model_parameters/fdm.cd_cont_1000 -lw 23 -feat 1s_c_d_dd
-beam 1e-120 -wbeam 1e-80 -dict /home/victor/fdm/etc/fdm.dic -fdict
/home/victor/fdm/etc/fdm.filler -lm /home/victor/fdm/etc/fdm.ug.lm.DMP -wip
0.2 -ctl /home/victor/fdm/etc/fdm_test.fileids -ctloffset 0 -ctlcount 2
-cepdir /home/victor/fdm/feat -cepext .mfc -hyp
/home/victor/fdm/result/fdm-1-1.match -agc none -varnorm no -cmn current
root 4094 0.0 0.0 1752 480 pts/1 S+ 01:53 0:00 sh -c ps aguxwww | grep
sphinx3_decode
root 4096 0.0 0.0 3056 796 pts/1 R+ 01:53 0:00 grep sphinx3_decode
Mon Mar 1 01:53:28 2010
ERROR: "cmd_ln.c", line 877: Unknown argument: -mdef_fillers --> sphinxbase
fdm.align also contains:
Use of uninitialized value $total_words in division (/) at
scripts_pl/decode/word_align.pl line 139, <ref> line 1.
Use of uninitialized value $total_cost in division (/) at
scripts_pl/decode/word_align.pl line 139, <ref> line 1.
Illegal division by zero at scripts_pl/decode/word_align.pl line 139, <ref>
line 1.
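A side note on the "error code 65280" above: assuming slave.pl reports the raw status returned by Perl's system(), the actual exit code is in the high byte. 65280 is 255 shifted left by 8 bits, and 255 is what Perl normally exits with after a die such as the division by zero. A minimal shell sketch of the arithmetic:

```shell
# Raw wait status as printed by slave.pl (assumed to come from Perl's system()).
status=65280
# The exit code of the child process is stored in the high byte.
exit_code=$(( status >> 8 ))
echo "$exit_code"
```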
It looks like you compiled a rather old sphinx3, one where the mdef argument
is set in the scripts. The new sphinx3 decoding script,
scripts_pl/decode/s3decode.pl, doesn't have this argument.
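One way to see whether the local scripts mention the argument is to search the decode scripts for it. This is only a sketch: find_flag is a hypothetical helper name, and the directory in the usage comment is taken from the paths in this thread.

```shell
# Hypothetical helper: list every line under a directory that mentions a flag.
# -e keeps grep from treating a pattern that starts with '-' as an option.
find_flag() {
    flag=$1
    dir=$2
    grep -rn -e "$flag" "$dir"
}

# Example (run from the project directory):
# find_flag mdef scripts_pl/decode
```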
victor@victor-kubuntu:~/fdm$ sphinx3_decode 2>&1 | grep mdef_fillers
-mdef_fillers no Automatically add filler words from the model definition file
victor@victor-kubuntu:~/fdm$
ERROR: "cmd_ln.c", line 630: No arguments given, exiting
Arguments list definition:
-adchdr 0 Number of bytes to skip at the beginning of a waveform file (44 for WAV, 1024 for Sphere)
-adcin no Input is waveform data rather than cepstra (-cepdir and -cepext are still used)
-agc none Automatic gain control for c0 ('max', 'emax', 'noise', or 'none')
-agcthresh 2.0 Initial threshold for automatic gain control
-alpha 0.97 Preemphasis parameter
-backtrace yes Whether detailed backtrace information (word segmentation/scores) shown in log
-beam 1.0e-55 Beam selecting active HMMs (relative to best) in each frame
-bestpath no Whether to run bestpath DAG search after forward Viterbi pass
-bestpathlw Language weight for bestpath DAG search (default: same as -lw)
-bestscoredir (Mode 3) Directory for writing best score/frame (used to set beamwidth; one file/utterance)
-bestsenscrdir When Best senone score directory.
-bghist no Bigram-mode: If TRUE only one BP entry/frame; else one per LM state
-bptbldir Directory in which to dump word Viterbi back pointer table (for debugging)
-bptblsize 32768 Number of BPtable entries to allocate initially (grown as necessary)
-cb2mllr .1cls. Senone to MLLR transformation matrix mapping file (or .1cls.)
-cep2spec no Input is cepstral files, output is log spectral files
-cepdir Input cepstrum files directory (prefixed to filespecs in control file)
-cepext .mfc Input cepstrum files extension (prefixed to filespecs in control file)
-ceplen 13 Number of components in the input feature vector
-ci_pbeam 1e-80 CI phone beam for CI-based GMM Selection.
-cmn current Cepstral mean normalization scheme ('current', 'prior', or 'none')
-cmninit 8.0 Initial values (comma-separated) for cepstral mean when 'prior' is used
-cond_ds no Conditional Down-sampling, override normal down sampling. require specify a gaussian selection map
-ctl Control file listing utterances to be processed
-ctlcount 1000000000 No. of utterances to be processed (after skipping -ctloffset entries)
-ctloffset 0 No. of utterances at the beginning of -ctl file to be skipped
-ctl_lm (Not used in mode 2 and 3) Control file that list the corresponding LMs
-ctl_mllr Control file that list the corresponding MLLR matrix for an utterance
-dagfudge 2 (0..2); 1 or 2: add edge if endframe == startframe; 2: if start == end-1
-debug Verbosity level for debugging messages
-dict Main pronunciation dictionary (lexicon) input file
-dist_ds no Distance-based Down-sampling, override normal down sampling.
-dither no Add 1/2-bit noise
-doublebw no Use double bandwidth filters (same center freq)
-ds 1 Ratio of Down-sampling the frame computation.
-epl 3 (Mode 4 only) Entries Per Lextree; #successive entries into one lextree before lextree-entries shifted to the next
-fdict Silence and filler (noise) word pronunciation dictionary input file
-feat 1s_c_d_dd Feature stream type, depends on the acoustic model
-featparams File containing feature extraction parameters.
-fillpen Filler word probabilities input file (used in place of -silpen and -noisepen)
-fillprob 0.1 Default non-silence filler word probability
-frate 100 Frame rate
-fsg (FSG Mode (Mode 2) only) Finite state grammar
-fsgusealtpron yes (FSG Mode (Mode 2) only) Use alternative pronunciations for FSG
-fsgusefiller yes (FSG Mode (Mode 2) only) Insert filler words at each state.
-gs Gaussian Selection Mapping.
-gs4gs yes A flag that specified whether the input GS map will be used for Gaussian Selection. If it is disabled, the map will only provide information to other modules.
-hmm Directory for specifying Sphinx 3's hmm, the following files are assummed to be present, mdef, mean, var, mixw, tmat. If -mdef, -mean, -var, -mixw or -tmat are specified, they will override this command.
-hmmdump no Whether to dump active HMM details to stderr (for debugging)
-hmmdumpef 200000000 (Mode 3 only) Ending frame for dumping all active HMMs (for debugging/diagnosis/analysis)
-hmmdumpsf 200000000 (Mode 3 only) Starting frame for dumping all active HMMs (for debugging/diagnosis/analysis)
-hmmhistbinsize 5000 (Only used in Mode 4 and 5) Performance histogram: #frames vs #HMMs active; #HMMs/bin in this histogram
-hyp Recognition result file, with only words
-hypseg Recognition result file, with word segmentations and scores
-hypsegscore_unscale yes When displaying the results, whether to unscale back the acoustic score with the best score in a frame
-inlatdir Input word-lattice directory with per-utt files for restricting words searched
-inlatwin 50 Input word-lattice words starting within +/- <this argument> of current frame considered during search
-input_endian little Endianness of input data, big or little, ignored if NIST or MS Wav
-kdmaxbbi -1 Maximum number of Gaussians per leaf node in kd-Trees
-kdmaxdepth 0 Maximum depth of kd-Trees to use
-kdtree kd-Tree file for Gaussian selection (for .s2semi models only)
-latcompress yes Whether lattice is compressed.
-latext lat.gz Filename extension for lattice files (gzip compressed, by default - remove .gz for uncompressed)
-lda File containing transformation matrix to be applied to features (single-stream features only)
-ldadim 0 Dimensionality of output of feature transformation (0 to use entire matrix)
-lextreedump 0 Whether to dump the lextree structure to stderr (for debugging), 1 for Ravi's format, 2 for Dot format, Larger than 2 will be treated as Ravi's format
-lifter 0 Length of sin-curve for liftering, or 0 for no liftering.
-lm Word trigram language model input file
-lmctlfn Specify a set of language model
-lmdumpdir The directory for dumping the DMP file.
-lmname Name of language model in -lmctlfn to use for all utterances
-log3table yes Determines whether to use the logs3 table or to compute the values at run time.
-logbase 1.0003 Base in which all log-likelihoods calculated
-logfn Log file (default stdout/stderr)
-logspec no Write out logspectral files instead of cepstra
-lowerf 133.33334 Lower edge of filters
-lts_mismatch no Use CMUDict letter-to-sound rules to generate pronunciations for LM words doesn't appear in the dictionary . Use it with care. It assumes that the phone set in the mdef and dict are the same as the LTS rule.
-lw 9.5 Language weight
-maxcdsenpf 100000 Max no. of distinct CD senone will be computed.
-maxedge 2000000 Max DAG edges allowed in utterance; aborted if exceeded; controls memory usage
-maxhistpf 100 (Only used in Mode 4 and 5) Max no. of histories to maintain at each frame
-maxhmmpf 20000 (Only used in Mode 4 and 5) Max no. of active HMMs to maintain at each frame; approx.
-maxlmop 100000000 Max LMops in utterance after which it is aborted; controls CPU use (see maxlpf)
-maxlpf 40000 Max LMops/frame after which utterance aborted; controls CPU use (see maxlmop)
-maxppath 1000000 Max partial paths created after which utterance aborted; controls CPU/memory use
-maxwpf 20 (Only used in Mode 4 and 5) Max no. of distinct word exits to maintain at each frame
-mdef Model definition input file
-mean Mixture gaussian means input file
-min_endfr 3 Nodes ignored during search if they persist for fewer than so many end frames
-mixw Senone mixture weights input file
-mixwfloor 0.0000001 Senone mixture weights floor (applied to data from -mixw file)
-mllr MLLR transfomation matrix to be applied to mixture gaussian means
-mode fwdtree Decoding mode, one of allphone, fsg, fwdflat, fwdtree.
-nbest 200 Max. n-best hypotheses to generate per utterance
-nbestdir Input word-lattice directory with per-utt files for restricting words searched
-nbestext nbest.gz N-best filename extension (.gz or .Z extension for compression)
-ncep 13 Number of cep coefficients
-nfft 512 Size of FFT
-nfilt 40 Number of filter banks
-Nlextree 3 (Mode 4 only) No. of lextrees to be instantiated; entries into them staggered in time
-Nstalextree 25 (Mode 5 only) No. of lextrees to be instantiated statically;
-op_mode -1 Operation mode, for internal use only.
-outlatdir Directory in which to dump word lattices
-outlatfmt s3 Format in which to dump word lattices (either 's3' or 'htk')
-pbeam 1.0e-50 Beam selecting HMMs transitioning to successors in each frame
-pheurtype 0 0 = bypass, 1= sum of max, 2 = sum of avg, 3 = sum of 1st senones only
-phonepen 1.0 (Mode 2 and 3 only) Word insertion penalty
-phsegdir (Allphone mode only) Output directory for phone segmentation files
-pl_beam 1.0e-80 Beam for phoneme look-ahead.
-pl_window 1 Window size (actually window size-1) of phoneme look-ahead.
-ppathdebug no Generate debugging information for N-best search.
-ptranskip 0 (Not used in Mode 3) Use wbeam for phone transitions every so many frames (if >= 1)
-remove_dc no Remove DC offset from each frame
-round_filters yes Round mel filter frequencies to DFT points
-samprate 16000 Sampling rate
-seed -1 Seed for random number generator; if less than zero, pick our own
-sendump (S2 GMM computation only) Senone dump (compressed mixture weights) input file
-senmgau .cont. Senone to mixture-gaussian mapping file (or .semi. or .cont.)
-silprob 0.1 Default silence word probability
-smoothspec no Write out cepstral-smoothed logspectral files
-spec2cep no Input is log spectral files, output is cepstral files
-subvq Sub-vector quantized form of acoustic model
-subvqbeam 3.0e-3 Beam selecting best components within each mixture Gaussian
-svq4svq no A flag that specified whether the input SVQ will be used as approximate scores of the Gaussians
-svspec Subvector specification (e.g., 24,0-11/25,12-23/26-38 or 0-12/13-25/26-38)
-tighten_factor 0.5 From 0 to 1, it tightens the beam width when the frame is dropped
-tmat HMM state transition matrix input file
-tmatfloor 0.0001 HMM state transition probability floor (applied to -tmat file)
-topn 4 (S3.0 GMM Computation only) No. of top scoring densities computed in each mixture gaussian codebook (semi-continuous models only)
-topn_beam 0 (S2 GMM Computation only) Beam width used to determine top-N Gaussians (or a list, per-feature)
-tracewhmm (Mode 3 only) Word whose active HMMs are to be traced (for debugging/diagnosis/analysis)
-transform legacy Which type of transform to use to calculate cepstra (legacy, dct, or htk)
-treeugprob yes If true, Use unigram probs in lextree
-unit_area yes Normalize mel filters to unit area
-upperf 6855.4976 Upper edge of filters
-utt Utterance file to be processed (-ctlcount argument times)
-uw 0.7 Unigram weight
-var Mixture gaussian variances input file
-varfloor 0.0001 Mixture gaussian variance floor (applied to data from -var file)
-varnorm no Variance normalize each utterance (only if CMN == current)
-verbose no Show input filenames
-vqeval 3 Number of subvectors to use for SubVQ-based frame evaluation (3 for all)
-warp_params Parameters defining the warping function
-warp_type inverse_linear Warping function type (or shape)
-wbeam 1.0e-35 Beam selecting word-final HMMs exiting in each frame
-wend_beam 1.0e-80 Beam selecting word-final HMMs exiting in each frame
-wip 0.7 Word insertion penalty
-wlen 0.025625 Hamming window length
-worddumpef 200000000 (Mode 3 only) Ending frame for dumping all active words (for debugging/diagnosis/analysis)
-worddumpsf 200000000 (Mode 3 only) Starting frame for dumping all active words (for debugging/diagnosis/analysis)
ERROR: "cmd_ln.c", line 648: cmd_ln_parse_r failed
ERROR: "cmd_ln.c", line 697: cmd_ln_parse failed, forced exit
victor@victor-kubuntu:~/fdm/bin/.libs$
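The sphinx3_decode found on the PATH lists -mdef_fillers, while the argument list printed from ~/fdm/bin/.libs does not, so the two binaries may come from different builds. A small sketch for checking any one binary, assuming (as in the output above) that it prints its argument list when run with no arguments; has_flag is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if the command's usage output lists the flag
# at the start of a line. The command is run with no extra arguments.
has_flag() {
    flag=$1
    shift
    "$@" 2>&1 | grep -q -e "^$flag[[:space:]]"
}

# Example:
# has_flag -mdef_fillers sphinx3_decode && echo "PATH binary knows the flag"
# has_flag -mdef_fillers ./lt-sphinx3_decode || echo "local binary does not"
```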
To help us find the issue, please submit the sphinx3 configuration log
(config.log) and the sphinx3 build log, which you can create by redirecting
the build output to a file.
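For the build log, something along these lines should work. It is a sketch, not the only way: log_build is a hypothetical helper name, and build.log is just an example file name.

```shell
# Hypothetical helper: run a command, show its output, and save a copy
# (stdout and stderr together) in the given log file.
log_build() {
    logfile=$1
    shift
    "$@" 2>&1 | tee "$logfile"
}

# Example (run from the top of the sphinx3 source tree):
# log_build build.log make
```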
Hi. I think I solved the problem.
Initially I put my database samples in the folder wav/fdm_clstk (with
etc/fdm_train.fileids and etc/fdm_train.transcription). I left the folder
wav/fdmtest_clstk and its corresponding files empty, because I assumed only
the training data was needed. Or am I right?
I have now put data in them as well, and I managed to decode.
Initially I put my database samples in the folder wav/fdm_clstk (with
etc/fdm_train.fileids and etc/fdm_train.transcription). I left the folder
wav/fdmtest_clstk and its corresponding files empty, because I assumed only
the training data was needed. Or am I right?
I don't understand your question, but it's good the issue is resolved. One
less problem to care about.
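For anyone who hits the same "SUMMARY: 0 fr" result: a quick sanity check is to confirm that every entry in the test control file has a non-empty feature file. The sketch below assumes the layout from the decode command in this thread (-ctl etc/fdm_test.fileids, -cepdir feat, -cepext .mfc); missing_feats is a hypothetical helper name.

```shell
# Hypothetical helper: print each control-file entry whose feature file
# is missing or empty.
missing_feats() {
    ctl=$1
    cepdir=$2
    ext=$3
    while IFS= read -r utt; do
        # Skip blank lines in the control file.
        [ -n "$utt" ] || continue
        # -s is true only if the file exists and has a size greater than zero.
        [ -s "$cepdir/$utt$ext" ] || echo "$utt"
    done < "$ctl"
}

# Example (run from the project directory):
# missing_feats etc/fdm_test.fileids feat .mfc
```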
Hello again,
I'm having a problem with decoding that I can't solve. I'm using sphinx3 on
Kubuntu 9.10.
victor@victor-kubuntu:~/fdm$ sudo perl scripts_pl/decode/slave.pl
password for victor:
MODULE: DECODE Decoding using models previously trained
Decoding 2 segments starting at 0 (part 1 of 1)
0%
This step had 2 ERROR messages and 1 WARNING messages. Please check the log
file for details.
Aligning results to find error rate
word_align.pl failed with error code 65280 at scripts_pl/decode/slave.pl line
173.
victor@victor-kubuntu:~/fdm$
fdm.html:
MODULE: DECODE Decoding using models previously trained (2010-03-01 01:34)
Decoding 2 segments starting at 0 (part 1 of 1)
sphinx3_decode Log File
This step had 2 ERROR messages and 1 WARNING messages. Please check the log
file for details.
completed
Aligning results to find error rate
/home/victor/fdm/logdir/decode/fdm-1-1.log
INFO: info.c(65): Host: 'victor-kubuntu'
INFO: info.c(66): Directory: '/home/victor/fdm'
INFO: info.c(70): /home/victor/fdm/bin/.libs/lt-sphinx3_decode Compiled on:
Feb 9 2010, AT: 02:23:44
INFO: cmd_ln.c(510): Parsing command line:
/home/victor/fdm/bin/.libs/lt-sphinx3_decode \
-senmgau .cont. \
-hmm /home/victor/fdm/model_parameters/fdm.cd_cont_1000 \
-lw 23 \
-feat 1s_c_d_dd \
-beam 1e-120 \
-wbeam 1e-80 \
-dict /home/victor/fdm/etc/fdm.dic \
-fdict /home/victor/fdm/etc/fdm.filler \
-lm /home/victor/fdm/etc/fdm.ug.lm.DMP \
-wip 0.2 \
-ctl /home/victor/fdm/etc/fdm_test.fileids \
-ctloffset 0 \
-ctlcount 2 \
-cepdir /home/victor/fdm/feat \
-cepext .mfc \
-hyp /home/victor/fdm/result/fdm-1-1.match \
-agc none \
-varnorm no \
-cmn current
Current configuration:
-adchdr 0 0
-adcin no no
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-backtrace yes yes
-beam 1.0e-55 1.000000e-120
-bestpath no no
-bestpathlw 0.000000e+00
-bestscoredir
-bestsenscrdir
-bghist no no
-bptbldir
-bptblsize 32768 32768
-cb2mllr .1cls. .1cls.
-cep2spec no no
-cepdir /home/victor/fdm/feat
-cepext .mfc .mfc
-ceplen 13 13
-ci_pbeam 1e-80 1.000000e-80
-cmn current current
-cmninit 8.0 8.0
-cond_ds no no
-ctl /home/victor/fdm/etc/fdm_test.fileids
-ctlcount 1000000000 2
-ctloffset 0 0
-ctl_lm
-ctl_mllr
-dagfudge 2 2
-debug 0
-dict /home/victor/fdm/etc/fdm.dic
-dist_ds no no
-dither no no
-doublebw no no
-ds 1 1
-epl 3 3
-fdict /home/victor/fdm/etc/fdm.filler
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillpen
-fillprob 0.1 1.000000e-01
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-gs
-gs4gs yes yes
-hmm /home/victor/fdm/model_parameters/fdm.cd_cont_1000
-hmmdump no no
-hmmdumpef 200000000 200000000
-hmmdumpsf 200000000 200000000
-hmmhistbinsize 5000 5000
-hyp /home/victor/fdm/result/fdm-1-1.match
-hypseg
-hypsegscore_unscale yes yes
-inlatdir
-inlatwin 50 50
-input_endian little little
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latcompress yes yes
-latext lat.gz lat.gz
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm /home/victor/fdm/etc/fdm.ug.lm.DMP
-lmctlfn
-lmdumpdir
-lmname
-log3table yes yes
-logbase 1.0003 1.000300e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lts_mismatch no no
-lw 9.5 2.300000e+01
-maxcdsenpf 100000 100000
-maxedge 2000000 2000000
-maxhistpf 100 100
-maxhmmpf 20000 20000
-maxlmop 100000000 100000000
-maxlpf 40000 40000
-maxppath 1000000 1000000
-maxwpf 20 20
-mdef
-mean
-min_endfr 3 3
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mode fwdtree fwdtree
-nbest 200 200
-nbestdir
-nbestext nbest.gz nbest.gz
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-Nlextree 3 3
-Nstalextree 25 25
-op_mode -1 -1
-outlatdir
-outlatfmt s3 s3
-pbeam 1.0e-50 1.000000e-50
-pheurtype 0 0
-phonepen 1.0 1.000000e+00
-phsegdir
-pl_beam 1.0e-80 1.000000e-80
-pl_window 1 1
-ppathdebug no no
-ptranskip 0 0
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senmgau .cont. .cont.
-silprob 0.1 1.000000e-01
-smoothspec no no
-spec2cep no no
-subvq
-subvqbeam 3.0e-3 3.000000e-03
-svq4svq no no
-svspec
-tighten_factor 0.5 5.000000e-01
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-tracewhmm
-transform legacy legacy
-treeugprob yes yes
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-utt
-uw 0.7 7.000000e-01
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-vqeval 3 3
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 1.0e-35 1.000000e-80
-wend_beam 1.0e-80 1.000000e-80
-wip 0.7 2.000000e-01
-wlen 0.025625 2.562500e-02
-worddumpef 200000000 200000000
-worddumpsf 200000000 200000000
INFO: kbcore.c(441): Begin Initialization of Core Models:
INFO: cmd_ln.c(510): Parsing command line:
\
-alpha 0.97 \
-dither yes \
-doublebw no \
-nfilt 40 \
-ncep 13 \
-lowerf 133.33334 \
-upperf 6855.4976 \
-nfft 512 \
-wlen 0.0256 \
-transform legacy \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-varnorm no
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-cep2spec no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no yes
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.333333e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-spec2cep no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-02
INFO: Initialization of the log add table
INFO: Log-Add table size = 29356 x 2 >> 0
INFO:
INFO: feat.c(848): Initializing feature stream to type: '1s_c_d_dd',
ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: kbcore.c(488): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Number of streams = 1
INFO: Vector size of stream: 39
INFO: Number of subvectors = 0
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/mdef
INFO: Mean File: /home/victor/fdm/model_parameters/fdm.cd_cont_1000/means
INFO: Variance File:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/variances
INFO: Mixture Weight File:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/mixture_weights
INFO: Transition Matrices File:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/transition_matrices
INFO: mdef.c(682): Reading model definition:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/mdef
INFO: Initialization of mdef_t, report:
INFO: 17 CI-phone, 244 CD-phone, 3 emitstate/phone, 51 CI-sen, 234 Sen, 116
Sen-Seq
INFO:
INFO: kbcore.c(298): Using optimized GMM computation for Continuous HMM, -topn
will be ignored
INFO: cont_mgau.c(163): Reading mixture gaussian file
'/home/victor/fdm/model_parameters/fdm.cd_cont_1000/means'
INFO: cont_mgau.c(422): 234 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(163): Reading mixture gaussian file
'/home/victor/fdm/model_parameters/fdm.cd_cont_1000/variances'
INFO: cont_mgau.c(422): 234 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(523): Reading mixture weights file
'/home/victor/fdm/model_parameters/fdm.cd_cont_1000/mixture_weights'
INFO: cont_mgau.c(678): Read 234 x 8 mixture weights
INFO: cont_mgau.c(706): Removing uninitialized Gaussian densities
30 31 60 67 68 69 70 71 74 75 76 77 81 82 84 86 88 91 93 101 110 113 116 117
118 123 124 125 126 127 128 129 134 135 136 138 139 141 142 145 148 155 158
164 172 174 176 177 178 203 209 210 215 218 220 222 224 226 231
WARNING: "cont_mgau.c", line 780: 766 densities removed (59 mixtures removed
entirely)
INFO: cont_mgau.c(796): Applying variance floor
INFO: cont_mgau.c(814): 1089 variance values floored
INFO: cont_mgau.c(862): Precomputing Mahalanobis distance invariants
INFO: tmat.c(169): Reading HMM transition probability matrices:
/home/victor/fdm/model_parameters/fdm.cd_cont_1000/transition_matrices
INFO: Initialization of tmat_t, report:
INFO: Read 17 transition matrices of size 3x4
INFO:
ERROR: "cmd_ln.c", line 877: Unknown argument: -mdef_fillers
ERROR: "cmd_ln.c", line 877: Unknown argument: -mdef_fillers
INFO: dict.c(384): Reading main dictionary: /home/victor/fdm/etc/fdm.dic
INFO: dict.c(387): 14 words read
INFO: dict.c(392): Reading filler dictionary: /home/victor/fdm/etc/fdm.filler
INFO: dict.c(395): 3 words read
INFO: dict.c(428): Added 0 fillers from mdef file
INFO: Initialization of dict_t, report:
INFO: No of CI phone: 0
INFO: Max word: 4113
INFO: No of word: 17
INFO:
INFO: lm.c(607): LM read('/home/victor/fdm/etc/fdm.ug.lm.DMP', lw= 23.00, wip=
0.20, uw= 0.70)
INFO: lm.c(609): Reading LM file /home/victor/fdm/etc/fdm.ug.lm.DMP (LM name
"default")
INFO: lm_3g_dmp.c(630): Reading LM in 16 bits format
INFO: lm_3g_dmp.c(686): Read 16 unigrams
INFO: lm_3g_dmp.c(759): 45 bigrams
INFO: lm_3g_dmp.c(832): 122 bigrams
INFO: lm_3g_dmp.c(902): 9 bigram prob entries
INFO: lm_3g_dmp.c(936): 4 trigram bowt entries
INFO: lm_3g_dmp.c(967): 5 trigram prob entries
INFO: lm_3g_dmp.c(998): 1 trigram segtable entries (512 segsize)
INFO: lm_3g_dmp.c(1053): 16 word strings
INFO: lm.c(691): The LM routine is operating at 16 bits mode
INFO: Initialization of fillpen_t, report:
INFO: Language weight =23.000000
INFO: Word Insertion Penalty =0.200000
INFO: Silence probability =0.100000
INFO: Filler probability =0.100000
INFO:
INFO: dict2pid.c(599): Building PID tables for dictionary
INFO: Initialization of dict2pid_t, report:
INFO: Dict2pid is in composite triphone mode
INFO: 73 composite states; 25 composite sseq
INFO:
INFO: kbcore.c(633): Inside kbcore: Verifying models consistency ......
INFO: kbcore.c(655): End of Initialization of Core Models:
INFO: Initialization of beam_t, report:
INFO: Parameters used in Beam Pruning of Viterbi Search:
INFO: Beam=-921172
INFO: PBeam=-383821
INFO: WBeam=-614114 (Skip=0)
INFO: WEndBeam=-614114
INFO: No of CI Phone assumed=17
INFO:
INFO: Initialization of fast_gmm_t, report:
INFO: Parameters used in Fast GMM computation:
INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
Distance-based Down Sampling? 0
INFO: GMM-level: CI phone beam -614114. MAX CD 100000
INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
would be used as Gaussian Score? =0 SubVQ Beam -19366
INFO:
INFO: Initialization of pl_t, report:
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme look-ahead type = 0
INFO: Phoneme look-ahead beam size = -614114
INFO: No of CI Phones assumed=17
INFO:
INFO: Initialization of ascr_t, report:
INFO: No. of CI senone =51
INFO: No. of senone = 234
INFO: No. of composite senone = 73
INFO: No. of senone sequence = 116
INFO: No. of composite senone sequence=25
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme lookahead window = 1
INFO:
INFO: kb.c(308): SEARCH MODE INDEX 4
INFO: srch.c(374): Search Initialization.
INFO: srch_time_switch_tree.c(284): -Nstalextree is omitted in TST search.
INFO: lextree.c(223): Creating Unigram Table for lm (name: default)
INFO: lextree.c(236): Size of word table after unigram + words in class: 14.
INFO: lextree.c(245): Size of word table after adding alternative prons: 14.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 7
INFO: Number of nodes 64
INFO: Number of links in the tree 116
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 96
INFO: The size of a gnode_t 12
INFO:
INFO: srch_time_switch_tree.c(344): Lextrees (0) for lm 0, its name is
default, it has 64 nodes(ug)
INFO: lextree.c(223): Creating Unigram Table for lm (name: default)
INFO: lextree.c(236): Size of word table after unigram + words in class: 14.
INFO: lextree.c(245): Size of word table after adding alternative prons: 14.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 7
INFO: Number of nodes 64
INFO: Number of links in the tree 116
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 96
INFO: The size of a gnode_t 12
INFO:
INFO: srch_time_switch_tree.c(344): Lextrees (1) for lm 0, its name is
default, it has 64 nodes(ug)
INFO: lextree.c(223): Creating Unigram Table for lm (name: default)
INFO: lextree.c(236): Size of word table after unigram + words in class: 14.
INFO: lextree.c(245): Size of word table after adding alternative prons: 14.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 7
INFO: Number of nodes 64
INFO: Number of links in the tree 116
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 96
INFO: The size of a gnode_t 12
INFO:
INFO: srch_time_switch_tree.c(344): Lextrees (2) for lm 0, its name is
default, it has 64 nodes(ug)
INFO: srch_time_switch_tree.c(351): Time for building trees, 0.0000 CPU 0.0002
Clk
INFO: srch_time_switch_tree.c(373): Lextrees(0), 1 nodes(filler)
INFO: srch_time_switch_tree.c(373): Lextrees(1), 1 nodes(filler)
INFO: srch_time_switch_tree.c(373): Lextrees(2), 1 nodes(filler)
INFO: vithist.c(169): Initializing Viterbi-history module
INFO: Initialization of srch_t, report:
INFO: Operation Mode = 4, Operation Name = fwdtree
INFO:
INFO: stat.c(225): SUMMARY: 0 fr , No report
root 4079 0.0 0.0 3676 1436 pts/1 S+ 01:53 0:00 /home/victor/fdm/bin/.libs/lt-
sphinx3_decode -senmgau .cont. -hmm
/home/victor/fdm/model_parameters/fdm.cd_cont_1000 -lw 23 -feat 1s_c_d_dd
-beam 1e-120 -wbeam 1e-80 -dict /home/victor/fdm/etc/fdm.dic -fdict
/home/victor/fdm/etc/fdm.filler -lm /home/victor/fdm/etc/fdm.ug.lm.DMP -wip
0.2 -ctl /home/victor/fdm/etc/fdm_test.fileids -ctloffset 0 -ctlcount 2
-cepdir /home/victor/fdm/feat -cepext .mfc -hyp
/home/victor/fdm/result/fdm-1-1.match -agc none -varnorm no -cmn current
root 4094 0.0 0.0 1752 480 pts/1 S+ 01:53 0:00 sh -c ps aguxwww | grep
sphinx3_decode
root 4096 0.0 0.0 3056 796 pts/1 R+ 01:53 0:00 grep sphinx3_decode
Mon Mar 1 01:53:28 2010
ERROR: "cmd_ln.c", line 877: Unknown argument: -mdef_fillers --> sphinxbase
/home/victor/sphinxbase/src/libsphinxbase/util/cmd_ln.c:
anytype_t *
cmd_ln_access_r(cmd_ln_t *cmdln, const char *name)
{
    void *val;
    if (hash_table_lookup(cmdln->ht, name, &val) < 0) {
        E_ERROR("Unknown argument: %s\n", name);
        return NULL;
    }
    return (anytype_t *)val;
}
I need help, please.
Did you get any output from the decoder? You can look in the "result"
directory.
Yes. The following files:
fdm-1-1.match
fdm.match
fdm.match4521
Also fdm.align, containing:
Use of uninitialized value $total_words in division (/) at
scripts_pl/decode/word_align.pl line 139, <ref> line 1.
Use of uninitialized value $total_cost in division (/) at
scripts_pl/decode/word_align.pl line 139, <ref> line 1.
Illegal division by zero at scripts_pl/decode/word_align.pl line 139, <ref>
line 1.
I forgot to say that the first three files were empty
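The "Illegal division by zero" in fdm.align is consistent with the empty .match files: with no hypotheses to align, the word total is presumably never set. A minimal Python sketch of guarding that kind of division is shown here; it is not the logic of word_align.pl, and the function name `error_rate` is hypothetical.

```python
def error_rate(total_cost, total_words):
    """Return the word error rate in percent, or None when no words were aligned.

    total_cost  -- summed edit cost over all utterances (may be None or 0)
    total_words -- number of reference words aligned (may be None or 0)
    """
    # An empty hypothesis file leaves the totals unset or zero, so there is
    # nothing to divide by; report "no result" instead of crashing.
    if not total_words:
        return None
    return 100.0 * (total_cost or 0) / total_words


if __name__ == "__main__":
    print(error_rate(None, None))  # nothing was decoded
    print(error_rate(1, 4))        # one error in four words
```

Returning None keeps "nothing was decoded" distinct from a genuine error rate of zero.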
It looks like you compiled a rather old sphinx3, where the mdef argument is
set in the scripts. The new sphinx3 decoding script doesn't have this argument in
scripts_pl/decode/s3decode.pl
I got everything from SVN.
Try
sphinx3_decode 2>&1 | grep mdef_fillers. It should show you:
$ ./sphinx3_decode 2>&1 | grep mdef_filler
-mdef_fillers no Automatically add filler words from the model definition file
If not, it's probably using old sphinx3 libraries from some other location.
victor@victor-kubuntu:~/fdm$ sphinx3_decode 2>&1 | grep mdef_fillers
-mdef_fillers no Automatically add filler words from the model definition file
victor@victor-kubuntu:~/fdm$
You need to try /home/victor/fdm/bin/.libs/lt-sphinx3_decode as in the log
where it failed, not sphinx3_decode.
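The same check can be run against any binary path, so that the installed sphinx3_decode and the lt-sphinx3_decode named in the log can be compared. The following Python sketch mirrors the `2>&1 | grep` pipeline above; the helper names `help_output` and `lists_option` are hypothetical, and it assumes the binary prints its argument list when started without arguments.

```python
import subprocess


def lists_option(help_text, option):
    """Return True if a line of help_text starts with exactly this option name."""
    for line in help_text.splitlines():
        fields = line.split()
        # Compare the whole first field so that "-mdef" does not match "-mdef_fillers".
        if fields and fields[0] == option:
            return True
    return False


def help_output(binary):
    """Run the binary with no arguments and return stdout and stderr merged (like 2>&1)."""
    proc = subprocess.run([binary], stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return proc.stdout


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        print(path, lists_option(help_output(path), "-mdef_fillers"))
```

Pass both binary paths on the command line; if the two results differ, the two binaries were built with different argument definitions.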
I don't understand very well; I speak little English.
Are you referring to this?
victor@victor-kubuntu:~/fdm/bin/.libs$ ./lt-sphinx3_decode
INFO: info.c(65): Host: 'victor-kubuntu'
INFO: info.c(66): Directory: '/home/victor/fdm/bin/.libs'
INFO: info.c(70): ./lt-sphinx3_decode Compiled on: Feb 9 2010, AT: 02:23:44
INFO: cmd_ln.c(463): Looking for default argument file: default.arg
INFO: cmd_ln.c(466): Can't find default argument file default.arg.
INFO: cmd_ln.c(510): Parsing command line:
./lt-sphinx3_decode
ERROR: "cmd_ln.c", line 630: No arguments given, exiting
Arguments list definition:
-adchdr 0 Number of bytes to skip at the beginning of a waveform file (44 for WAV, 1024 for Sphere)
-adcin no Input is waveform data rather than cepstra (-cepdir and -cepext are still used)
-agc none Automatic gain control for c0 ('max', 'emax', 'noise', or 'none')
-agcthresh 2.0 Initial threshold for automatic gain control
-alpha 0.97 Preemphasis parameter
-backtrace yes Whether detailed backtrace information (word segmentation/scores) shown in log
-beam 1.0e-55 Beam selecting active HMMs (relative to best) in each frame
-bestpath no Whether to run bestpath DAG search after forward Viterbi pass
-bestpathlw Language weight for bestpath DAG search (default: same as -lw)
-bestscoredir (Mode 3) Directory for writing best score/frame (used to set beamwidth; one file/utterance)
-bestsenscrdir When Best senone score directory.
-bghist no Bigram-mode: If TRUE only one BP entry/frame; else one per LM state
-bptbldir Directory in which to dump word Viterbi back pointer table (for debugging)
-bptblsize 32768 Number of BPtable entries to allocate initially (grown as necessary)
-cb2mllr .1cls. Senone to MLLR transformation matrix mapping file (or .1cls.)
-cep2spec no Input is cepstral files, output is log spectral files
-cepdir Input cepstrum files directory (prefixed to filespecs in control file)
-cepext .mfc Input cepstrum files extension (prefixed to filespecs in control file)
-ceplen 13 Number of components in the input feature vector
-ci_pbeam 1e-80 CI phone beam for CI-based GMM Selection.
-cmn current Cepstral mean normalization scheme ('current', 'prior', or 'none')
-cmninit 8.0 Initial values (comma-separated) for cepstral mean when 'prior' is used
-cond_ds no Conditional Down-sampling, override normal down sampling. require specify a gaussian selection map
-ctl Control file listing utterances to be processed
-ctlcount 1000000000 No. of utterances to be processed (after skipping -ctloffset entries)
-ctloffset 0 No. of utterances at the beginning of -ctl file to be skipped
-ctl_lm (Not used in mode 2 and 3) Control file that list the corresponding LMs
-ctl_mllr Control file that list the corresponding MLLR matrix for an utterance
-dagfudge 2 (0..2); 1 or 2: add edge if endframe == startframe; 2: if start == end-1
-debug Verbosity level for debugging messages
-dict Main pronunciation dictionary (lexicon) input file
-dist_ds no Distance-based Down-sampling, override normal down sampling.
-dither no Add 1/2-bit noise
-doublebw no Use double bandwidth filters (same center freq)
-ds 1 Ratio of Down-sampling the frame computation.
-epl 3 (Mode 4 only) Entries Per Lextree; #successive entries into one lextree before lextree-entries shifted to the next
-fdict Silence and filler (noise) word pronunciation dictionary input file
-feat 1s_c_d_dd Feature stream type, depends on the acoustic model
-featparams File containing feature extraction parameters.
-fillpen Filler word probabilities input file (used in place of -silpen and -noisepen)
-fillprob 0.1 Default non-silence filler word probability
-frate 100 Frame rate
-fsg (FSG Mode (Mode 2) only) Finite state grammar
-fsgusealtpron yes (FSG Mode (Mode 2) only) Use alternative pronunciations for FSG
-fsgusefiller yes (FSG Mode (Mode 2) only) Insert filler words at each state.
-gs Gaussian Selection Mapping.
-gs4gs yes A flag that specified whether the input GS map will be used for Gaussian Selection. If it is disabled, the map will only provide information to other modules.
-hmm Directory for specifying Sphinx 3's hmm, the following files are assummed to be present, mdef, mean, var, mixw, tmat. If -mdef, -mean, -var, -mixw or -tmat are specified, they will override this command.
-hmmdump no Whether to dump active HMM details to stderr (for debugging)
-hmmdumpef 200000000 (Mode 3 only) Ending frame for dumping all active HMMs (for debugging/diagnosis/analysis)
-hmmdumpsf 200000000 (Mode 3 only) Starting frame for dumping all active HMMs (for debugging/diagnosis/analysis)
-hmmhistbinsize 5000 (Only used in Mode 4 and 5) Performance histogram: #frames vs #HMMs active; #HMMs/bin in this histogram
-hyp Recognition result file, with only words
-hypseg Recognition result file, with word segmentations and scores
-hypsegscore_unscale yes When displaying the results, whether to unscale back the acoustic score with the best score in a frame
-inlatdir Input word-lattice directory with per-utt files for restricting words searched
-inlatwin 50 Input word-lattice words starting within +/- <this argument> of current frame considered during search
-input_endian little Endianness of input data, big or little, ignored if NIST or MS Wav
-kdmaxbbi -1 Maximum number of Gaussians per leaf node in kd-Trees
-kdmaxdepth 0 Maximum depth of kd-Trees to use
-kdtree kd-Tree file for Gaussian selection (for .s2semi models only)
-latcompress yes Whether lattice is compressed.
-latext lat.gz Filename extension for lattice files (gzip compressed, by default - remove .gz for uncompressed)
-lda File containing transformation matrix to be applied to features (single-stream features only)
-ldadim 0 Dimensionality of output of feature transformation (0 to use entire matrix)
-lextreedump 0 Whether to dump the lextree structure to stderr (for debugging), 1 for Ravi's format, 2 for Dot format, Larger than 2 will be treated as Ravi's format
-lifter 0 Length of sin-curve for liftering, or 0 for no liftering.
-lm Word trigram language model input file
-lmctlfn Specify a set of language model
-lmdumpdir The directory for dumping the DMP file.
-lmname Name of language model in -lmctlfn to use for all utterances
-log3table yes Determines whether to use the logs3 table or to compute the values at run time.
-logbase 1.0003 Base in which all log-likelihoods calculated
-logfn Log file (default stdout/stderr)
-logspec no Write out logspectral files instead of cepstra
-lowerf 133.33334 Lower edge of filters
-lts_mismatch no Use CMUDict letter-to-sound rules to generate pronunciations for LM words doesn't appear in the dictionary . Use it with care. It assumes that the phone set in the mdef and dict are the same as the LTS rule.
-lw 9.5 Language weight
-maxcdsenpf 100000 Max no. of distinct CD senone will be computed.
-maxedge 2000000 Max DAG edges allowed in utterance; aborted if exceeded; controls memory usage
-maxhistpf 100 (Only used in Mode 4 and 5) Max no. of histories to maintain at each frame
-maxhmmpf 20000 (Only used in Mode 4 and 5) Max no. of active HMMs to maintain at each frame; approx.
-maxlmop 100000000 Max LMops in utterance after which it is aborted; controls CPU use (see maxlpf)
-maxlpf 40000 Max LMops/frame after which utterance aborted; controls CPU use (see maxlmop)
-maxppath 1000000 Max partial paths created after which utterance aborted; controls CPU/memory use
-maxwpf 20 (Only used in Mode 4 and 5) Max no. of distinct word exits to maintain at each frame
-mdef Model definition input file
-mean Mixture gaussian means input file
-min_endfr 3 Nodes ignored during search if they persist for fewer than so many end frames
-mixw Senone mixture weights input file
-mixwfloor 0.0000001 Senone mixture weights floor (applied to data from -mixw file)
-mllr MLLR transfomation matrix to be applied to mixture gaussian means
-mode fwdtree Decoding mode, one of allphone, fsg, fwdflat, fwdtree.
-nbest 200 Max. n-best hypotheses to generate per utterance
-nbestdir Input word-lattice directory with per-utt files for restricting words searched
-nbestext nbest.gz N-best filename extension (.gz or .Z extension for compression)
-ncep 13 Number of cep coefficients
-nfft 512 Size of FFT
-nfilt 40 Number of filter banks
-Nlextree 3 (Mode 4 only) No. of lextrees to be instantiated; entries into them staggered in time
-Nstalextree 25 (Mode 5 only) No. of lextrees to be instantiated statically;
-op_mode -1 Operation mode, for internal use only.
-outlatdir Directory in which to dump word lattices
-outlatfmt s3 Format in which to dump word lattices (either 's3' or 'htk')
-pbeam 1.0e-50 Beam selecting HMMs transitioning to successors in each frame
-pheurtype 0 0 = bypass, 1= sum of max, 2 = sum of avg, 3 = sum of 1st senones only
-phonepen 1.0 (Mode 2 and 3 only) Word insertion penalty
-phsegdir (Allphone mode only) Output directory for phone segmentation files
-pl_beam 1.0e-80 Beam for phoneme look-ahead.
-pl_window 1 Window size (actually window size-1) of phoneme look-ahead.
-ppathdebug no Generate debugging information for N-best search.
-ptranskip 0 (Not used in Mode 3) Use wbeam for phone transitions every so many frames (if >= 1)
-remove_dc no Remove DC offset from each frame
-round_filters yes Round mel filter frequencies to DFT points
-samprate 16000 Sampling rate
-seed -1 Seed for random number generator; if less than zero, pick our own
-sendump (S2 GMM computation only) Senone dump (compressed mixture weights) input file
-senmgau .cont. Senone to mixture-gaussian mapping file (or .semi. or .cont.)
-silprob 0.1 Default silence word probability
-smoothspec no Write out cepstral-smoothed logspectral files
-spec2cep no Input is log spectral files, output is cepstral files
-subvq Sub-vector quantized form of acoustic model
-subvqbeam 3.0e-3 Beam selecting best components within each mixture Gaussian
-svq4svq no A flag that specified whether the input SVQ will be used as approximate scores of the Gaussians
-svspec Subvector specification (e.g., 24,0-11/25,12-23/26-38 or 0-12/13-25/26-38)
-tighten_factor 0.5 From 0 to 1, it tightens the beam width when the frame is dropped
-tmat HMM state transition matrix input file
-tmatfloor 0.0001 HMM state transition probability floor (applied to -tmat file)
-topn 4 (S3.0 GMM Computation only) No. of top scoring densities computed in each mixture gaussian codebook (semi-continuous models only)
-topn_beam 0 (S2 GMM Computation only) Beam width used to determine top-N Gaussians (or a list, per-feature)
-tracewhmm (Mode 3 only) Word whose active HMMs are to be traced (for debugging/diagnosis/analysis)
-transform legacy Which type of transform to use to calculate cepstra (legacy, dct, or htk)
-treeugprob yes If true, Use unigram probs in lextree
-unit_area yes Normalize mel filters to unit area
-upperf 6855.4976 Upper edge of filters
-utt Utterance file to be processed (-ctlcount argument times)
-uw 0.7 Unigram weight
-var Mixture gaussian variances input file
-varfloor 0.0001 Mixture gaussian variance floor (applied to data from -var file)
-varnorm no Variance normalize each utterance (only if CMN == current)
-verbose no Show input filenames
-vqeval 3 Number of subvectors to use for SubVQ-based frame evaluation (3 for all)
-warp_params Parameters defining the warping function
-warp_type inverse_linear Warping function type (or shape)
-wbeam 1.0e-35 Beam selecting word-final HMMs exiting in each frame
-wend_beam 1.0e-80 Beam selecting word-final HMMs exiting in each frame
-wip 0.7 Word insertion penalty
-wlen 0.025625 Hamming window length
-worddumpef 200000000 (Mode 3 only) Ending frame for dumping all active words (for debugging/diagnosis/analysis)
-worddumpsf 200000000 (Mode 3 only) Starting frame for dumping all active words (for debugging/diagnosis/analysis)
ERROR: "cmd_ln.c", line 648: cmd_ln_parse_r failed
ERROR: "cmd_ln.c", line 697: cmd_ln_parse failed, forced exit
victor@victor-kubuntu:~/fdm/bin/.libs$
Hm, that might be a sphinx3 bug too. It needs more investigation. Overnight
support is usually too nervous. Sorry.
It doesn't matter. I will keep investigating.
Thanks anyway.
To help us find the issue, please submit the sphinx3 configuration log
(config.log) and the sphinx3 build log, which you can create by redirecting the
build output to a file.
Hi. I think I solved the problem.
Initially I put my database samples in the folder wav/fdm_clstk (with
etc/fdm_train.fileids and etc/fdm_train.transcription). I left the folder
wav/fdmtest_clstk and its corresponding files empty, because I assumed that
only the training data was needed. Or am I right?
I have now put data in them as well, and I managed to decode.
I don't understand your question, but it's good the issue is resolved. One
less problem to care about.
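Since the cause turned out to be missing test data, a quick check before decoding is to confirm that every entry in the control file has a non-empty feature file. The Python sketch below assumes the layout visible in the decoder command line earlier in the thread (-ctl etc/fdm_test.fileids, -cepdir feat, -cepext .mfc); the function name `missing_features` is hypothetical.

```python
import os


def missing_features(fileids_path, cepdir, cepext=".mfc"):
    """Return the control-file entries that have no non-empty feature file."""
    missing = []
    with open(fileids_path) as ctl:
        for line in ctl:
            utt = line.strip()
            if not utt:
                continue  # skip blank lines in the control file
            path = os.path.join(cepdir, utt + cepext)
            # A file that is absent or zero bytes long cannot be decoded.
            if not os.path.isfile(path) or os.path.getsize(path) == 0:
                missing.append(utt)
    return missing


if __name__ == "__main__":
    # Paths taken from the decoder command line quoted in this thread.
    for utt in missing_features("etc/fdm_test.fileids", "feat"):
        print("missing or empty:", utt)
```

An empty result means every listed test utterance has feature data. An empty control file also yields an empty result, so it is worth checking separately that the file lists any utterances at all.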