Hello,
I'm currently using sphinx3_livedecode on Linux (Fedora 8). I trained the model with SphinxTrain on 16 kHz, 16-bit mono PCM audio. When I try to use the etc, model_parameters, and model_architecture files for decoding, I get the following output:
INFO: kbcore.c(451): Begin Initialization of Core Models:
INFO: logs3.c(149): Initializing logbase: 1.000300e+00 (add table: 1)
INFO: Initialization of the log add table
INFO: Log-Add table size = 29350
INFO:
INFO: feat.c(665): Initializing feature stream to type: '1s_c_d_dd',
CMN='current', VARNORM='no', AGC='none'
INFO: kbcore.c(479): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Cepstral size Used = 13
INFO: Number of stream = 1
INFO: Vector size of stream: 39
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File:
/usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
INFO: Mean File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means
INFO: Variance File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances
INFO: Mixture Weight File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights
INFO: Transition Matrices File: /usr/local/sphinx3/telugu/model_parameters/tel
ugu.cd_cont_1000/transition_matrices
INFO: mdef.c(637): Reading model definition:
/usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
INFO: Initialization of mdef_t, report:
INFO: 20 CI-phone, 265 CD-phone, 3 emitstate/phone, 60 CI-sen, 252 Sen, 91
Sen-Seq
INFO:
INFO: kbcore.c(324): Using optimized GMM computation for Continuous HMM, -topn
will be ignored
INFO: cont_mgau.c(157): Reading mixture gaussian file
'/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means'
INFO: cont_mgau.c(323): 252 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(157): Reading mixture gaussian file
'/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances'
INFO: cont_mgau.c(323): 252 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(409): Reading mixture weights file '/usr/local/sphinx3/telug
u/model_parameters/telugu.cd_cont_1000/mixture_weights'
INFO: cont_mgau.c(539): Read 252 x 8 mixture weights
INFO: cont_mgau.c(565): Removing uninitialized Gaussian densities
INFO: cont_mgau.c(633): Applying variance floor
INFO: cont_mgau.c(651): 0 variance values floored
INFO: cont_mgau.c(698): Precomputing Mahalanobis distance invariants
INFO: tmat.c(162): Reading HMM transition probability matrices: /usr/local/sph
inx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices
INFO: Initialization of tmat_t, report:
INFO: Read 20 transition matrices of size 3x4
INFO:
INFO: dict.c(430): Reading main dictionary:
/usr/local/sphinx3/telugu/etc/telugu.dic
INFO: dict.c(433): 35 words read
INFO: dict.c(438): Reading filler dictionary:
/usr/local/sphinx3/telugu/etc/telugu.filler
INFO: dict.c(441): 3 words read
INFO: Initialization of dict_t, report:
INFO: No of word: 38
INFO:
INFO: lm.c(393): LM read('/usr/local/sphinx3/telugu/etc/5587.lm.DMP', lw=
9.50, wip= 0.70, uw= 0.70)
INFO: lm.c(394): Reading LM file /usr/local/sphinx3/telugu/etc/5587.lm.DMP (LM
name "default")
INFO: lm_3g_dmp.c(359): 56 ug
INFO: lm_3g_dmp.c(403): 113 bigrams
INFO: lm_3g_dmp.c(410): 126 trigrams
INFO: lm_3g_dmp.c(435): 23 bigram prob entries
INFO: lm_3g_dmp.c(453): 20 trigram bowt entries
INFO: lm_3g_dmp.c(469): 10 trigram prob entries
INFO: lm_3g_dmp.c(484): 1 trigram segtable entries (512 segsize)
INFO: lm_3g_dmp.c(518): 56 word strings
ERROR: "wid.c", line 234: A is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: AA is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: D is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: DH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: EE is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: EH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: G is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: IH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: K is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: L is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: M is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: N is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: OH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: R is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: S is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: T is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: TH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: UH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: V is not a word in dictionary and it is not a class
tag.
INFO: wid.c(242): 19 LM words not in dictionary; ignored
INFO: Initialization of fillpen_t, report:
INFO: Language weight =9.500000
INFO: Word Insertion Penalty =0.700000
INFO: Silence probability =0.100000
INFO: Filler probability =0.100000
INFO:
INFO: dict2pid.c(525): Building PID tables for dictionary
INFO: Initialization of dict2pid_t, report:
INFO: Dict2pid is in composite triphone mode
INFO: 69 composite states; 23 composite sseq
INFO:
INFO: kbcore.c(626): Inside kbcore: Verifying models consistency ......
INFO: kbcore.c(646): End of Initialization of Core Models:
INFO: Initialization of beam_t, report:
INFO: Parameters used in Beam Pruning of Viterbi Search:
INFO: Beam=-422133
INFO: PBeam=-383758
INFO: WBeam=-268630 (Skip=0)
INFO: WEndBeam=-614012
INFO: No of CI Phone assumed=20
INFO:
INFO: Initialization of fast_gmm_t, report:
INFO: Parameters used in Fast GMM computation:
INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
Distance-based Down Sampling? 0
INFO: GMM-level: CI phone beam -614012. MAX CD 100000
INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
would be used as Gaussian Score? =0 SubVQ Beam -19363
INFO:
INFO: Initialization of pl_t, report:
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme look-ahead type = 0
INFO: Phoneme look-ahead beam size = 65945
INFO: No of CI Phones assumed=20
INFO:
INFO: Initialization of ascr_t, report:
INFO: No. of CI senone =60
INFO: No. of senone = 252
INFO: No. of composite senone = 69
INFO: No. of senone sequence = 91
INFO: No. of composite senone sequence=23
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme lookahead window = 1
INFO:
INFO: vithist.c(163): Initializing Viterbi-history module
INFO: Initialization of vithist_t, report:
INFO: Word beam = -268630
INFO: Bigram Mode =0
INFO: Rescore Mode =1
INFO: Trace sil Mode =1
INFO:
INFO: srch.c(410): Search Initialization.
WARNING: "srch_time_switch_tree.c", line 165: -Nstalextree is omitted in TST
search.
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 35.
INFO: lextree.c(239): Size of word table after adding alternative prons: 35.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 117
INFO: Number of links in the tree 127
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (0) for lm 0, its name is
default, it has 117 nodes(ug)
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 35.
INFO: lextree.c(239): Size of word table after adding alternative prons: 35.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 117
INFO: Number of links in the tree 127
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (1) for lm 0, its name is
default, it has 117 nodes(ug)
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 35.
INFO: lextree.c(239): Size of word table after adding alternative prons: 35.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 117
INFO: Number of links in the tree 127
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (2) for lm 0, its name is
default, it has 117 nodes(ug)
INFO: srch_time_switch_tree.c(221): Time for building trees, 0.0010 CPU 0.0003
Clk
INFO: srch_time_switch_tree.c(242): Lextrees(0), 1 nodes(filler)
INFO: srch_time_switch_tree.c(242): Lextrees(1), 1 nodes(filler)
INFO: srch_time_switch_tree.c(242): Lextrees(2), 1 nodes(filler)
INFO: Initialization of srch_t, report:
INFO: Operation Mode = 4, Operation Name = OP_TST_DECODE
INFO:
INFO: live_decode_API.c(233): Partial hypothesis WILL be dumped
INFO: live_decode_API.c(241): Input data will NOT be byte swapped
press ENTER to start recording
FWDVIT: (* 109 312Z104037)
FWDXCT: * 109 312Z104037 S 8189562 T -277618 A -277896 L 278 0 7880374 -8101
<sil> 242 0 -13059 242 </sil>
INFO: stat.c(146): 242 frm; 48 cdsen/fr, 60 cisen/fr, 384 cdgau/fr, 480
cigau/fr, Sen 0.01, CPU 0.01 Clk ; 45 hmm/fr, 1 wd/fr, Search: 0.00 CPU 0.00
Clk ( 109 312Z104037)
INFO: fast_algo_struct.c(371): HMMHist( 109 312Z104037): 242(100)
INFO: lm.c(590): 0 tg(), 0 tgcache, 0 bo; 0 fills, 0 in mem (0.0%)
INFO: lm.c(593): 2 bg(), 2 bo; 1 fills, 35 in mem (30.7%)
Hypothesis:
Here is my telugu.cfg, as referenced on the command line:
-mdef /usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
-mean /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means
-var /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances
-mixw /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights
-tmat /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices
-dict /usr/local/sphinx3/telugu/etc/telugu.dic
-fdict /usr/local/sphinx3/telugu/etc/telugu.filler
-lm /usr/local/sphinx3/telugu/etc/5587.lm.DMP
-hyp /usr/local/sphinx3/telugu/result123.txt
It would be a great help if someone could help me resolve this error.
Thanks in advance.
ERROR: "wid.c", line 234: A is not a word in dictionary and it is not a class tag.
[... the same error repeats for AA, D, DH, EE, EH, G, IH, K, L, M, N, OH, R, S, T, TH, UH, and V ...]
It's pretty clear from these messages that words like "UH" are missing from the dictionary. The presence of such words in the language model means you used the wrong file for language model training: you were supposed to upload a sample prompts file to the lmtool server, but it looks like you uploaded the dictionary instead.
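One way to see this kind of mismatch directly is to compare the LM's unigrams against the decoding dictionary. The sketch below is not part of Sphinx; it assumes you still have the ARPA-format LM that the .DMP file was converted from, and the file names are only illustrative.

```python
# Sketch: list language-model unigrams that are missing from the
# decoding dictionary. Assumes an ARPA-format LM file is available;
# all file names below are illustrative.

def dict_words(path):
    """Collect head words from a Sphinx dictionary file."""
    words = set()
    with open(path) as f:
        for line in f:
            parts = line.split()
            if parts:
                # strip alternate-pronunciation suffixes like WORD(2)
                words.add(parts[0].split("(")[0])
    return words

def lm_unigrams(path):
    """Collect the words listed in the \\1-grams: section of an ARPA LM."""
    grams, in_1g = set(), False
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line == "\\1-grams:":
                in_1g = True
                continue
            if in_1g and line.startswith("\\"):
                break  # next n-gram section or \end\
            if in_1g and line:
                parts = line.split()
                if len(parts) >= 2:
                    grams.add(parts[1])
    return grams

# Usage (with your own file names):
#   missing = lm_unigrams("5587.lm") - dict_words("telugu.dic") - {"<s>", "</s>"}
#   print(sorted(missing))
```

If the LM was built from the wrong input (e.g. the dictionary instead of a prompts file), the missing set will contain phone symbols like A, AA, UH, exactly as in the ERROR lines above.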
Thank you very much for your reply. I was able to resolve that problem, but now when I start and finish recording, nothing is recognized; the hypothesis is empty, as follows:
INFO: kbcore.c(451): Begin Initialization of Core Models:
INFO: logs3.c(149): Initializing logbase: 1.000300e+00 (add table: 1)
INFO: Initialization of the log add table
INFO: Log-Add table size = 29350
INFO:
INFO: feat.c(665): Initializing feature stream to type: '1s_c_d_dd',
CMN='current', VARNORM='no', AGC='none'
INFO: kbcore.c(479): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Cepstral size Used = 13
INFO: Number of stream = 1
INFO: Vector size of stream: 39
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File:
/usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
INFO: Mean File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means
INFO: Variance File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances
INFO: Mixture Weight File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights
INFO: Transition Matrices File: /usr/local/sphinx3/telugu/model_parameters/tel
ugu.cd_cont_1000/transition_matrices
INFO: mdef.c(637): Reading model definition:
/usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
INFO: Initialization of mdef_t, report:
INFO: 20 CI-phone, 265 CD-phone, 3 emitstate/phone, 60 CI-sen, 252 Sen, 91
Sen-Seq
INFO:
INFO: kbcore.c(324): Using optimized GMM computation for Continuous HMM, -topn
will be ignored
INFO: cont_mgau.c(157): Reading mixture gaussian file
'/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means'
INFO: cont_mgau.c(323): 252 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(157): Reading mixture gaussian file
'/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances'
INFO: cont_mgau.c(323): 252 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(409): Reading mixture weights file '/usr/local/sphinx3/telug
u/model_parameters/telugu.cd_cont_1000/mixture_weights'
INFO: cont_mgau.c(539): Read 252 x 8 mixture weights
INFO: cont_mgau.c(565): Removing uninitialized Gaussian densities
INFO: cont_mgau.c(633): Applying variance floor
INFO: cont_mgau.c(651): 0 variance values floored
INFO: cont_mgau.c(698): Precomputing Mahalanobis distance invariants
INFO: tmat.c(162): Reading HMM transition probability matrices: /usr/local/sph
inx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices
INFO: Initialization of tmat_t, report:
INFO: Read 20 transition matrices of size 3x4
INFO:
INFO: dict.c(430): Reading main dictionary:
/usr/local/sphinx3/telugu/etc/telugu.dic
INFO: dict.c(433): 35 words read
INFO: dict.c(438): Reading filler dictionary:
/usr/local/sphinx3/telugu/etc/telugu.filler
INFO: dict.c(441): 3 words read
INFO: Initialization of dict_t, report:
INFO: No of word: 38
INFO:
INFO: lm.c(393): LM read('/usr/local/sphinx3/telugu/etc/1890.lm.DMP', lw=
9.50, wip= 0.70, uw= 0.70)
INFO: lm.c(394): Reading LM file /usr/local/sphinx3/telugu/etc/1890.lm.DMP (LM
name "default")
INFO: lm_3g_dmp.c(359): 38 ug
INFO: lm_3g_dmp.c(403): 72 bigrams
INFO: lm_3g_dmp.c(410): 36 trigrams
INFO: lm_3g_dmp.c(435): 3 bigram prob entries
INFO: lm_3g_dmp.c(453): 3 trigram bowt entries
INFO: lm_3g_dmp.c(469): 2 trigram prob entries
INFO: lm_3g_dmp.c(484): 1 trigram segtable entries (512 segsize)
INFO: lm_3g_dmp.c(518): 38 word strings
INFO: Initialization of fillpen_t, report:
INFO: Language weight =9.500000
INFO: Word Insertion Penalty =0.700000
INFO: Silence probability =0.100000
INFO: Filler probability =0.100000
INFO:
INFO: dict2pid.c(525): Building PID tables for dictionary
INFO: Initialization of dict2pid_t, report:
INFO: Dict2pid is in composite triphone mode
INFO: 69 composite states; 23 composite sseq
INFO:
INFO: kbcore.c(626): Inside kbcore: Verifying models consistency ......
INFO: kbcore.c(646): End of Initialization of Core Models:
INFO: Initialization of beam_t, report:
INFO: Parameters used in Beam Pruning of Viterbi Search:
INFO: Beam=-422133
INFO: PBeam=-383758
INFO: WBeam=-268630 (Skip=0)
INFO: WEndBeam=-614012
INFO: No of CI Phone assumed=20
INFO:
INFO: Initialization of fast_gmm_t, report:
INFO: Parameters used in Fast GMM computation:
INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
Distance-based Down Sampling? 0
INFO: GMM-level: CI phone beam -614012. MAX CD 100000
INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
would be used as Gaussian Score? =0 SubVQ Beam -19363
INFO:
INFO: Initialization of pl_t, report:
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme look-ahead type = 0
INFO: Phoneme look-ahead beam size = 65945
INFO: No of CI Phones assumed=20
INFO:
INFO: Initialization of ascr_t, report:
INFO: No. of CI senone =60
INFO: No. of senone = 252
INFO: No. of composite senone = 69
INFO: No. of senone sequence = 91
INFO: No. of composite senone sequence=23
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme lookahead window = 1
INFO:
INFO: vithist.c(163): Initializing Viterbi-history module
INFO: Initialization of vithist_t, report:
INFO: Word beam = -268630
INFO: Bigram Mode =0
INFO: Rescore Mode =1
INFO: Trace sil Mode =1
INFO:
INFO: srch.c(410): Search Initialization.
WARNING: "srch_time_switch_tree.c", line 165: -Nstalextree is omitted in TST
search.
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 36.
INFO: lextree.c(239): Size of word table after adding alternative prons: 36.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 118
INFO: Number of links in the tree 128
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (0) for lm 0, its name is
default, it has 118 nodes(ug)
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 36.
INFO: lextree.c(239): Size of word table after adding alternative prons: 36.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 118
INFO: Number of links in the tree 128
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (1) for lm 0, its name is
default, it has 118 nodes(ug)
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 36.
INFO: lextree.c(239): Size of word table after adding alternative prons: 36.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 118
INFO: Number of links in the tree 128
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (2) for lm 0, its name is
default, it has 118 nodes(ug)
INFO: srch_time_switch_tree.c(221): Time for building trees, 0.0000 CPU 0.0002
Clk
INFO: srch_time_switch_tree.c(242): Lextrees(0), 1 nodes(filler)
INFO: srch_time_switch_tree.c(242): Lextrees(1), 1 nodes(filler)
INFO: srch_time_switch_tree.c(242): Lextrees(2), 1 nodes(filler)
INFO: Initialization of srch_t, report:
INFO: Operation Mode = 4, Operation Name = OP_TST_DECODE
INFO:
INFO: live_decode_API.c(233): Partial hypothesis WILL be dumped
INFO: live_decode_API.c(241): Input data will NOT be byte swapped
press ENTER to start recording
FWDVIT: (* 109 312Z163227)
FWDXCT: * 109 312Z163227 S 3996726 T -318532 A -318810 L 278 0 3638801 -8101
<sil> 140 0 -9117 140 </sil>
INFO: stat.c(146): 140 frm; 49 cdsen/fr, 60 cisen/fr, 395 cdgau/fr, 480
cigau/fr, Sen 0.01, CPU 0.02 Clk ; 50 hmm/fr, 1 wd/fr, Search: 0.00 CPU 0.00
Clk ( 109 312Z163227)
INFO: fast_algo_struct.c(371): HMMHist( 109 312Z163227): 140(100)
INFO: lm.c(590): 0 tg(), 0 tgcache, 0 bo; 0 fills, 0 in mem (0.0%)
INFO: lm.c(593): 2 bg(), 2 bo; 1 fills, 36 in mem (49.3%)
Hypothesis:
The result (hypothesis) stored at the destination given in the config file (result123.txt) looks like

(* 109 312Z162216)

which varies for each utterance. Could someone help me resolve this issue? What does the content of result123.txt mean, and what must be done so that the uttered word is recognized and stored in result123.txt?
I'm using a 100-speaker model with 35 utterances at word level.
Please help me out.
Thanks in advance,
divya
It looks like you trained your model incorrectly; most probably the feature extraction setup is wrong. You also skipped a critical part of model training: testing with sphinx3_decode, as described in the tutorial. To get this working, you either need to double-check everything yourself or share the data you are training with.
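One common cause of an empty hypothesis is an audio-format mismatch between the recordings and the model (here, 16 kHz, 16-bit, mono PCM). A minimal sketch to rule that out; the helper name and file names are mine, not from Sphinx:

```python
# Sketch: verify that a recording matches the format the acoustic model
# was trained on (16 kHz, 16-bit, mono PCM). File names are illustrative.
import wave

def matches_training_format(path, rate=16000, sampwidth=2, channels=1):
    """Return True if the WAV file has the expected rate, sample width,
    and channel count."""
    with wave.open(path, "rb") as w:
        return (w.getframerate() == rate and
                w.getsampwidth() == sampwidth and
                w.getnchannels() == channels)

# Usage (with your own recording):
#   print(matches_training_format("utterance.wav"))
```

If the recordings check out, the next step is the offline test the tutorial describes: run sphinx3_decode over a control file of extracted cepstra (the -ctl, -cepdir, and -hyp options) and compare the hypotheses against the training transcripts before trying live decoding again.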
Hello,
I'm currently using sphinx3_livedecode on Linux (Fedora 8). I trained the model with SphinxTrain on 16 kHz, 16-bit mono PCM audio. When I try to use the etc, model_parameters, and model_architecture files for decoding as

sphinx3_livedecode telugu.cfg

I get the following output:
sphinx3_livedecode telugu.cfg
INFO: cmd_ln.c(399): Parsing command line:
\
-mdef /usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef \
-mean /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means \
-var /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances \
-mixw /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights \
-tmat /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices \
-dict /usr/local/sphinx3/telugu/etc/telugu.dic \
-fdict /usr/local/sphinx3/telugu/etc/telugu.filler \
-lm /usr/local/sphinx3/telugu/etc/5587.lm.DMP \
-hyp /usr/local/sphinx3/telugu/result123.txt
Current configuration:
-agc none none
-alpha 0.97 9.700000e-01
-backtrace 1 1
-beam 1.0e-55 1.000000e-55
-bestpath 0 0
-bestpathlw
-bestscoredir
-bestsenscrdir
-bghist 0 0
-blocksize 200000 200000
-bptbldir
-bptblsize 32768 32768
-bt_wsil 1 1
-cb2mllr .1cls. .1cls.
-ci_pbeam 1e-80 1.000000e-80
-cmn current current
-composite 1 1
-cond_ds 0 0
-ctl
-ctlcount 1000000000 1000000000
-ctloffset 0 0
-ctl_lm
-ctl_mllr
-dagfudge 2 2
-dict /usr/local/sphinx3/telugu/etc/telugu.dic
-dist_ds 0 0
-dither no no
-doublebw 0 0
-ds 1 1
-epl 3 3
-fbtype mel_scale mel_scale
-fdict /usr/local/sphinx3/telugu/etc/telugu.filler
-feat 1s_c_d_dd 1s_c_d_dd
-fillpen
-fillprob 0.1 1.000000e-01
-frate 100 100
-fsg
-fsgusealtpron 1 1
-fsgusefiller 1 1
-gs
-gs4gs 1 1
-hmm
-hmmdump 0 0
-hmmdumpef 200000000 200000000
-hmmdumpsf 200000000 200000000
-hmmhistbinsize 5000 5000
-hyp /usr/local/sphinx3/telugu/result123.txt
-hypseg
-hypsegfmt 0 0
-hypsegscore_unscale 1 1
-inlatdir
-inlatwin 50 50
-input_endian little little
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latcompress 1 1
-latext lat.gz lat.gz
-lextreedump 0 0
-lm /usr/local/sphinx3/telugu/etc/5587.lm.DMP
-lmctlfn
-lmdumpdir
-lminmemory 0 0
-lmname
-lmrescore 1 1
-log3table 1 1
-logbase 1.0003 1.000300e+00
-lowerf 133.33334 1.333333e+02
-lts_mismatch 0 0
-lw 9.5 9.500000e+00
-machine_endian little little
-maxcdsenpf 100000 100000
-maxcepvecs 256 256
-maxedge 2000000 2000000
-maxhistpf 100 100
-maxhmmpf 20000 20000
-maxhyplen 1000 1000
-maxlmop 100000000 100000000
-maxlpf 40000 40000
-maxwpf 20 20
-mdef /usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
-mean /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means
-min_endfr 3 3
-mixw /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights
-mixwfloor 0.0000001 1.000000e-07
-mllr
-multiplex_multi 1 1
-multiplex_single 1 1
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-Nlextree 3 3
-Nstalextree 25 25
-op_mode 4 4
-outlatdir
-outlatfmt 0 0
-outlatoldfmt 1 1
-pbeam 1.0e-50 1.000000e-50
-pheurtype 0 0
-phonepen 1.0 1.000000e+00
-phypdump 1 1
-pl_beam 1.0e-80 1.000000e-80
-pl_window 1 1
-ptranskip 0 0
-rawext .raw .raw
-samprate 16000.0 1.600000e+04
-seed -1 -1
-senmgau .cont. .cont.
-silprob 0.1 1.000000e-01
-subvq
-subvqbeam 3.0e-3 3.000000e-03
-svq4svq 0 0
-tighten_factor 0.5 5.000000e-01
-tmat /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-tracewhmm
-treeugprob 1 1
-upperf 6855.4976 6.855498e+03
-uw 0.7 7.000000e-01
-var /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances
-varfloor 0.0001 1.000000e-04
-varnorm no no
-vqeval 3 3
-warp_params 1.0 1.0
-warp_type inverse_linear inverse_linear
-wbeam 1.0e-35 1.000000e-35
-wend_beam 1.0e-80 1.000000e-80
-wip 0.7 7.000000e-01
-wlen 0.025625 2.562500e-02
-worddumpef 200000000 200000000
-worddumpsf 200000000 200000000
INFO: kbcore.c(451): Begin Initialization of Core Models:
INFO: logs3.c(149): Initializing logbase: 1.000300e+00 (add table: 1)
INFO: Initialization of the log add table
INFO: Log-Add table size = 29350
INFO:
INFO: feat.c(665): Initializing feature stream to type: '1s_c_d_dd',
CMN='current', VARNORM='no', AGC='none'
INFO: kbcore.c(479): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Cepstral size Used = 13
INFO: Number of stream = 1
INFO: Vector size of stream: 39
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File:
/usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
INFO: Mean File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means
INFO: Variance File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances
INFO: Mixture Weight File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights
INFO: Transition Matrices File: /usr/local/sphinx3/telugu/model_parameters/tel
ugu.cd_cont_1000/transition_matrices
INFO: mdef.c(637): Reading model definition:
/usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
INFO: Initialization of mdef_t, report:
INFO: 20 CI-phone, 265 CD-phone, 3 emitstate/phone, 60 CI-sen, 252 Sen, 91
Sen-Seq
INFO:
INFO: kbcore.c(324): Using optimized GMM computation for Continuous HMM, -topn
will be ignored
INFO: cont_mgau.c(157): Reading mixture gaussian file
'/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means'
INFO: cont_mgau.c(323): 252 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(157): Reading mixture gaussian file
'/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances'
INFO: cont_mgau.c(323): 252 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(409): Reading mixture weights file '/usr/local/sphinx3/telug
u/model_parameters/telugu.cd_cont_1000/mixture_weights'
INFO: cont_mgau.c(539): Read 252 x 8 mixture weights
INFO: cont_mgau.c(565): Removing uninitialized Gaussian densities
INFO: cont_mgau.c(633): Applying variance floor
INFO: cont_mgau.c(651): 0 variance values floored
INFO: cont_mgau.c(698): Precomputing Mahalanobis distance invariants
INFO: tmat.c(162): Reading HMM transition probability matrices: /usr/local/sph
inx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices
INFO: Initialization of tmat_t, report:
INFO: Read 20 transition matrices of size 3x4
INFO:
INFO: dict.c(430): Reading main dictionary:
/usr/local/sphinx3/telugu/etc/telugu.dic
INFO: dict.c(433): 35 words read
INFO: dict.c(438): Reading filler dictionary:
/usr/local/sphinx3/telugu/etc/telugu.filler
INFO: dict.c(441): 3 words read
INFO: Initialization of dict_t, report:
INFO: No of word: 38
INFO:
INFO: lm.c(393): LM read('/usr/local/sphinx3/telugu/etc/5587.lm.DMP', lw=
9.50, wip= 0.70, uw= 0.70)
INFO: lm.c(394): Reading LM file /usr/local/sphinx3/telugu/etc/5587.lm.DMP (LM
name "default")
INFO: lm_3g_dmp.c(359): 56 ug
INFO: lm_3g_dmp.c(403): 113 bigrams
INFO: lm_3g_dmp.c(410): 126 trigrams
INFO: lm_3g_dmp.c(435): 23 bigram prob entries
INFO: lm_3g_dmp.c(453): 20 trigram bowt entries
INFO: lm_3g_dmp.c(469): 10 trigram prob entries
INFO: lm_3g_dmp.c(484): 1 trigram segtable entries (512 segsize)
INFO: lm_3g_dmp.c(518): 56 word strings
ERROR: "wid.c", line 234: A is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: AA is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: D is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: DH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: EE is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: EH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: G is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: IH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: K is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: L is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: M is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: N is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: OH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: R is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: S is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: T is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: TH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: UH is not a word in dictionary and it is not a class
tag.
ERROR: "wid.c", line 234: V is not a word in dictionary and it is not a class
tag.
INFO: wid.c(242): 19 LM words not in dictionary; ignored
INFO: Initialization of fillpen_t, report:
INFO: Language weight =9.500000
INFO: Word Insertion Penalty =0.700000
INFO: Silence probability =0.100000
INFO: Filler probability =0.100000
INFO:
INFO: dict2pid.c(525): Building PID tables for dictionary
INFO: Initialization of dict2pid_t, report:
INFO: Dict2pid is in composite triphone mode
INFO: 69 composite states; 23 composite sseq
INFO:
INFO: kbcore.c(626): Inside kbcore: Verifying models consistency ......
INFO: kbcore.c(646): End of Initialization of Core Models:
INFO: Initialization of beam_t, report:
INFO: Parameters used in Beam Pruning of Viterbi Search:
INFO: Beam=-422133
INFO: PBeam=-383758
INFO: WBeam=-268630 (Skip=0)
INFO: WEndBeam=-614012
INFO: No of CI Phone assumed=20
INFO:
INFO: Initialization of fast_gmm_t, report:
INFO: Parameters used in Fast GMM computation:
INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
Distance-based Down Sampling? 0
INFO: GMM-level: CI phone beam -614012. MAX CD 100000
INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
would be used as Gaussian Score? =0 SubVQ Beam -19363
INFO:
INFO: Initialization of pl_t, report:
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme look-ahead type = 0
INFO: Phoneme look-ahead beam size = 65945
INFO: No of CI Phones assumed=20
INFO:
INFO: Initialization of ascr_t, report:
INFO: No. of CI senone =60
INFO: No. of senone = 252
INFO: No. of composite senone = 69
INFO: No. of senone sequence = 91
INFO: No. of composite senone sequence=23
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme lookahead window = 1
INFO:
INFO: vithist.c(163): Initializing Viterbi-history module
INFO: Initialization of vithist_t, report:
INFO: Word beam = -268630
INFO: Bigram Mode =0
INFO: Rescore Mode =1
INFO: Trace sil Mode =1
INFO:
INFO: srch.c(410): Search Initialization.
WARNING: "srch_time_switch_tree.c", line 165: -Nstalextree is omitted in TST
search.
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 35.
INFO: lextree.c(239): Size of word table after adding alternative prons: 35.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 117
INFO: Number of links in the tree 127
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (0) for lm 0, its name is
default, it has 117 nodes(ug)
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 35.
INFO: lextree.c(239): Size of word table after adding alternative prons: 35.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 117
INFO: Number of links in the tree 127
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (1) for lm 0, its name is
default, it has 117 nodes(ug)
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 35.
INFO: lextree.c(239): Size of word table after adding alternative prons: 35.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 117
INFO: Number of links in the tree 127
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (2) for lm 0, its name is
default, it has 117 nodes(ug)
INFO: srch_time_switch_tree.c(221): Time for building trees, 0.0010 CPU 0.0003
Clk
INFO: srch_time_switch_tree.c(242): Lextrees(0), 1 nodes(filler)
INFO: srch_time_switch_tree.c(242): Lextrees(1), 1 nodes(filler)
INFO: srch_time_switch_tree.c(242): Lextrees(2), 1 nodes(filler)
INFO: Initialization of srch_t, report:
INFO: Operation Mode = 4, Operation Name = OP_TST_DECODE
INFO:
INFO: live_decode_API.c(233): Partial hypothesis WILL be dumped
INFO: live_decode_API.c(241): Input data will NOT be byte swapped
press ENTER to start recording
press ENTER to finish recording
INFO: cmn_prior.c(74): mean= 12.00, mean= 0.0
.
..
.
.
.
..
.
.
.
..
.
.
..
.
.
.
..
.
.
.
Backtrace( 109 312Z104037)
FV: 109 312Z104037> WORD SFrm EFrm AScr(UnNorm) LMScore AScr+LScr AScale
fv: 109 312Z104037> <sil> 0 241 7880374 -74100 7806274 8158270
FV:</sil> 109 312Z104037> TOTAL 7880374 -74100
FWDVIT: (* 109 312Z104037)
FWDXCT: * 109 312Z104037 S 8189562 T -277618 A -277896 L 278 0 7880374 -8101
<sil> 242 0 -13059 242 </sil>
INFO: stat.c(146): 242 frm; 48 cdsen/fr, 60 cisen/fr, 384 cdgau/fr, 480
cigau/fr, Sen 0.01, CPU 0.01 Clk ; 45 hmm/fr, 1 wd/fr, Search: 0.00 CPU 0.00
Clk ( 109 312Z104037)
INFO: fast_algo_struct.c(371): HMMHist( 109 312Z104037): 242(100)
INFO: lm.c(590): 0 tg(), 0 tgcache, 0 bo; 0 fills, 0 in mem (0.0%)
INFO: lm.c(593): 2 bg(), 2 bo; 1 fills, 35 in mem (30.7%)
Hypothesis:
Here is my telugu.cfg as referenced on the command line:
-mdef /usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
-mean /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means
-var /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances
-mixw /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights
-tmat /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices
-dict /usr/local/sphinx3/telugu/etc/telugu.dic
-fdict /usr/local/sphinx3/telugu/etc/telugu.filler
-lm /usr/local/sphinx3/telugu/etc/5587.lm.DMP
-hyp /usr/local/sphinx3/telugu/result123.txt
It would be a great help if someone could help me resolve this error.
Thanks in advance.
If you are asking about this:
The error messages are quite clear: words like "UH" are missing from the
dictionary. The presence of these phone-like words in the language model means
you used the wrong file for language model training. You should have uploaded a
sample prompts (sentences) file to the lmtool server; it looks like you
uploaded the dictionary instead.
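To catch this mismatch before decoding, the LM vocabulary can be cross-checked against the pronunciation dictionary. Below is a minimal sketch, assuming a text ARPA-format LM and a Sphinx `.dic` file with `WORD PH1 PH2 ...` lines; the file names and sample contents are illustrative, not the exact files from this thread.

```python
# Sketch: list words in the \1-grams: section of an ARPA LM that have
# no entry in a Sphinx dictionary -- exactly the words wid.c complains
# about and then drops at decode time.

def lm_unigrams(arpa_lines):
    """Collect the word field of each \\1-grams: entry."""
    words, in_ug = [], False
    for line in arpa_lines:
        line = line.strip()
        if line == "\\1-grams:":
            in_ug = True
            continue
        if in_ug:
            if line.startswith("\\"):      # next section, e.g. \2-grams:
                break
            parts = line.split()
            if len(parts) >= 2:            # "logprob word [backoff]"
                words.append(parts[1])
    return words

def dict_words(dic_lines):
    """First whitespace-separated token of each dictionary line."""
    return {line.split()[0] for line in dic_lines if line.strip()}

# Tiny illustrative inputs (hypothetical, not the thread's real files):
lm = ["\\1-grams:", "-1.0 <s>", "-1.2 HELLO", "-1.3 UH", "\\2-grams:"]
dic = ["HELLO HH AH L OW", "<s> SIL"]

missing = [w for w in lm_unigrams(lm) if w not in dict_words(dic)]
print(missing)   # -> ['UH']: a phone symbol leaked into the LM
```

If this prints phone symbols such as `UH` or `AA`, the corpus fed to lmtool was almost certainly a dictionary or phone list rather than sentences.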
Thank you very much for your reply. I was able to resolve that problem, but now
when I start and finish recording nothing is recognized; the hypothesis is
empty, as shown below:
sphinx3_livedecode telugu.cfg
INFO: cmd_ln.c(399): Parsing command line:
\
-mdef /usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef \
-mean /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means \
-var /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances \
-mixw /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights \
-tmat /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices \
-dict /usr/local/sphinx3/telugu/etc/telugu.dic \
-fdict /usr/local/sphinx3/telugu/etc/telugu.filler \
-lm /usr/local/sphinx3/telugu/etc/1890.lm.DMP \
-hyp /usr/local/sphinx3/telugu/result123.txt
Current configuration:
-agc none none
-alpha 0.97 9.700000e-01
-backtrace 1 1
-beam 1.0e-55 1.000000e-55
-bestpath 0 0
-bestpathlw
-bestscoredir
-bestsenscrdir
-bghist 0 0
-blocksize 200000 200000
-bptbldir
-bptblsize 32768 32768
-bt_wsil 1 1
-cb2mllr .1cls. .1cls.
-ci_pbeam 1e-80 1.000000e-80
-cmn current current
-composite 1 1
-cond_ds 0 0
-ctl
-ctlcount 1000000000 1000000000
-ctloffset 0 0
-ctl_lm
-ctl_mllr
-dagfudge 2 2
-dict /usr/local/sphinx3/telugu/etc/telugu.dic
-dist_ds 0 0
-dither no no
-doublebw 0 0
-ds 1 1
-epl 3 3
-fbtype mel_scale mel_scale
-fdict /usr/local/sphinx3/telugu/etc/telugu.filler
-feat 1s_c_d_dd 1s_c_d_dd
-fillpen
-fillprob 0.1 1.000000e-01
-frate 100 100
-fsg
-fsgusealtpron 1 1
-fsgusefiller 1 1
-gs
-gs4gs 1 1
-hmm
-hmmdump 0 0
-hmmdumpef 200000000 200000000
-hmmdumpsf 200000000 200000000
-hmmhistbinsize 5000 5000
-hyp /usr/local/sphinx3/telugu/result123.txt
-hypseg
-hypsegfmt 0 0
-hypsegscore_unscale 1 1
-inlatdir
-inlatwin 50 50
-input_endian little little
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latcompress 1 1
-latext lat.gz lat.gz
-lextreedump 0 0
-lm /usr/local/sphinx3/telugu/etc/1890.lm.DMP
-lmctlfn
-lmdumpdir
-lminmemory 0 0
-lmname
-lmrescore 1 1
-log3table 1 1
-logbase 1.0003 1.000300e+00
-lowerf 133.33334 1.333333e+02
-lts_mismatch 0 0
-lw 9.5 9.500000e+00
-machine_endian little little
-maxcdsenpf 100000 100000
-maxcepvecs 256 256
-maxedge 2000000 2000000
-maxhistpf 100 100
-maxhmmpf 20000 20000
-maxhyplen 1000 1000
-maxlmop 100000000 100000000
-maxlpf 40000 40000
-maxwpf 20 20
-mdef /usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
-mean /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means
-min_endfr 3 3
-mixw /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights
-mixwfloor 0.0000001 1.000000e-07
-mllr
-multiplex_multi 1 1
-multiplex_single 1 1
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-Nlextree 3 3
-Nstalextree 25 25
-op_mode 4 4
-outlatdir
-outlatfmt 0 0
-outlatoldfmt 1 1
-pbeam 1.0e-50 1.000000e-50
-pheurtype 0 0
-phonepen 1.0 1.000000e+00
-phypdump 1 1
-pl_beam 1.0e-80 1.000000e-80
-pl_window 1 1
-ptranskip 0 0
-rawext .raw .raw
-samprate 16000.0 1.600000e+04
-seed -1 -1
-senmgau .cont. .cont.
-silprob 0.1 1.000000e-01
-subvq
-subvqbeam 3.0e-3 3.000000e-03
-svq4svq 0 0
-tighten_factor 0.5 5.000000e-01
-tmat /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-tracewhmm
-treeugprob 1 1
-upperf 6855.4976 6.855498e+03
-uw 0.7 7.000000e-01
-var /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances
-varfloor 0.0001 1.000000e-04
-varnorm no no
-vqeval 3 3
-warp_params 1.0 1.0
-warp_type inverse_linear inverse_linear
-wbeam 1.0e-35 1.000000e-35
-wend_beam 1.0e-80 1.000000e-80
-wip 0.7 7.000000e-01
-wlen 0.025625 2.562500e-02
-worddumpef 200000000 200000000
-worddumpsf 200000000 200000000
INFO: kbcore.c(451): Begin Initialization of Core Models:
INFO: logs3.c(149): Initializing logbase: 1.000300e+00 (add table: 1)
INFO: Initialization of the log add table
INFO: Log-Add table size = 29350
INFO:
INFO: feat.c(665): Initializing feature stream to type: '1s_c_d_dd',
CMN='current', VARNORM='no', AGC='none'
INFO: kbcore.c(479): .cont.
INFO: Initialization of feat_t, report:
INFO: Feature type = 1s_c_d_dd
INFO: Cepstral size = 13
INFO: Cepstral size Used = 13
INFO: Number of stream = 1
INFO: Vector size of stream: 39
INFO: Whether CMN is used = 1
INFO: Whether AGC is used = 0
INFO: Whether variance is normalized = 0
INFO:
INFO: Reading HMM in Sphinx 3 Model format
INFO: Model Definition File:
/usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
INFO: Mean File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means
INFO: Variance File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances
INFO: Mixture Weight File:
/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights
INFO: Transition Matrices File: /usr/local/sphinx3/telugu/model_parameters/tel
ugu.cd_cont_1000/transition_matrices
INFO: mdef.c(637): Reading model definition:
/usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef
INFO: Initialization of mdef_t, report:
INFO: 20 CI-phone, 265 CD-phone, 3 emitstate/phone, 60 CI-sen, 252 Sen, 91
Sen-Seq
INFO:
INFO: kbcore.c(324): Using optimized GMM computation for Continuous HMM, -topn
will be ignored
INFO: cont_mgau.c(157): Reading mixture gaussian file
'/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means'
INFO: cont_mgau.c(323): 252 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(157): Reading mixture gaussian file
'/usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances'
INFO: cont_mgau.c(323): 252 mixture Gaussians, 8 components, 1 streams, veclen
39
INFO: cont_mgau.c(409): Reading mixture weights file '/usr/local/sphinx3/telug
u/model_parameters/telugu.cd_cont_1000/mixture_weights'
INFO: cont_mgau.c(539): Read 252 x 8 mixture weights
INFO: cont_mgau.c(565): Removing uninitialized Gaussian densities
INFO: cont_mgau.c(633): Applying variance floor
INFO: cont_mgau.c(651): 0 variance values floored
INFO: cont_mgau.c(698): Precomputing Mahalanobis distance invariants
INFO: tmat.c(162): Reading HMM transition probability matrices: /usr/local/sph
inx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices
INFO: Initialization of tmat_t, report:
INFO: Read 20 transition matrices of size 3x4
INFO:
INFO: dict.c(430): Reading main dictionary:
/usr/local/sphinx3/telugu/etc/telugu.dic
INFO: dict.c(433): 35 words read
INFO: dict.c(438): Reading filler dictionary:
/usr/local/sphinx3/telugu/etc/telugu.filler
INFO: dict.c(441): 3 words read
INFO: Initialization of dict_t, report:
INFO: No of word: 38
INFO:
INFO: lm.c(393): LM read('/usr/local/sphinx3/telugu/etc/1890.lm.DMP', lw=
9.50, wip= 0.70, uw= 0.70)
INFO: lm.c(394): Reading LM file /usr/local/sphinx3/telugu/etc/1890.lm.DMP (LM
name "default")
INFO: lm_3g_dmp.c(359): 38 ug
INFO: lm_3g_dmp.c(403): 72 bigrams
INFO: lm_3g_dmp.c(410): 36 trigrams
INFO: lm_3g_dmp.c(435): 3 bigram prob entries
INFO: lm_3g_dmp.c(453): 3 trigram bowt entries
INFO: lm_3g_dmp.c(469): 2 trigram prob entries
INFO: lm_3g_dmp.c(484): 1 trigram segtable entries (512 segsize)
INFO: lm_3g_dmp.c(518): 38 word strings
INFO: Initialization of fillpen_t, report:
INFO: Language weight =9.500000
INFO: Word Insertion Penalty =0.700000
INFO: Silence probability =0.100000
INFO: Filler probability =0.100000
INFO:
INFO: dict2pid.c(525): Building PID tables for dictionary
INFO: Initialization of dict2pid_t, report:
INFO: Dict2pid is in composite triphone mode
INFO: 69 composite states; 23 composite sseq
INFO:
INFO: kbcore.c(626): Inside kbcore: Verifying models consistency ......
INFO: kbcore.c(646): End of Initialization of Core Models:
INFO: Initialization of beam_t, report:
INFO: Parameters used in Beam Pruning of Viterbi Search:
INFO: Beam=-422133
INFO: PBeam=-383758
INFO: WBeam=-268630 (Skip=0)
INFO: WEndBeam=-614012
INFO: No of CI Phone assumed=20
INFO:
INFO: Initialization of fast_gmm_t, report:
INFO: Parameters used in Fast GMM computation:
INFO: Frame-level: Down Sampling Ratio 1, Conditional Down Sampling? 0,
Distance-based Down Sampling? 0
INFO: GMM-level: CI phone beam -614012. MAX CD 100000
INFO: Gaussian-level: GS map would be used for Gaussian Selection? =1, SVQ
would be used as Gaussian Score? =0 SubVQ Beam -19363
INFO:
INFO: Initialization of pl_t, report:
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme look-ahead type = 0
INFO: Phoneme look-ahead beam size = 65945
INFO: No of CI Phones assumed=20
INFO:
INFO: Initialization of ascr_t, report:
INFO: No. of CI senone =60
INFO: No. of senone = 252
INFO: No. of composite senone = 69
INFO: No. of senone sequence = 91
INFO: No. of composite senone sequence=23
INFO: Parameters used in phoneme lookahead:
INFO: Phoneme lookahead window = 1
INFO:
INFO: vithist.c(163): Initializing Viterbi-history module
INFO: Initialization of vithist_t, report:
INFO: Word beam = -268630
INFO: Bigram Mode =0
INFO: Rescore Mode =1
INFO: Trace sil Mode =1
INFO:
INFO: srch.c(410): Search Initialization.
WARNING: "srch_time_switch_tree.c", line 165: -Nstalextree is omitted in TST
search.
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 36.
INFO: lextree.c(239): Size of word table after adding alternative prons: 36.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 118
INFO: Number of links in the tree 128
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (0) for lm 0, its name is
default, it has 118 nodes(ug)
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 36.
INFO: lextree.c(239): Size of word table after adding alternative prons: 36.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 118
INFO: Number of links in the tree 128
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (1) for lm 0, its name is
default, it has 118 nodes(ug)
INFO: lextree.c(218): Creating Unigram Table for lm (name: default)
INFO: lextree.c(231): Size of word table after unigram + words in class: 36.
INFO: lextree.c(239): Size of word table after adding alternative prons: 36.
INFO: lextree_t, report:
INFO: Parameters of the lexical tree.
INFO: Type of the tree 0 (0:unigram, 1: 2g, 2: 3g etc.)
INFO: Number of left contexts 6
INFO: Number of node 118
INFO: Number of links in the tree 128
INFO: The previous word for this tree
INFO: The size of a node of the lexical tree 52
INFO: The size of a gnode_t 12
INFO: srch_time_switch_tree.c(216): Lextrees (2) for lm 0, its name is
default, it has 118 nodes(ug)
INFO: srch_time_switch_tree.c(221): Time for building trees, 0.0000 CPU 0.0002
Clk
INFO: srch_time_switch_tree.c(242): Lextrees(0), 1 nodes(filler)
INFO: srch_time_switch_tree.c(242): Lextrees(1), 1 nodes(filler)
INFO: srch_time_switch_tree.c(242): Lextrees(2), 1 nodes(filler)
INFO: Initialization of srch_t, report:
INFO: Operation Mode = 4, Operation Name = OP_TST_DECODE
INFO:
INFO: live_decode_API.c(233): Partial hypothesis WILL be dumped
INFO: live_decode_API.c(241): Input data will NOT be byte swapped
press ENTER to start recording
press ENTER to finish recording
INFO: cmn_prior.c(74): mean= 12.00, mean= 0.0
.
..
.
.
.
..
.
.
.
..
.
Backtrace( 109 312Z163227)
FV: 109 312Z163227> WORD SFrm EFrm AScr(UnNorm) LMScore AScr+LScr AScale
fv: 109 312Z163227> <sil> 0 139 3638801 -74100 3564701 3957611
FV:</sil> 109 312Z163227> TOTAL 3638801 -74100
FWDVIT: (* 109 312Z163227)
FWDXCT: * 109 312Z163227 S 3996726 T -318532 A -318810 L 278 0 3638801 -8101
<sil> 140 0 -9117 140 </sil>
INFO: stat.c(146): 140 frm; 49 cdsen/fr, 60 cisen/fr, 395 cdgau/fr, 480
cigau/fr, Sen 0.01, CPU 0.02 Clk ; 50 hmm/fr, 1 wd/fr, Search: 0.00 CPU 0.00
Clk ( 109 312Z163227)
INFO: fast_algo_struct.c(371): HMMHist( 109 312Z163227): 140(100)
INFO: lm.c(590): 0 tg(), 0 tgcache, 0 bo; 0 fills, 0 in mem (0.0%)
INFO: lm.c(593): 2 bg(), 2 bo; 1 fills, 36 in mem (49.3%)
Hypothesis:
The result (hypothesis) stored at the destination given in the config file
(result123.txt) is just a line like (* 109 312Z162216), which varies for each
utterance.
Could someone help me resolve this issue? What does the content of
result123.txt mean, and what must be done so that the uttered word is
recognized and written to result123.txt?
I'm using a 100-speaker model with 35 utterances at word level.
Please help me out.
Thanks in advance,
divya
It looks like you trained your model incorrectly; most probably the feature
extraction step is wrong. You also skipped a critical part of model training:
testing with sphinx3_decode as described in the tutorial. To get this working,
you either need to double-check everything yourself or share the data you are
training with.
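For reference, a batch accuracy test with sphinx3_decode might look roughly like the sketch below. This is an assumption-laden outline, not a verified command: the `-ctl`/`-cepdir` file layout and all paths are hypothetical, and the model/dictionary flags are simply reused from the telugu.cfg shown earlier in this thread.

```shell
# Hypothetical batch decode over pre-extracted test cepstra.
# test.ctl lists one utterance id per line; cepstra live in cepdir/<id>.mfc.
sphinx3_decode \
    -mdef /usr/local/sphinx3/telugu/model_architecture/telugu.1000.mdef \
    -mean /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/means \
    -var  /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/variances \
    -mixw /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/mixture_weights \
    -tmat /usr/local/sphinx3/telugu/model_parameters/telugu.cd_cont_1000/transition_matrices \
    -dict  /usr/local/sphinx3/telugu/etc/telugu.dic \
    -fdict /usr/local/sphinx3/telugu/etc/telugu.filler \
    -lm    /usr/local/sphinx3/telugu/etc/1890.lm.DMP \
    -ctl    test.ctl \
    -cepdir cepdir \
    -hyp    batch_hyp.txt
```

Comparing batch_hyp.txt against the reference transcripts tells you whether the acoustic model itself works before you blame the live-decode audio path.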