Hello,
I built a 1-Gaussian-per-state model.
During training I did not get any error from the Trainer, but the accuracy of the model is very low, so I tried to increase the number of Gaussians per state.
I used the following commands:
1) inc_comp
/root/Sphinx/SphinxTrain/bin.i686-pc-linux-gnu/inc_comp -ninc 1 \
    -dcountfn model/cd_tied/hmm/mix_weight \
    -inmixwfn model/cd_tied/hmm/mix_weight \
    -outmixwfn model/cd_tied/hmm/mix_weight_Gau1 \
    -inmeanfn model/cd_tied/hmm/mean \
    -outmeanfn model/cd_tied/hmm/mean_Gau1 \
    -invarfn model/cd_tied/hmm/var \
    -outvarfn model/cd_tied/hmm/var_Gau1 \
    -ceplen 13 -feat 1s_c_d_dd
2) Then I tried to use the init_mixw command:
/root/Sphinx/SphinxTrain/bin.i686-pc-linux-gnu/init_mixw \
    -src_moddeffn /root/Sphinx/Hindi/Gau2/model/ci/model_architecture/model_def \
    -src_ts2cbfn .cont. \
    -src_mixwfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mix_weight_Gau1 \
    -src_meanfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mean_Gau1 \
    -src_varfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/var_Gau1 \
    -src_tmatfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/trans_mat_Gau1 \
    -dest_moddeffn /root/Sphinx/Hindi/Gau2/model/cd_tied/flatInit/initialized_model_def \
    -dest_ts2cbfn .cont. \
    -dest_mixwfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mix_weight_Gau \
    -dest_meanfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mean_Gau \
    -dest_varfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/var_Gau \
    -dest_tmatfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/trans_mat_Gau \
    -feat 1s_c_d_dd \
    -ceplen 13
[Switch] [Default] [Value]
-help no no
-example no no
-src_moddeffn /root/Sphinx/Hindi/Gau2/model/ci/model_architecture/model_def
-src_ts2cbfn .cont.
-src_mixwfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mix_weight_Gau1
-src_meanfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mean_Gau1
-src_varfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/var_Gau1
-src_tmatfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/trans_mat_Gau1
-dest_moddeffn /root/Sphinx/Hindi/Gau2/model/cd_tied/flatInit/initialized_model_def
-dest_ts2cbfn .cont.
-dest_mixwfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mix_weight_Gau
-dest_meanfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mean_Gau
-dest_varfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/var_Gau
-dest_tmatfn /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/trans_mat_Gau
-feat 1s_12c_12d_3p_12dd 1s_c_d_dd
-ceplen 13 13
INFO: main.c(268): Reading src /root/Sphinx/Hindi/Gau2/model/ci/model_architecture/model_def
INFO: model_def_io.c(587): Model definition info:
INFO: model_def_io.c(588): 46 total models defined (46 base, 0 tri)
INFO: model_def_io.c(589): 184 total states
INFO: model_def_io.c(590): 138 total tied states
INFO: model_def_io.c(591): 138 total tied CI states
INFO: model_def_io.c(592): 46 total tied transition matrices
INFO: model_def_io.c(593): 4 max state/model
INFO: model_def_io.c(594): 4 min state/model
INFO: main.c(285): Generating continous ts2cb mapping
INFO: main.c(300): Reading src /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mix_weight_Gau1
INFO: s3mixw_io.c(116): Read /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mix_weight_Gau1 [273x1x2 array]
INFO: main.c(309): Reading src /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/trans_mat_Gau1
INFO: s3tmat_io.c(115): Read /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/trans_mat_Gau1 [46x3x4 array]
INFO: main.c(319): Reading src /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mean_Gau1
INFO: s3gau_io.c(162): Read /root/Sphinx/Hindi/Gau2/model/cd_tied/hmm/mean_Gau1 [273x1x2 array]
FATAL_ERROR: "main.c", line 340: src mean n_cb (== 273) inconsistent w/ ts2cb mapping 138
I am unable to understand the last error.
Will you help me?
Thanks in advance.
Tushar P.
Let me do the usual check first.
Are you using too few utterances to train the model? Acoustic model training requires a large amount of training data; fewer than 1000 utterances are usually not enough. This is the most important question. If you have only a small amount of training data, the training will fail at some point or another, regardless of whether the procedure is correct.
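As a concrete version of that check, here is a minimal Python sketch for counting your training utterances. The transcription path is only a guess based on the phones/ci directory described later in this thread; adjust it to your own setup.

# Count non-empty lines in the training transcription; each line is one utterance.
# "phones/ci/train.transcription" is a hypothetical name -- substitute your own file.
transcript_path = "phones/ci/train.transcription"

with open(transcript_path) as f:
    n_utterances = sum(1 for line in f if line.strip())

print("%d utterances in %s" % (n_utterances, transcript_path))
if n_utterances < 1000:
    print("Fewer than 1000 utterances -- usually not enough to train an acoustic model.")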
Otherwise, I am willing to take a look at why the tool gave you that fatal error. Just send a tarred and gzipped copy of your setup directly to me at archan at cs dot cmu dot edu.
Arthur
Hello Arthchan,
Thank you for your reply.
Our training data consists of 4895 utterances (40 speakers).
Out of these, around 193 files were rejected by the Trainer, i.e. around 4%.
All of my setup is at the following link:
www.ncb.ernet.in/~tushar/Gau1.zip
Following is the specification of our corpus data:
/****************************/
/*    SphinxTrain Report    */
/*    Date : 12/07/2005     */
/****************************/
Number of speakers : 49
For training : data from 40 speakers.
For testing : data from 9 speakers, plus live data.
Text Corpus Specifications
DataSet :
CI phones : a aa ai an au ao b bh c ch d dh ee f
            g gh h i ii j jh k kh l m n n2 o
            oo p q r ri s sh shh t th u uu v x
            xh y
Total CI phones : 46
Filler phones : SIL
Total filler phones : 1
HMM states : 3
Transcription lines : 4895
Total corpus words : 61955
Unique words : 746
Wave File Specifications
1) 16 kHz sampling frequency.
2) 16 bits per sample.
3) Mono channel.
4) Encoding format: PCM (uncompressed).
5) Spoken data: 6.126 hours.
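A minimal Python sketch for checking that every recording actually matches this spec follows; the sound/wav location comes from the directory layout below, and the .wav naming plus the use of Python's standard wave module are assumptions, not part of the SphinxTrain setup.

import glob
import wave

# Walk sound/wav (location taken from the directory layout below) and flag any
# file that is not 16 kHz, 16-bit, mono, uncompressed PCM.
for path in sorted(glob.glob("sound/wav/**/*.wav", recursive=True)):
    with wave.open(path, "rb") as w:
        ok = (w.getframerate() == 16000       # 16 kHz sampling frequency
              and w.getsampwidth() == 2       # 2 bytes = 16 bits per sample
              and w.getnchannels() == 1       # mono
              and w.getcomptype() == "NONE")  # uncompressed PCM
    if not ok:
        print("Does not match the spec: %s" % path)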
Feature Built
Attempt 1)
System Configuration (the same machine was used for all attempts below) :
Processor : Celeron (Coppermine) 768.167 MHz
Cache Memory : 128 KB
Main Memory : 512 MB
OS : Linux version 2.6.5-1.358 (Red Hat Linux 3.3.3-7)
C Compiler : gcc version 3.3.3
Date : 13 July 2005
Time :
real 17m48.215s
user 10m31.664s
sys 0m39.108s
Effective Time Period : 17 minutes 48.215 seconds
Output Message :
Successfully built feature files.
Feature files stored at "/root/Sphinx/Hindi/sound/feat"
Status available at "/root/Sphinx/Hindi/logDir/wave2feat.status"

CI HMM Model
Attempt 1)
Date : 20 July 2005
Time :
real 315m34.202s
user 119m48.455s
sys 193m18.313s
Effective Time Period : 5 hours 15 minutes 34.202 seconds
Output Message :
Logged into the result/ciModel.result file.

CD Untied HMM Model

Building Decision Tree
Attempt 1)
Date : 20 July 2005
Time :
real 6m49.444s
user 6m19.686s
sys 0m25.598s
Effective Time Period : 6 minutes 49.444 seconds
Output Message :
Output stored at result/buildTree.result

Building CI Tied HMM Model
Attempt 1)
Date : 20 July 2005
Time :
real 541m42.754s
user 189m31.698s
sys 344m20.284s
Effective Time Period : 9 hours 1 minute 42.754 seconds
Output Message :
Output stored at result/ciTied.result
The training setup is organized like this:
Gau1
+---error To maintain the error logs
+---linguistic      To store the linguistic information
+---prunetrees
+---trees
+---logDir          To maintain the status information of each command
+---cd
+---cd_tied
+---ci
+---linguistic
+---model           Models are stored in this directory
+---cd For CD UnTied model
+---accumulate      To store intermediate data during iterations
+---acc0
+---acc1
+---acc10
+---acc2
+---acc3
+---acc4
+---acc5
+---acc6
+---acc7
+---acc8
+---acc9
+---flatInit        Flat initialization of the CD untied model
+---hmm             Final HMM model of the CD untied model
+---Text Text Version of HMM
+---model_architecture To keep the model_def
+---cd_tied For CD Tied model
+---accumulate      To store intermediate data during iterations
+---acc0
+---acc1
+---acc10
+---acc2
+---acc3
+---acc4
+---acc5
+---acc6
+---acc7
+---acc8
+---acc9
+---flatInit Flat initialization of CD Tied model
+---hmm Final HMM model of CD Tied model
+---Text Text Version of HMM
+---model_architecture To keep the model_def
+---temp
+---ci For CI model
+---accumulate      To store intermediate data during iterations
+---acc0
+---acc1
+---acc10
+---acc2
+---acc3
+---acc4
+---acc5
+---acc6
+---acc7
+---acc8
+---acc9
+---flatInit Flat initialization of CI model
+---hmm Final HMM model of CI model
+---Text Text Version of HMM
+---model_architecture To keep the model_def
+---phones
+---cd To keep cdTriphone list, dict, transcript and filler dict
+---ci To keep ciPhone list, dict, transcript and filler dict
+---result To store status of shell script log
+---script          Shell scripts written to build ciModel, cdUntiedModel, cdTiedModel, etc.
+---sound
+---feat To keep feature (MFC) files.
+---raw
+---wav To keep wav files.
Please help me. I have invested a lot of time in this project.
Your comments are most welcome; please share your thoughts.
Once again, thanks for your willingness to help me.
Tushar P.
India
Hi Tushar,
1. Don't worry. We'll be with you.
2. I cannot access Gau1.zip; could you put it at a path that works?
You can also send me mail directly at archan at cs dot cmu dot edu; that will also keep your setup private.
3. I read your whole message, and here is the thing: apparently, in attempt 1, your transcription might have been off by one. Have you fixed it? (See the sketch just after this list.)
4. I also found that
next_prime: failed to find next primt for 0
next_prime: failed to find next primt for 0
bw: baum_welch.c:166: baum_welch_update: Assertion `n_state > 0' failed.
That sounds like a problem with SphinxTrain's hash table handling.
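To make the off-by-one check in point 3 concrete, here is a minimal Python sketch that compares the utterance id at the end of each transcription line with the corresponding control-file entry. The file names phones/ci/train.ctl and phones/ci/train.transcription are guesses based on the directory layout above, and the parsing assumes the common SphinxTrain layout (one ctl entry per line, each transcription line ending in "(uttid)"); adjust both to your setup.

import re
import sys

# Hypothetical file names -- substitute the ones your scripts actually use.
ctl_path = "phones/ci/train.ctl"
transcript_path = "phones/ci/train.transcription"

with open(ctl_path) as f:
    # A ctl entry usually starts with a path (no extension); its basename
    # serves as the utterance id here.
    ctl_ids = [line.split()[0].split("/")[-1] for line in f if line.strip()]

with open(transcript_path) as f:
    trans_lines = [line.strip() for line in f if line.strip()]

if len(ctl_ids) != len(trans_lines):
    print("Line-count mismatch: %d ctl entries vs %d transcription lines"
          % (len(ctl_ids), len(trans_lines)))

for i, (utt_id, line) in enumerate(zip(ctl_ids, trans_lines), start=1):
    match = re.search(r"\(([^()]+)\)\s*$", line)   # trailing "(uttid)"
    if match is None or match.group(1) != utt_id:
        found = match.group(1) if match else "<no uttid>"
        print("First mismatch at line %d: ctl has '%s', transcription has '%s'"
              % (i, utt_id, found))
        sys.exit(1)

print("Control file and transcription appear to be aligned.")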
Again, it is a pity I could not get your setup this time; please kindly send me the link again. I will try my best to help you.
Arthur Chan
Maintainer
Hello Arthur.
I really appreciate the help.
The training setup is very large: 87 MB. So it is difficult to send by mail.
I will try to upload it to another site and send you the link.
Right now I am confused: is my training wrong? Please make a guess from the error.
Will I get accuracy of at least 90% for command and control, and at least 50% for dictation mode?
For the available model, i.e. the 1-Gaussian-per-state model, accuracy is very low, maybe around 10-15%.
Is it possible to get higher accuracy? What should be done to increase it?
I need your help and support.
Your help is a highly valuable contribution to this project.
I am trying to send my setup by uploading it to another site.
Thank you, Arthur.
Tushar P.
Hi Tushar,
First of all, training always goes wrong at some point, no matter how good the training tools are. Your confusion is common; I have been through this five times in the past. So don't worry.
As for what is wrong with the training, I don't know yet. My guess is that this is just some kind of off-by-one problem (for example, using the wrong transcription for the wrong wave file; that happens a lot).
You also have to understand that the training data you have is more suitable for a small task such as command and control. It might be because of that that you cannot go up to two mixtures; that happens. However, I don't think the code should break in this case, and I simply hate to see the code break, so my help here is volunteered.
Another suggestion: if you are getting only 10-15% accuracy, have you checked whether the front-end settings in your decoder are exactly the same as the ones you used in wave2feat? A small difference can cause huge mistakes. This is also the top killer of most speech recognition projects. (Maybe I should write an article called "10 good ways to mess up the recognizer".)
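As an illustration of that consistency check, here is a minimal Python sketch. The values (16 kHz audio, 13 cepstra, the 1s_c_d_dd feature type) come from this thread; the dictionary comparison itself is only a sketch, not a SphinxTrain or decoder interface.

# Compare the front-end parameters used for training (wave2feat) with the ones
# given to the decoder.  Fill both dictionaries in from your actual configs.
training_frontend = {"samprate": 16000, "ceplen": 13, "feat": "1s_c_d_dd"}
decoding_frontend = {"samprate": 16000, "ceplen": 13, "feat": "1s_c_d_dd"}

all_keys = sorted(set(training_frontend) | set(decoding_frontend))
mismatches = [k for k in all_keys
              if training_frontend.get(k) != decoding_frontend.get(k)]

for key in mismatches:
    print("Mismatch on '%s': training=%r, decoding=%r"
          % (key, training_frontend.get(key), decoding_frontend.get(key)))
if not mismatches:
    print("Training and decoding front-end settings agree on %d parameters." % len(all_keys))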
I am still waiting for your setup and will try to help you as soon as I get it. Until then, don't worry; keep your expectations at the right level and you will feel much better.
Arthur
Maintainer