I feel like a complete fool about the other questions I have asked, but I think this one is a little more reasonable:
I am trying to use SphinxTrain to create models for use with Sphinx4. I have been following the instructions in tinydoc.txt, and I used the suggestion from http://www.speech.cs.cmu.edu/sphinxman/scriptman1.html#02a for the small closed vocabulary. In tinydoc.txt, I get down to RunAll, and it runs most of the way through with a couple warnings but no errors. Then at module 50, it starts giving lots of errors. Here is the logfile 50.cd_hmm_tied/time.4.1.norm.log:
C:\SphinxTrain\jw\time\bin\norm.exe \
-accumdir C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1 \
-mixwfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights \
-tmatfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices \
-meanfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means \
-varfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances \
-fullvar no
[Switch] [Default] [Value]
-help no no
-example no no
-accumdir C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1
-oaccumdir
-tmatfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices
-mixwfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights
-meanfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means
-varfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances
-regmatfn
-dcountfn
-inmixwfn
-inmeanfn
-invarfn
-fullvar no no
-tiedvar no no
INFO: ........\src\programs\norm\main.c(228): Reading and accumulating counts from C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1
INFO: ........\src\libs\libio\s3mixw_io.c(116): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/mixw_counts [9x1x4 array]
INFO: ........\src\libs\libio\s3tmat_io.c(115): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/tmat_counts [2x3x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(379): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/gauden_counts with means with vars [9x1x4 vector arrays]
INFO: ........\src\programs\norm\main.c(450): Normalizing mean for n_mgau= 9, n_stream= 1, n_density= 4
INFO: ........\src\programs\norm\main.c(474): Normalizing var
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=13) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=15) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=17) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=19) < 0
...and so on. Then here's the final one and the stuff below it:
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=3, component=33) < 0
INFO: ........\src\libs\libio\s3mixw_io.c(232): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights [9x1x4 array]
INFO: ........\src\libs\libio\s3tmat_io.c(174): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices [2x3x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(226): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means [9x1x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(226): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances [9x1x4 array]
Mon Dec 10 09:12:08 2007
Current Overall Likelihood Per Frame = 20.974335154827
And 50.cd_hmm_tied/time.8.1.norm.log:
C:\SphinxTrain\jw\time\bin\norm.exe \
-accumdir C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1 \
-mixwfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights \
-tmatfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices \
-meanfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means \
-varfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances \
-fullvar no
[Switch] [Default] [Value]
-help no no
-example no no
-accumdir C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1
-oaccumdir
-tmatfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices
-mixwfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights
-meanfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means
-varfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances
-regmatfn
-dcountfn
-inmixwfn
-inmeanfn
-invarfn
-fullvar no no
-tiedvar no no
INFO: ........\src\programs\norm\main.c(228): Reading and accumulating counts from C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1
INFO: ........\src\libs\libio\s3mixw_io.c(116): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/mixw_counts [9x1x8 array]
INFO: ........\src\libs\libio\s3tmat_io.c(115): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/tmat_counts [2x3x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(379): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/gauden_counts with means with vars [9x1x8 vector arrays]
INFO: ........\src\programs\norm\main.c(450): Normalizing mean for n_mgau= 9, n_stream= 1, n_density= 8
INFO: ........\src\programs\norm\main.c(474): Normalizing var
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=0) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=1) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=3) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=5) < 0
...and so on. Then here's the final one and the stuff below it:
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=7, component=33) < 0
INFO: ........\src\libs\libio\s3mixw_io.c(232): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights [9x1x8 array]
INFO: ........\src\libs\libio\s3tmat_io.c(174): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices [2x3x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(226): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means [9x1x8 array]
INFO: ........\src\libs\libio\s3gau_io.c(226): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances [9x1x8 array]
Mon Dec 10 09:12:15 2007
Current Overall Likelihood Per Frame = 22.9961566484517
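In case it helps anyone reading these logs: a minimal Python sketch for tallying which Gaussians trigger the "var ... < 0" error, assuming the error lines have exactly the format quoted above. The function name and the regular expression are made up for illustration (they are not part of SphinxTrain), and the paths in the sample text are abbreviated.

```python
import re
from collections import Counter

# Pattern for the error lines quoted above. The spacing is assumed to match
# these particular logs and may differ in other SphinxTrain versions.
VAR_ERROR = re.compile(
    r"var \(mgau=\s*(\d+), feat=\s*(\d+), density=\s*(\d+), component=\s*(\d+)\) < 0"
)

def tally_negative_variances(log_text):
    """Count negative-variance errors per (mgau, density) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = VAR_ERROR.search(line)
        if match:
            mgau, _feat, density, _component = map(int, match.groups())
            counts[(mgau, density)] += 1
    return counts

# A few lines in the same shape as the log above (source paths shortened).
sample_log = """\
INFO: main.c(474): Normalizing var
ERROR: "gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=13) < 0
ERROR: "gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=15) < 0
ERROR: "gauden.c", line 1700: var (mgau= 5, feat= 0, density=3, component=33) < 0
"""

for (mgau, density), n in sorted(tally_negative_variances(sample_log).items()):
    print(f"mgau={mgau} density={density}: {n} negative components")
```

Run over the full log text instead of the sample, this would show whether the errors are confined to one mixture (mgau 5 in the excerpts above) or spread across all of them.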
Here are some other files that may be of interest:
etc\feat.params:
-alpha 0.97
-dither yes
-doublebw no
-nfilt 40
-ncep 13
-lowerf 133.33334
-upperf 6855.4976
-nfft 512
-wlen 0.0256
-transform legacy
-feat __CFG_FEATURE__
-agc __CFG_AGC__
-cmn __CFG_CMN__
-varnorm __CFG_VARNORM__
etc\sphinx_train.cfg:
# Configuration script for sphinx trainer -*-mode:Perl-*-
$CFG_VERBOSE = 1; # Determines how much goes to the screen.
# These are filled in at configuration time
$CFG_DB_NAME = "time";
$CFG_BASE_DIR = "C:/SphinxTrain/jw/time";
$CFG_SPHINXTRAIN_DIR = "C:\SphinxTrain";
# Directory containing SphinxTrain binaries
$CFG_BIN_DIR = "$CFG_BASE_DIR/bin";
$CFG_GIF_DIR = "$CFG_BASE_DIR/gifs";
$CFG_SCRIPT_DIR = "$CFG_BASE_DIR/scripts_pl";
# Experiment name, will be used to name model files and log files
$CFG_EXPTNAME = "$CFG_DB_NAME";
# Audio waveform and feature file information
$CFG_WAVFILES_DIR = "$CFG_BASE_DIR/wav";
$CFG_WAVFILE_EXTENSION = 'sph';
$CFG_WAVFILE_TYPE = 'nist'; # one of nist, mswav, raw
$CFG_FEATFILES_DIR = "$CFG_BASE_DIR/feat";
$CFG_FEATFILE_EXTENSION = 'mfc';
$CFG_VECTOR_LENGTH = 13;
$CFG_MIN_ITERATIONS = 1; # BW Iterate at least this many times
$CFG_MAX_ITERATIONS = 30; # BW Don't iterate more than this, somethings likely wrong.
# (none/max) Type of AGC to apply to input files
$CFG_AGC = 'none';
# (current/none) Type of cepstral mean subtraction/normalization
# to apply to input files
$CFG_CMN = 'current';
# (yes/no) Normalize variance of input files to 1.0
$CFG_VARNORM = 'no';
# (yes/no) Use letter-to-sound rules to guess pronunciations of
# unknown words (English, 40-phone specific)
$CFG_LTSOOV = 'no';
# (yes/no) Train full covariance matrices
$CFG_FULLVAR = 'no';
# (yes/no) Use diagonals only of full covariance matrices for
# Forward-Backward evaluation (recommended if CFG_FULLVAR is yes)
$CFG_DIAGFULL = 'no';
# Directory to write queue manager logs to
$CFG_QMGR_DIR = "$CFG_BASE_DIR/qmanager";
# Directory to write training logs to
$CFG_LOG_DIR = "$CFG_BASE_DIR/logdir";
# Directory for re-estimation counts
$CFG_BWACCUM_DIR = "$CFG_BASE_DIR/bwaccumdir";
# Directory for state segmentations (output of force alignment, input
# to LDA training)
$CFG_STSEG_DIR = "$CFG_BASE_DIR/stseg";
# Directory to write model parameter files to
$CFG_MODEL_DIR = "$CFG_BASE_DIR/model_parameters";
# Directory containing transcripts and control files for
# speaker-adaptive training
$CFG_LIST_DIR = "$CFG_BASE_DIR/etc";
# *variables used in main training of models*
$CFG_DICTIONARY = "$CFG_LIST_DIR/$CFG_DB_NAME.dic";
$CFG_RAWPHONEFILE = "$CFG_LIST_DIR/$CFG_DB_NAME.phone";
$CFG_FILLERDICT = "$CFG_LIST_DIR/$CFG_DB_NAME.filler";
$CFG_LISTOFFILES = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.fileids";
$CFG_TRANSCRIPTFILE = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.transcription";
$CFG_FEATPARAMS = "$CFG_LIST_DIR/feat.params";
# *variables used in characterizing models*
$CFG_HMM_TYPE = '.cont.'; # Sphinx III
#$CFG_HMM_TYPE = '.semi.'; # Sphinx II
if (($CFG_HMM_TYPE ne ".semi.") and ($CFG_HMM_TYPE ne ".cont.")) {
    die "Please choose one CFG_HMM_TYPE out of '.cont.' or '.semi.', " .
        "currently $CFG_HMM_TYPE\n";
}
if ($CFG_HMM_TYPE eq '.semi.') {
    $CFG_DIRLABEL = 'semi';
    $CFG_STATESPERHMM = 5;
    $CFG_SKIPSTATE = 'yes';
    # Four (4) stream features for Sphinx II
    $CFG_FEATURE = "s2_4x";
    $CFG_NUM_STREAMS = 4;
    $CFG_INITIAL_NUM_DENSITIES = 256;
    $CFG_FINAL_NUM_DENSITIES = 256;
    die "For semi continuous models, the initial and final models have the same density"
        if ($CFG_INITIAL_NUM_DENSITIES != $CFG_FINAL_NUM_DENSITIES);
} elsif ($CFG_HMM_TYPE eq '.cont.') {
    $CFG_DIRLABEL = 'cont';
    $CFG_STATESPERHMM = 3;
    $CFG_SKIPSTATE = 'no';
    # Single stream features - Sphinx 3
    $CFG_FEATURE = "1s_c_d_dd";
    $CFG_NUM_STREAMS = 1;
    $CFG_INITIAL_NUM_DENSITIES = 1;
    $CFG_FINAL_NUM_DENSITIES = 8;
    die "The initial has to be less than the final number of densities"
        if ($CFG_INITIAL_NUM_DENSITIES > $CFG_FINAL_NUM_DENSITIES);
}
# (yes/no) Train multiple-gaussian context-independent models (useful
# for alignment, use 'no' otherwise) in the models created
# specifically for forced alignment
$CFG_FALIGN_CI_MGAU = 'no';
# (yes/no) Train multiple-gaussian context-independent models (useful
# for alignment, use 'no' otherwise)
$CFG_CI_MGAU = 'no';
# Number of tied states (senones) to create in decision-tree clustering
$CFG_N_TIED_STATES = 1000;
# How many parts to run Forward-Backward estimation in
$CFG_NPART = 1;
# (yes/no) Train a single decision tree for all phones (actually one
# per state) (useful for grapheme-based models, use 'no' otherwise)
$CFG_CROSS_PHONE_TREES = 'no';
# Use force-aligned transcripts (if available) as input to training
$CFG_FORCEDALIGN = 'no';
# Use a specific set of models for force alignment. If not defined,
# context-independent models for the current experiment will be used.
$CFG_FORCE_ALIGN_MDEF = "$CFG_BASE_DIR/model_architecture/$CFG_EXPTNAME.falign_ci.mdef";
if ($CFG_FALIGN_CI_MGAU eq 'yes') {
    $CFG_FORCE_ALIGN_MODELDIR = "$CFG_MODEL_DIR/$CFG_EXPTNAME.falign_ci_${CFG_DIRLABEL}$CFG_FINAL_NUM_DENSITIES";
}
else {
    $CFG_FORCE_ALIGN_MODELDIR = "$CFG_MODEL_DIR/$CFG_EXPTNAME.falign_ci$CFG_DIRLABEL";
}
# Use a specific dictionary and filler dictionary for force alignment.
# If these are not defined, a dictionary and filler dictionary will be
# created from $CFG_DICTIONARY and $CFG_FILLERDICT, with noise words
# removed from the filler dictionary and added to the dictionary (this
# is because the force alignment is not very good at inserting them)
$CFG_FORCE_ALIGN_DICTIONARY = "$ST::CFG_BASE_DIR/falignout$ST::CFG_EXPTNAME.falign.dict";;
$CFG_FORCE_ALIGN_FILLERDICT = "$ST::CFG_BASE_DIR/falignout/$ST::CFG_EXPTNAME.falign.fdict";;
# Use a particular beam width for force alignment. The wider
# (i.e. smaller numerically) the beam, the fewer sentences will be
# rejected for bad alignment.
$CFG_FORCE_ALIGN_BEAM = 1e-60;
# Transformation file to use for LDA. LDA will be done if this is
# defined and the file exists. 03.lda_train will generate this for
# you (but you must run force alignment first)
$CFG_LDA_TRANSFORM = "${CFG_MODEL_DIR}/${CFG_EXPTNAME}.lda";
# Dimensionality of LDA output
$CFG_LDA_DIMENSION = 29;
# set convergence_ratio = 0.004
$CFG_CONVERGENCE_RATIO = 0.04;
# Queue::POSIX for multiple CPUs on a local machine
# Queue::PBS to use a PBS/TORQUE queue
$CFG_QUEUE_TYPE = "Queue";
# Name of queue to use for PBS/TORQUE
$CFG_QUEUE_NAME = "workq";
# (yes/no) Build questions for decision tree clustering automatically
$CFG_MAKE_QUESTS = "yes";
# If CFG_MAKE_QUESTS is yes, questions are written to this file.
# If CFG_MAKE_QUESTS is no, questions are read from this file.
$CFG_QUESTION_SET = "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.tree_questions";
$CFG_QUESTION_SET = "${CFG_BASE_DIR}/linguistic_questions";
$CFG_CP_OPERATION = "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.cpmeanvar";
# This variable has to be defined, otherwise utils.pl will not load.
$CFG_DONE = 1;
return 1;
etc\time.dic:
COMPUTER COMPUTER
etc\time.filler:
<s> SIL
</s> SIL
etc\time.phone:
COMPUTER
SIL
etc\time_train.fileids:
computer
etc\time_train.transcription:
<s> COMPUTER </s> (computer)
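Since the etc\ files are the suspect here, a rough Python sketch of a cross-check between them may be useful. It assumes one utterance per line, a trailing "(utterance-id)" tag on every transcription line, and file ids without directory parts; the function name is hypothetical and this is not a SphinxTrain tool.

```python
def check_training_files(dic, filler, phones, fileids, transcription):
    """Return a list of inconsistencies between the etc/ file contents."""
    problems = []
    phone_set = set(phones.split())

    # Word -> pronunciation, merged from the dictionary and filler dictionary.
    lexicon = {}
    for line in (dic + "\n" + filler).splitlines():
        fields = line.split()
        if fields:
            lexicon[fields[0]] = fields[1:]

    # Every phone used in a pronunciation should appear in the phone list.
    for word, pronunciation in lexicon.items():
        for phone in pronunciation:
            if phone not in phone_set:
                problems.append(f"{word}: phone {phone} is not in the phone list")

    # Each transcription line should end in "(utterance-id)" and use known words.
    utt_ids = fileids.split()
    lines = [line for line in transcription.splitlines() if line.strip()]
    if len(utt_ids) != len(lines):
        problems.append(f"{len(utt_ids)} file ids but {len(lines)} transcription lines")
    for utt_id, line in zip(utt_ids, lines):
        *words, tag = line.split()
        if tag != f"({utt_id})":
            problems.append(f"{utt_id}: transcription is tagged {tag}")
        for word in words:
            if word not in lexicon:
                problems.append(f"{utt_id}: word {word} is not in either dictionary")
    return problems


# Contents of the files as posted above.
problems = check_training_files(
    dic="COMPUTER COMPUTER",
    filler="<s> SIL\n</s> SIL",
    phones="COMPUTER\nSIL",
    fileids="computer",
    transcription="<s> COMPUTER </s> (computer)",
)
print(problems or "no inconsistencies found")
```

A clean result only means the files agree with each other under these assumptions. It says nothing about whether one recording is enough data to train on.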
Sorry if that looks intimidating. Most of it can probably be ignored. I just didn't know what would be useful. It's probably something wrong with my etc\ files. Does anyone know what the problem is or anything that I can try? Thanks.
Hi John!
Those errors come from having too few samples to train the model (the Gaussians in particular, I think). You'll get the same if you try to test with the an4 database.
Don't worry about those errors. Unless the module fails (you can check that in the <task>.html file), everything is OK.
Regards!
Coriscow
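To put rough numbers on the "not enough samples" explanation, here is a small Python sketch that counts the values the norm step is trying to estimate. n_mgau = 9 comes from the logs above. The 39-dimensional feature vector is an assumption (13 cepstra plus deltas and delta-deltas for 1s_c_d_dd), and the function name is made up for illustration.

```python
def gaussian_parameter_count(n_mgau, n_density, n_dim):
    """Means, diagonal variances and mixture weights for a set of mixtures."""
    means = n_mgau * n_density * n_dim
    variances = means  # diagonal covariance: one variance per mean component
    mixture_weights = n_mgau * n_density
    return means + variances + mixture_weights

N_MGAU = 9   # "n_mgau= 9" in the norm logs
N_DIM = 39   # assumption: 13 static + 13 delta + 13 delta-delta coefficients

# Densities go from 1 up to 8 in this setup (CFG_INITIAL/FINAL_NUM_DENSITIES).
for n_density in (1, 4, 8):
    total = gaussian_parameter_count(N_MGAU, n_density, N_DIM)
    print(f"{n_density} densities per mixture: {total} parameters")
```

With 8 densities that is several thousand values to estimate from a single recording of one word. The recording length isn't given, but at a 10 ms frame shift (also an assumption) a short word would yield only on the order of a hundred frames, so some Gaussians end up with almost no data behind them.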