I feel like a complete fool about the other questions I have asked, but I think this one is a little more reasonable:

I am trying to use SphinxTrain to create models for use with Sphinx4. I have been following the instructions in tinydoc.txt, and for the small closed vocabulary I used the suggestion from http://www.speech.cs.cmu.edu/sphinxman/scriptman1.html#02a. Following tinydoc.txt, I got as far as RunAll, which runs most of the way through with a couple of warnings but no errors. Then, at module 50, it starts producing lots of errors. Here is the log file 50.cd_hmm_tied/time.4.1.norm.log:

C:\SphinxTrain\jw\time\bin\norm.exe \
-accumdir C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1 \
-mixwfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights \
-tmatfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices \
-meanfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means \
-varfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances \
-fullvar no

[Switch] [Default] [Value]
-help no no
-example no no
-accumdir C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1
-oaccumdir
-tmatfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices
-mixwfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights
-meanfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means
-varfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances
-regmatfn
-dcountfn
-inmixwfn
-inmeanfn
-invarfn
-fullvar no no
-tiedvar no no
INFO: ........\src\programs\norm\main.c(228): Reading and accumulating counts from C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1
INFO: ........\src\libs\libio\s3mixw_io.c(116): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/mixw_counts [9x1x4 array]
INFO: ........\src\libs\libio\s3tmat_io.c(115): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/tmat_counts [2x3x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(379): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/gauden_counts with means with vars [9x1x4 vector arrays]
INFO: ........\src\programs\norm\main.c(450): Normalizing mean for n_mgau= 9, n_stream= 1, n_density= 4
INFO: ........\src\programs\norm\main.c(474): Normalizing var
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=13) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=15) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=17) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=19) < 0

...and so on. Then here's the final one and the stuff below it:

ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=3, component=33) < 0
INFO: ........\src\libs\libio\s3mixw_io.c(232): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights [9x1x4 array]
INFO: ........\src\libs\libio\s3tmat_io.c(174): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices [2x3x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(226): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means [9x1x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(226): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances [9x1x4 array]
Mon Dec 10 09:12:08 2007
Current Overall Likelihood Per Frame = 20.974335154827

And 50.cd_hmm_tied/time.8.1.norm.log:

[Switch] [Default] [Value]
-help no no
-example no no
-accumdir C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1
-oaccumdir
-tmatfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices
-mixwfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights
-meanfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means
-varfn C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances
-regmatfn
-dcountfn
-inmixwfn
-inmeanfn
-invarfn
-fullvar no no
-tiedvar no no
INFO: ........\src\programs\norm\main.c(228): Reading and accumulating counts from C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1
INFO: ........\src\libs\libio\s3mixw_io.c(116): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/mixw_counts [9x1x8 array]
INFO: ........\src\libs\libio\s3tmat_io.c(115): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/tmat_counts [2x3x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(379): Read C:\SphinxTrain\jw\time\bwaccumdir\time_buff_1/gauden_counts with means with vars [9x1x8 vector arrays]
INFO: ........\src\programs\norm\main.c(450): Normalizing mean for n_mgau= 9, n_stream= 1, n_density= 8
INFO: ........\src\programs\norm\main.c(474): Normalizing var
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=0) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=1) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=3) < 0
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=0, component=5) < 0

...and so on. Then here's the final one and the stuff below it:

ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 5, feat= 0, density=7, component=33) < 0
INFO: ........\src\libs\libio\s3mixw_io.c(232): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/mixture_weights [9x1x8 array]
INFO: ........\src\libs\libio\s3tmat_io.c(174): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/transition_matrices [2x3x4 array]
INFO: ........\src\libs\libio\s3gau_io.c(226): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/means [9x1x8 array]
INFO: ........\src\libs\libio\s3gau_io.c(226): Wrote C:/SphinxTrain/jw/time/model_parameters/time.cd_cont_1000/variances [9x1x8 array]
Mon Dec 10 09:12:15 2007
Current Overall Likelihood Per Frame = 22.9961566484517
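
Since the logs above are abbreviated, it may help to summarize which Gaussians the errors touch. The following is a minimal Python sketch, not part of SphinxTrain: the function name `tally_negative_variances` and the script name in the usage comment are made up for illustration, and the regular expression assumes the error lines look exactly like the ones quoted above.

```python
import re
import sys
from collections import Counter

# Matches the negative-variance lines quoted in the logs above, e.g.
#   var (mgau= 5, feat= 0, density=0, component=13) < 0
NEG_VAR = re.compile(
    r"var \(mgau=\s*(\d+), feat=\s*(\d+), density=\s*(\d+), component=\s*(\d+)\) < 0"
)


def tally_negative_variances(lines):
    """Count negative-variance errors per (mgau, density) pair."""
    counts = Counter()
    for line in lines:
        match = NEG_VAR.search(line)
        if match:
            mgau, _feat, density, _component = map(int, match.groups())
            counts[(mgau, density)] += 1
    return counts


# Usage (hypothetical script name): python tally_norm_errors.py time.4.1.norm.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        for (mgau, density), n in sorted(tally_negative_variances(log).items()):
            print(f"mgau={mgau} density={density}: {n} negative variance components")
```

Running it on each norm log prints one line per affected (mgau, density) pair, which shows at a glance whether the errors are confined to a single state or spread across the model.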

Here are some other files that may be of interest:

etc\feat.params:

-alpha 0.97
-dither yes
-doublebw no
-nfilt 40
-ncep 13
-lowerf 133.33334
-upperf 6855.4976
-nfft 512
-wlen 0.0256
-transform legacy
-feat CFG_FEATURE
-agc CFG_AGC
-cmn CFG_CMN
-varnorm CFG_VARNORM

etc\sphinx_train.cfg:

# Configuration script for sphinx trainer -*-mode:Perl-*-

$CFG_VERBOSE = 1; # Determines how much goes to the screen.

# These are filled in at configuration time
$CFG_DB_NAME = "time";
$CFG_BASE_DIR = "C:/SphinxTrain/jw/time";
$CFG_SPHINXTRAIN_DIR = "C:\SphinxTrain";

# Directory containing SphinxTrain binaries
$CFG_BIN_DIR = "$CFG_BASE_DIR/bin";
$CFG_GIF_DIR = "$CFG_BASE_DIR/gifs";
$CFG_SCRIPT_DIR = "$CFG_BASE_DIR/scripts_pl";

# Experiment name, will be used to name model files and log files
$CFG_EXPTNAME = "$CFG_DB_NAME";

# Audio waveform and feature file information
$CFG_WAVFILES_DIR = "$CFG_BASE_DIR/wav";
$CFG_WAVFILE_EXTENSION = 'sph';
$CFG_WAVFILE_TYPE = 'nist'; # one of nist, mswav, raw
$CFG_FEATFILES_DIR = "$CFG_BASE_DIR/feat";
$CFG_FEATFILE_EXTENSION = 'mfc';
$CFG_VECTOR_LENGTH = 13;

$CFG_MIN_ITERATIONS = 1; # BW Iterate at least this many times
$CFG_MAX_ITERATIONS = 30; # BW Don't iterate more than this, something's likely wrong.

# (none/max) Type of AGC to apply to input files
$CFG_AGC = 'none';
# (current/none) Type of cepstral mean subtraction/normalization
# to apply to input files
$CFG_CMN = 'current';
# (yes/no) Normalize variance of input files to 1.0
$CFG_VARNORM = 'no';
# (yes/no) Use letter-to-sound rules to guess pronunciations of
# unknown words (English, 40-phone specific)
$CFG_LTSOOV = 'no';
# (yes/no) Train full covariance matrices
$CFG_FULLVAR = 'no';
# (yes/no) Use diagonals only of full covariance matrices for
# Forward-Backward evaluation (recommended if CFG_FULLVAR is yes)
$CFG_DIAGFULL = 'no';

# Directory to write queue manager logs to
$CFG_QMGR_DIR = "$CFG_BASE_DIR/qmanager";
# Directory to write training logs to
$CFG_LOG_DIR = "$CFG_BASE_DIR/logdir";
# Directory for re-estimation counts
$CFG_BWACCUM_DIR = "$CFG_BASE_DIR/bwaccumdir";
# Directory for state segmentations (output of force alignment, input
# to LDA training)
$CFG_STSEG_DIR = "$CFG_BASE_DIR/stseg";
# Directory to write model parameter files to
$CFG_MODEL_DIR = "$CFG_BASE_DIR/model_parameters";

# Directory containing transcripts and control files for
# speaker-adaptive training
$CFG_LIST_DIR = "$CFG_BASE_DIR/etc";

# variables used in main training of models
$CFG_DICTIONARY = "$CFG_LIST_DIR/$CFG_DB_NAME.dic";
$CFG_RAWPHONEFILE = "$CFG_LIST_DIR/$CFG_DB_NAME.phone";
$CFG_FILLERDICT = "$CFG_LIST_DIR/$CFG_DB_NAME.filler";
$CFG_LISTOFFILES = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.fileids";
$CFG_TRANSCRIPTFILE = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.transcription";
$CFG_FEATPARAMS = "$CFG_LIST_DIR/feat.params";

# variables used in characterizing models

$CFG_HMM_TYPE = '.cont.'; # Sphinx III
#$CFG_HMM_TYPE = '.semi.'; # Sphinx II

if (($CFG_HMM_TYPE ne ".semi.") and ($CFG_HMM_TYPE ne ".cont.")) {
    die "Please choose one CFG_HMM_TYPE out of '.cont.' or '.semi.', " .
        "currently $CFG_HMM_TYPE\n";
}

if ($CFG_HMM_TYPE eq '.semi.') {
    $CFG_DIRLABEL = 'semi';
    $CFG_STATESPERHMM = 5;
    $CFG_SKIPSTATE = 'yes';
    # Four (4) stream features for Sphinx II
    $CFG_FEATURE = "s2_4x";
    $CFG_NUM_STREAMS = 4;
    $CFG_INITIAL_NUM_DENSITIES = 256;
    $CFG_FINAL_NUM_DENSITIES = 256;
    die "For semi continuous models, the initial and final models have the same density"
        if ($CFG_INITIAL_NUM_DENSITIES != $CFG_FINAL_NUM_DENSITIES);
} elsif ($CFG_HMM_TYPE eq '.cont.') {
    $CFG_DIRLABEL = 'cont';
    $CFG_STATESPERHMM = 3;
    $CFG_SKIPSTATE = 'no';
    # Single stream features - Sphinx 3
    $CFG_FEATURE = "1s_c_d_dd";
    $CFG_NUM_STREAMS = 1;
    $CFG_INITIAL_NUM_DENSITIES = 1;
    $CFG_FINAL_NUM_DENSITIES = 8;
    die "The initial has to be less than the final number of densities"
        if ($CFG_INITIAL_NUM_DENSITIES > $CFG_FINAL_NUM_DENSITIES);
}

# (yes/no) Train multiple-gaussian context-independent models (useful
# for alignment, use 'no' otherwise) in the models created
# specifically for forced alignment
$CFG_FALIGN_CI_MGAU = 'no';
# (yes/no) Train multiple-gaussian context-independent models (useful
# for alignment, use 'no' otherwise)
$CFG_CI_MGAU = 'no';
# Number of tied states (senones) to create in decision-tree clustering
$CFG_N_TIED_STATES = 1000;
# How many parts to run Forward-Backward estimation in
$CFG_NPART = 1;

# (yes/no) Train a single decision tree for all phones (actually one
# per state) (useful for grapheme-based models, use 'no' otherwise)
$CFG_CROSS_PHONE_TREES = 'no';

# Use force-aligned transcripts (if available) as input to training
$CFG_FORCEDALIGN = 'no';

# Use a specific set of models for force alignment. If not defined,
# context-independent models for the current experiment will be used.
$CFG_FORCE_ALIGN_MDEF = "$CFG_BASE_DIR/model_architecture/$CFG_EXPTNAME.falign_ci.mdef";
if ($CFG_FALIGN_CI_MGAU eq 'yes') {
    $CFG_FORCE_ALIGN_MODELDIR = "$CFG_MODEL_DIR/$CFG_EXPTNAME.falign_ci_${CFG_DIRLABEL}$CFG_FINAL_NUM_DENSITIES";
}
else {
    $CFG_FORCE_ALIGN_MODELDIR = "$CFG_MODEL_DIR/$CFG_EXPTNAME.falign_ci$CFG_DIRLABEL";
}

# Use a specific dictionary and filler dictionary for force alignment.
# If these are not defined, a dictionary and filler dictionary will be
# created from $CFG_DICTIONARY and $CFG_FILLERDICT, with noise words
# removed from the filler dictionary and added to the dictionary (this
# is because the force alignment is not very good at inserting them)
#$CFG_FORCE_ALIGN_DICTIONARY = "$ST::CFG_BASE_DIR/falignout$ST::CFG_EXPTNAME.falign.dict";;
#$CFG_FORCE_ALIGN_FILLERDICT = "$ST::CFG_BASE_DIR/falignout/$ST::CFG_EXPTNAME.falign.fdict";;

# Use a particular beam width for force alignment. The wider
# (i.e. smaller numerically) the beam, the fewer sentences will be
# rejected for bad alignment.
$CFG_FORCE_ALIGN_BEAM = 1e-60;

# Transformation file to use for LDA. LDA will be done if this is
# defined and the file exists. 03.lda_train will generate this for
# you (but you must run force alignment first)
$CFG_LDA_TRANSFORM = "${CFG_MODEL_DIR}/${CFG_EXPTNAME}.lda";
# Dimensionality of LDA output
$CFG_LDA_DIMENSION = 29;

# set convergence_ratio = 0.004
$CFG_CONVERGENCE_RATIO = 0.04;

# Queue::POSIX for multiple CPUs on a local machine
# Queue::PBS to use a PBS/TORQUE queue
$CFG_QUEUE_TYPE = "Queue";

# Name of queue to use for PBS/TORQUE
$CFG_QUEUE_NAME = "workq";

# (yes/no) Build questions for decision tree clustering automatically
$CFG_MAKE_QUESTS = "yes";
# If CFG_MAKE_QUESTS is yes, questions are written to this file.
# If CFG_MAKE_QUESTS is no, questions are read from this file.
$CFG_QUESTION_SET = "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.tree_questions";
#$CFG_QUESTION_SET = "${CFG_BASE_DIR}/linguistic_questions";

$CFG_CP_OPERATION = "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.cpmeanvar";

# This variable has to be defined, otherwise utils.pl will not load.
$CFG_DONE = 1;

return 1;

etc\time.dic:

COMPUTER COMPUTER

etc\time.filler:

<s> SIL
</s> SIL

etc\time.phone:

COMPUTER
SIL

etc\time_train.fileids:

computer

etc\time_train.transcription:

<s> COMPUTER </s> (computer)

Sorry if that looks intimidating. Most of it can probably be ignored. I just didn't know what would be useful. It's probably something wrong with my etc\ files. Does anyone know what the problem is or anything that I can try? Thanks.

Errors with SphinxTrain
