CMU Sphinx / Forums / Help: kmeans.c Empty cluster

I need to create a simple model of 10 words. When I run the script
scripts_pl/RunAll.pl
I get this output to the screen:

MODULE: 00 verify training files
O.S. is case sensitive ("A" != "a").
Phones will be treated as case sensitive.
    Phase 1: DICT - Checking to see if the dict and filler dict agrees with the phonelist file.
        Found 6 words using 10 phones
    Phase 2: DICT - Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: CTL - Check general format; utterance length (must be positive); files exist
    Phase 4: CTL - Checking number of lines in the transcript should match lines in control file
    Phase 5: CTL - Determine amount of training data, see if n_tied_states seems reasonable.
        Estimated Total Hours Training: 0.000875
        This is a small amount of data, no comment at this time
    Phase 6: TRANSCRIPT - Checking that all the words in the transcript are in the dictionary
        Words in dictionary: 3
        Words in filler dictionary: 3
    Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
Feature type is s2_4x which is 4 streams
LDA/MLLT only has sense for single stream features, for example 1s_c_d_dd
Skipping LDA training
Feature type is s2_4x which is 4 streams
LDA/MLLT only has sense for single stream features, for example 1s_c_d_dd
Skipping MLLT training
MODULE: 05 Vector Quantization
This step had 2 ERROR messages and 8044 WARNING messages.  Please check the log file for details.
MODULE: 10 Training Context Independent models for forced alignment and VTLN
Skipped:  $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
Skipped:  $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 11 Force-aligning transcripts
Skipped:  $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
MODULE: 12 Force-aligning data for VTLN
Skipped:  $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 20 Training Context Independent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
        Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
        0% 100% 
This step had 18 ERROR messages and 27 WARNING messages.  Please check the log file for details.
Training failed in iteration 1
Something failed: (/home/zentarim/voice/scripts_pl/20.ci_hmm/slave_convg.pl)





zentarim@Sphinx:~/voice$ cd logdir/
zentarim@Sphinx:~/voice/logdir$ ls
05.vector_quantize  20.ci_hmm
zentarim@Sphinx:~/voice/logdir$ cd 05.vector_quantize/
zentarim@Sphinx:~/voice/logdir/05.vector_quantize$ ls
voice.kmeans.log  voice.vq.agg_seg.log

File voice.kmeans.log contains :

...
INFO: feat.c(684): Initializing feature stream to type: 's2_4x', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: main.c(520): No mdef files.  Assuming 1-class init
INFO: main.c(1352): 1-class dump file
INFO: main.c(1390): Corpus 0: sz==315 frames
INFO: main.c(1399): Convergence ratios are abs(cur - prior) / abs(prior)
INFO: main.c(231): alloc'ing 0Mb obs buf
INFO: main.c(577): Initializing means using random k-means
INFO: main.c(580): Trial 0: 256 means
INFO: kmeans.c(153): km iter [0] 1.000000e+00 ...
WARNING: "kmeans.c", line 431: Empty cluster 1
WARNING: "kmeans.c", line 431: Empty cluster 35
WARNING: "kmeans.c", line 431: Empty cluster 54
WARNING: "kmeans.c", line 431: Empty cluster 59
...
INFO: main.c(613):  -> Aborting k-means, bad initialization
INFO: kmeans.c(153): km iter [0] 1.000000e+00 ...
...

WARNING: "kmeans.c", line 431: Empty cluster 250
WARNING: "kmeans.c", line 431: Empty cluster 251
WARNING: "kmeans.c", line 431: Empty cluster 253
INFO: main.c(613):  -> Aborting k-means, bad initialization
INFO: main.c(622):  best-so-far sqerr = -1.000000e+00
ERROR: "main.c", line 841: Too few observations for kmeans
ERROR: "main.c", line 1408: Unable to do k-means for state 0; skipping...
INFO: s3gau_io.c(226): Wrote /home/zentarim/voice/model_parameters/voice.ci_semi_flatinitial/means [1x4x256 array]
INFO: s3gau_io.c(226): Wrote /home/zentarim/voice/model_parameters/voice.ci_semi_flatinitial/variances [1x4x256 array]
INFO: main.c(1509): No mixing weight file given; none written
INFO: main.c(1669): TOTALS: km 0.046x 1.692e+00 var 0.000x 0.000e+00 em 0.000x 0.000e+00 all 0.047x 1.653e+00

My etc/sphinx_train.cfg contains:

# Configuration script for sphinx trainer                  -*-mode:Perl-*-

$CFG_VERBOSE = 1;       # Determines how much goes to the screen.

# These are filled in at configuration time
$CFG_DB_NAME = "voice";
$CFG_BASE_DIR = "/home/zentarim/voice";
$CFG_SPHINXTRAIN_DIR = "/home/zentarim/sphinxtrain-1.0.7";

# Directory containing SphinxTrain binaries
$CFG_BIN_DIR = "$CFG_BASE_DIR/bin";
$CFG_GIF_DIR = "$CFG_BASE_DIR/gifs";
$CFG_SCRIPT_DIR = "$CFG_BASE_DIR/scripts_pl";

# Experiment name, will be used to name model files and log files
$CFG_EXPTNAME = "$CFG_DB_NAME";

# Audio waveform and feature file information
$CFG_WAVFILES_DIR = "$CFG_BASE_DIR/wav";
$CFG_WAVFILE_EXTENSION = 'wav';
$CFG_WAVFILE_TYPE = 'mswav'; # one of nist, mswav, raw
$CFG_FEATFILES_DIR = "$CFG_BASE_DIR/feat";
$CFG_FEATFILE_EXTENSION = 'mfc';
$CFG_VECTOR_LENGTH = 13;

$CFG_MIN_ITERATIONS = 1;  # BW Iterate at least this many times
$CFG_MAX_ITERATIONS = 10; # BW Don't iterate more than this, somethings likely wrong.

# (none/max) Type of AGC to apply to input files
$CFG_AGC = 'none';
# (current/none) Type of cepstral mean subtraction/normalization
# to apply to input files
$CFG_CMN = 'current';
# (yes/no) Normalize variance of input files to 1.0
$CFG_VARNORM = 'no';
# (yes/no) Use letter-to-sound rules to guess pronunciations of
# unknown words (English, 40-phone specific)
$CFG_LTSOOV = 'no';
# (yes/no) Train full covariance matrices
$CFG_FULLVAR = 'no';
# (yes/no) Use diagonals only of full covariance matrices for
# Forward-Backward evaluation (recommended if CFG_FULLVAR is yes)
$CFG_DIAGFULL = 'no';

# (yes/no) Perform vocal tract length normalization in training.  This
# will result in a "normalized" model which requires VTLN to be done
# during decoding as well.
$CFG_VTLN = 'no';
# Starting warp factor for VTLN
$CFG_VTLN_START = 0.80;
# Ending warp factor for VTLN
$CFG_VTLN_END = 1.40;
# Step size of warping factors
$CFG_VTLN_STEP = 0.05;

# Directory to write queue manager logs to
$CFG_QMGR_DIR = "$CFG_BASE_DIR/qmanager";
# Directory to write training logs to
$CFG_LOG_DIR = "$CFG_BASE_DIR/logdir";
# Directory for re-estimation counts
$CFG_BWACCUM_DIR = "$CFG_BASE_DIR/bwaccumdir";
# Directory to write model parameter files to
$CFG_MODEL_DIR = "$CFG_BASE_DIR/model_parameters";

# Directory containing transcripts and control files for
# speaker-adaptive training
$CFG_LIST_DIR = "$CFG_BASE_DIR/etc";

# Decoding variables for MMIE training
$CFG_LANGUAGEWEIGHT = "11.5";
$CFG_BEAMWIDTH      = "1e-100";
$CFG_WORDBEAM       = "1e-80";
$CFG_LANGUAGEMODEL  = "$CFG_LIST_DIR/$CFG_DB_NAME.lm.DMP";
$CFG_WORDPENALTY    = "0.2";

# Lattice pruning variables
$CFG_ABEAM              = "1e-50";
$CFG_NBEAM              = "1e-10";
$CFG_PRUNED_DENLAT_DIR  = "$CFG_BASE_DIR/pruned_denlat";

# MMIE training related variables
$CFG_MMIE = "no";
$CFG_MMIE_MAX_ITERATIONS = 5;
$CFG_LATTICE_DIR = "$CFG_BASE_DIR/lattice";
$CFG_MMIE_TYPE   = "rand"; # Valid values are "rand", "best" or "ci"
$CFG_MMIE_CONSTE = "3.0";
$CFG_NUMLAT_DIR  = "$CFG_BASE_DIR/numlat";
$CFG_DENLAT_DIR  = "$CFG_BASE_DIR/denlat";

# Variables used in main training of models
$CFG_DICTIONARY     = "$CFG_LIST_DIR/$CFG_DB_NAME.dic";
$CFG_RAWPHONEFILE   = "$CFG_LIST_DIR/$CFG_DB_NAME.phone";
$CFG_FILLERDICT     = "$CFG_LIST_DIR/$CFG_DB_NAME.filler";
$CFG_LISTOFFILES    = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.fileids";
$CFG_TRANSCRIPTFILE = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.transcription";
$CFG_FEATPARAMS     = "$CFG_LIST_DIR/feat.params";

# Variables used in characterizing models

#$CFG_HMM_TYPE = '.cont.'; # Sphinx III
$CFG_HMM_TYPE  = '.semi.'; # PocketSphinx and Sphinx II
#$CFG_HMM_TYPE  = '.ptm.'; # PocketSphinx (larger data sets)

if (($CFG_HMM_TYPE ne ".semi.")
    and ($CFG_HMM_TYPE ne ".ptm.")
    and ($CFG_HMM_TYPE ne ".cont.")) {
  die "Please choose one CFG_HMM_TYPE out of '.cont.', '.ptm.', or '.semi.', " .
    "currently $CFG_HMM_TYPE\n";
}

# This configuration is fastest and best for most acoustic models in
# PocketSphinx and Sphinx-III.  See below for Sphinx-II.
$CFG_STATESPERHMM = 3;
$CFG_SKIPSTATE = 'no';

if ($CFG_HMM_TYPE eq '.semi.') {
  $CFG_DIRLABEL = 'semi';
# Four stream features for PocketSphinx
  $CFG_FEATURE = "s2_4x";
  $CFG_NUM_STREAMS = 4;
  $CFG_INITIAL_NUM_DENSITIES = 256;
  $CFG_FINAL_NUM_DENSITIES = 256;
  die "For semi continuous models, the initial and final models have the same density" 
    if ($CFG_INITIAL_NUM_DENSITIES != $CFG_FINAL_NUM_DENSITIES);
} elsif ($CFG_HMM_TYPE eq '.ptm.') {
  $CFG_DIRLABEL = 'ptm';
# Four stream features for PocketSphinx
  $CFG_FEATURE = "s2_4x";
  $CFG_NUM_STREAMS = 4;
  $CFG_INITIAL_NUM_DENSITIES = 64;
  $CFG_FINAL_NUM_DENSITIES = 64;
  die "For phonetically tied models, the initial and final models have the same density" 
    if ($CFG_INITIAL_NUM_DENSITIES != $CFG_FINAL_NUM_DENSITIES);
} elsif ($CFG_HMM_TYPE eq '.cont.') {
  $CFG_DIRLABEL = 'cont';
# Single stream features - Sphinx 3
  $CFG_FEATURE = "1s_c_d_dd";
  $CFG_NUM_STREAMS = 1;
  $CFG_INITIAL_NUM_DENSITIES = 1;
  $CFG_FINAL_NUM_DENSITIES = 8;
  die "The initial has to be less than the final number of densities" 
    if ($CFG_INITIAL_NUM_DENSITIES > $CFG_FINAL_NUM_DENSITIES);
}

# Number of top gaussians to score a frame. A little bit less accurate computations
# make training significantly faster. Uncomment to apply this during the training
# For good accuracy make sure you are using the same setting in decoder
# In theory this can be different for various training stages. For example 4 for
# CI stage and 16 for CD stage
# $CFG_CI_NTOP = 4;
# $CFG_CD_NTOP = 16;

# (yes/no) Train multiple-gaussian context-independent models (useful
# for alignment, use 'no' otherwise) in the models created
# specifically for forced alignment
$CFG_FALIGN_CI_MGAU = 'no';
# (yes/no) Train multiple-gaussian context-independent models (useful
# for alignment, use 'no' otherwise)
$CFG_CI_MGAU = 'no';
# Number of tied states (senones) to create in decision-tree clustering
$CFG_N_TIED_STATES = 200;
# How many parts to run Forward-Backward estimatinon in
$CFG_NPART = 1;

# (yes/no) Train a single decision tree for all phones (actually one
# per state) (useful for grapheme-based models, use 'no' otherwise)
$CFG_CROSS_PHONE_TREES = 'no';

# Use force-aligned transcripts (if available) as input to training
$CFG_FORCEDALIGN = 'no';

# Use a specific set of models for force alignment.  If not defined,
# context-independent models for the current experiment will be used.
$CFG_FORCE_ALIGN_MDEF = "$CFG_BASE_DIR/model_architecture/$CFG_EXPTNAME.falign_ci.mdef";
$CFG_FORCE_ALIGN_MODELDIR = "$CFG_MODEL_DIR/$CFG_EXPTNAME.falign_ci_$CFG_DIRLABEL";

# Use a specific dictionary and filler dictionary for force alignment.
# If these are not defined, a dictionary and filler dictionary will be
# created from $CFG_DICTIONARY and $CFG_FILLERDICT, with noise words
# removed from the filler dictionary and added to the dictionary (this
# is because the force alignment is not very good at inserting them)

# $CFG_FORCE_ALIGN_DICTIONARY = "$ST::CFG_BASE_DIR/falignout$ST::CFG_EXPTNAME.falign.dict";;
# $CFG_FORCE_ALIGN_FILLERDICT = "$ST::CFG_BASE_DIR/falignout/$ST::CFG_EXPTNAME.falign.fdict";;

# Use a particular beam width for force alignment.  The wider
# (i.e. smaller numerically) the beam, the fewer sentences will be
# rejected for bad alignment.
$CFG_FORCE_ALIGN_BEAM = 1e-60;

# Calculate an LDA/MLLT transform?
$CFG_LDA_MLLT = 'no';
# Dimensionality of LDA/MLLT output
$CFG_LDA_DIMENSION = 29;

# This is actually just a difference in log space (it doesn't make
# sense otherwise, because different feature parameters have very
# different likelihoods)
$CFG_CONVERGENCE_RATIO = 0.1;

# Queue::POSIX for multiple CPUs on a local machine
# Queue::PBS to use a PBS/TORQUE queue
$CFG_QUEUE_TYPE = "Queue";

# Name of queue to use for PBS/TORQUE
$CFG_QUEUE_NAME = "workq";

# (yes/no) Build questions for decision tree clustering automatically
$CFG_MAKE_QUESTS = "yes";
# If CFG_MAKE_QUESTS is yes, questions are written to this file.
# If CFG_MAKE_QUESTS is no, questions are read from this file.
$CFG_QUESTION_SET = "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.tree_questions";
#$CFG_QUESTION_SET = "${CFG_BASE_DIR}/linguistic_questions";

$CFG_CP_OPERATION = "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.cpmeanvar";

# This variable has to be defined, otherwise utils.pl will not load.
$CFG_DONE = 1;

return 1;

my etc/feat.params contains:

-samprate 8000.0
-nfilt 31
-lowerf 200.00
-upperf 3500.00
-dither yes

the wav files have parameters:
8kHz,16 Bit, Mono
duration of each record about 0.5 sec (one word)

Where I was wrong? I can provide more information.
Thanks in advance for your answer.

kmeans.c Empty cluster

Speech Recognition Toolkit

Forums

Help

kmeans.c Empty cluster document.SUBSCRIPTION_OPTIONS = { "thing": "topic", "subscribed": false, "url": "subscribe", "icon": { "css": "fa fa-envelope-o" } };

kmeans.c Empty cluster