
force_align

saad
2011-08-18
2012-09-22
  • saad

    saad - 2011-08-18

    Hi,
    I am using the SphinxTrain nightly release. My sphinxtrain.cfg file is:

    # Configuration script for sphinx trainer -*-mode:Perl-*-

    $CFG_VERBOSE = 1; # Determines how much goes to the screen.

    # These are filled in at configuration time
    $CFG_DB_NAME = "an4";
    $CFG_BASE_DIR = "E:/Thesis/sphinx/sphinx/tutorial/an4";
    $CFG_SPHINXTRAIN_DIR = "E:/Thesis/sphinx/sphinx/tutorial/SphinxTrain";

    # Directory containing SphinxTrain binaries
    $CFG_BIN_DIR = "$CFG_BASE_DIR/bin";
    $CFG_GIF_DIR = "$CFG_BASE_DIR/gifs";
    $CFG_SCRIPT_DIR = "$CFG_BASE_DIR/scripts_pl";

    # Experiment name, will be used to name model files and log files
    $CFG_EXPTNAME = "$CFG_DB_NAME";

    # Audio waveform and feature file information
    $CFG_WAVFILES_DIR = "$CFG_BASE_DIR/wav";
    $CFG_WAVFILE_EXTENSION = 'wav';
    $CFG_WAVFILE_TYPE = 'mswav'; # one of nist, mswav, raw
    $CFG_FEATFILES_DIR = "$CFG_BASE_DIR/feat";
    $CFG_FEATFILE_EXTENSION = 'mfc';
    $CFG_VECTOR_LENGTH = 13;
    $CFG_FEAT_WINDOW=4;

    $CFG_MIN_ITERATIONS = 1; # BW Iterate at least this many times
    $CFG_MAX_ITERATIONS = 10; # BW Don't iterate more than this, somethings likely wrong.

    # (none/max) Type of AGC to apply to input files
    $CFG_AGC = 'none';

    # (current/none) Type of cepstral mean subtraction/normalization
    # to apply to input files
    $CFG_CMN = 'current';

    # (yes/no) Normalize variance of input files to 1.0
    $CFG_VARNORM = 'no';

    # (yes/no) Use letter-to-sound rules to guess pronunciations of
    # unknown words (English, 40-phone specific)
    $CFG_LTSOOV = 'no';

    # (yes/no) Train full covariance matrices
    $CFG_FULLVAR = 'no';

    # (yes/no) Use diagonals only of full covariance matrices for
    # Forward-Backward evaluation (recommended if CFG_FULLVAR is yes)
    $CFG_DIAGFULL = 'no';

    # (yes/no) Perform vocal tract length normalization in training. This
    # will result in a "normalized" model which requires VTLN to be done
    # during decoding as well.
    $CFG_VTLN = 'no';

    # Starting warp factor for VTLN
    $CFG_VTLN_START = 0.80;

    # Ending warp factor for VTLN
    $CFG_VTLN_END = 1.40;

    # Step size of warping factors
    $CFG_VTLN_STEP = 0.05;

    # Directory to write queue manager logs to
    $CFG_QMGR_DIR = "$CFG_BASE_DIR/qmanager";

    # Directory to write training logs to
    $CFG_LOG_DIR = "$CFG_BASE_DIR/logdir";

    # Directory for re-estimation counts
    $CFG_BWACCUM_DIR = "$CFG_BASE_DIR/bwaccumdir";

    # Directory to write model parameter files to
    $CFG_MODEL_DIR = "$CFG_BASE_DIR/model_parameters";

    # Directory containing transcripts and control files for
    # speaker-adaptive training
    $CFG_LIST_DIR = "$CFG_BASE_DIR/etc";

    # Variables used in main training of models
    $CFG_DICTIONARY = "$CFG_LIST_DIR/$CFG_DB_NAME.dic";
    $CFG_RAWPHONEFILE = "$CFG_LIST_DIR/$CFG_DB_NAME.phone";
    $CFG_FILLERDICT = "$CFG_LIST_DIR/$CFG_DB_NAME.filler";
    $CFG_LISTOFFILES = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.fileids";
    $CFG_TRANSCRIPTFILE = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.transcription";
    $CFG_FEATPARAMS = "$CFG_LIST_DIR/feat.params";

    # Variables used in characterizing models
    $CFG_HMM_TYPE = '.cont.'; # Sphinx III
    #$CFG_HMM_TYPE = '.semi.'; # PocketSphinx and Sphinx II

    if (($CFG_HMM_TYPE ne ".semi.") and ($CFG_HMM_TYPE ne ".cont.")) {
        die "Please choose one CFG_HMM_TYPE out of '.cont.' or '.semi.', " .
            "currently $CFG_HMM_TYPE\n";
    }

    # This configuration is fastest and best for most acoustic models in
    # PocketSphinx and Sphinx-III. See below for Sphinx-II.
    $CFG_STATESPERHMM = 3;
    $CFG_SKIPSTATE = 'no';

    if ($CFG_HMM_TYPE eq '.semi.') {
        $CFG_DIRLABEL = 'semi';
        # Four stream features for PocketSphinx
        $CFG_FEATURE = "s2_4x";
        $CFG_NUM_STREAMS = 4;
        $CFG_INITIAL_NUM_DENSITIES = 256;
        $CFG_FINAL_NUM_DENSITIES = 256;
        # If you wish to build models for Sphinx-II, uncomment these lines
        #$CFG_STATESPERHMM = 5;
        #$CFG_SKIPSTATE = 'yes';
        die "For semi continuous models, the initial and final models have the same density"
            if ($CFG_INITIAL_NUM_DENSITIES != $CFG_FINAL_NUM_DENSITIES);
    } elsif ($CFG_HMM_TYPE eq '.cont.') {
        $CFG_DIRLABEL = 'cont';
        # Single stream features - Sphinx 3
        $CFG_FEATURE = "1s_c";
        $CFG_NUM_STREAMS = 1;
        $CFG_INITIAL_NUM_DENSITIES = 1;
        $CFG_FINAL_NUM_DENSITIES = 8;
        die "The initial has to be less than the final number of densities"
            if ($CFG_INITIAL_NUM_DENSITIES > $CFG_FINAL_NUM_DENSITIES);
    }

    # (yes/no) Train multiple-gaussian context-independent models (useful
    # for alignment, use 'no' otherwise) in the models created
    # specifically for forced alignment
    $CFG_FALIGN_CI_MGAU = 'no';

    # (yes/no) Train multiple-gaussian context-independent models (useful
    # for alignment, use 'no' otherwise)
    $CFG_CI_MGAU = 'no';

    # Number of tied states (senones) to create in decision-tree clustering
    $CFG_N_TIED_STATES = 1000;

    # How many parts to run Forward-Backward estimation in
    $CFG_NPART = 1;

    # (yes/no) Train a single decision tree for all phones (actually one
    # per state) (useful for grapheme-based models, use 'no' otherwise)
    $CFG_CROSS_PHONE_TREES = 'no';

    # Use force-aligned transcripts (if available) as input to training
    $CFG_FORCEDALIGN = 'yes';

    # Use a specific set of models for force alignment. If not defined,
    # context-independent models for the current experiment will be used.
    $CFG_FORCE_ALIGN_MDEF =
        "$CFG_BASE_DIR/model_architecture/$CFG_EXPTNAME.falign_ci.mdef";
    if ($CFG_FALIGN_CI_MGAU eq 'yes') {
        $CFG_FORCE_ALIGN_MODELDIR =
            "$CFG_MODEL_DIR/$CFG_EXPTNAME.falign_ci_${CFG_DIRLABEL}$CFG_FINAL_NUM_DENSITIES";
    }
    else {
        $CFG_FORCE_ALIGN_MODELDIR =
            "$CFG_MODEL_DIR/$CFG_EXPTNAME.falign_ci_$CFG_DIRLABEL";
    }

    # Use a specific dictionary and filler dictionary for force alignment.
    # If these are not defined, a dictionary and filler dictionary will be
    # created from $CFG_DICTIONARY and $CFG_FILLERDICT, with noise words
    # removed from the filler dictionary and added to the dictionary (this
    # is because the force alignment is not very good at inserting them)
    #$CFG_FORCE_ALIGN_DICTIONARY = "$ST::CFG_BASE_DIR/falignout$ST::CFG_EXPTNAME.falign.dict";;
    #$CFG_FORCE_ALIGN_FILLERDICT = "$ST::CFG_BASE_DIR/falignout/$ST::CFG_EXPTNAME.falign.fdict";;

    # Use a particular beam width for force alignment. The wider
    # (i.e. smaller numerically) the beam, the fewer sentences will be
    # rejected for bad alignment.
    $CFG_FORCE_ALIGN_BEAM = 1e-60;

    # Calculate an LDA/MLLT transform?
    $CFG_LDA_MLLT = 'no';

    # Dimensionality of LDA/MLLT output
    $CFG_LDA_DIMENSION = 29;

    # set convergence_ratio = 0.004
    $CFG_CONVERGENCE_RATIO = 0.04;

    # Queue::POSIX for multiple CPUs on a local machine
    # Queue::PBS to use a PBS/TORQUE queue
    $CFG_QUEUE_TYPE = "Queue";

    # Name of queue to use for PBS/TORQUE
    $CFG_QUEUE_NAME = "workq";

    # (yes/no) Build questions for decision tree clustering automatically
    $CFG_MAKE_QUESTS = "yes";

    # If CFG_MAKE_QUESTS is yes, questions are written to this file.
    # If CFG_MAKE_QUESTS is no, questions are read from this file.
    $CFG_QUESTION_SET =
        "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.tree_questions";
    #$CFG_QUESTION_SET = "${CFG_BASE_DIR}/linguistic_questions";

    $CFG_CP_OPERATION =
        "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.cpmeanvar";

    # This variable has to be defined, otherwise utils.pl will not load.
    $CFG_DONE = 1;

    return 1;

    I ran the training command "perl scripts_pl\RunAll.pl"
    and got some errors. The log file is:
    INFO: info.c(70): E:\Thesis\sphinx\sphinx\tutorial\an4\bin\sphinx3_align.exe
    Compiled on: Nov 2 2010, AT: 19:46:57

    INFO: cmd_ln.c(510): Parsing command line:
    E:\Thesis\sphinx\sphinx\tutorial\an4\bin\sphinx3_align.exe \
    -mdef E:/Thesis/sphinx/sphinx/tutorial/an4/model_architecture/an4.falign_ci.mdef \
    -senmgau .cont. \
    -mixw E:/Thesis/sphinx/sphinx/tutorial/an4/model_parameters/an4.falign_ci_cont/mixture_weights \
    -mixwfloor 1e-008 \
    -tmat E:/Thesis/sphinx/sphinx/tutorial/an4/model_parameters/an4.falign_ci_cont/transition_matrices \
    -mean E:/Thesis/sphinx/sphinx/tutorial/an4/model_parameters/an4.falign_ci_cont/means \
    -var E:/Thesis/sphinx/sphinx/tutorial/an4/model_parameters/an4.falign_ci_cont/variances \
    -varfloor 0.0001 \
    -dict E:/Thesis/sphinx/sphinx/tutorial/an4/falignout/an4.falign.dict \
    -fdict E:/Thesis/sphinx/sphinx/tutorial/an4/falignout/an4.falign.fdict \
    -ctl E:/Thesis/sphinx/sphinx/tutorial/an4/etc/an4_train.fileids \
    -ctloffset 0 \
    -ctlcount 620 \
    -cepdir E:/Thesis/sphinx/sphinx/tutorial/an4/feat \
    -cepext .mfc \
    -insent E:/Thesis/sphinx/sphinx/tutorial/an4/falignout/an4.aligninput \
    -outsent E:/Thesis/sphinx/sphinx/tutorial/an4/falignout/an4.alignedtranscripts.1 \
    -beam 1e-060 \
    -agc none \
    -cmn current \
    -varnorm no \
    -feat E:/Thesis/sphinx/sphinx/tutorial/an4/model_parameters/an4.falign_ci_cont/etc/sphinx_train/CFG_VECTOR_LENGTH:E:/Thesis/sphinx/sphinx/tutorial/an4/model_parameters/an4.falign_ci_cont/etc/sphinx_train/CFG_FEAT_WINDOW \
    -ceplen 13 \
    -cepwin 4

    ERROR: "cmd_ln.c", line 563: Unknown argument name '-cepwin'
    ERROR: "cmd_ln.c", line 648: cmd_ln_parse_r failed
    ERROR: "cmd_ln.c", line 697: cmd_ln_parse failed, forced exit
    Thu Aug 18 06:04:49 2011

    Please help me fix this error.
    Thanks a lot

     
  • saad

    saad - 2011-08-18

    I found your comment:
    "2010-09-07 01:45:22 PKT
    Hello Marek. Sphinx3_align doesn't support cepwin features out of the box yet. You
    need to apply the cepwin patch to sphinxbase in order to read cepwin features."
    Please help me understand how to apply the cepwin patch and where to find
    it. Thanks a lot and best regards.

     
  • Nickolay V. Shmyrev

    In the sphinxtrain nightly, to train cepwin features you need to use:

    $CFG_FEATURE = "1s_3c";

    This feature type is supported by sphinx3_align too.
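
    As a hedged sketch of where that setting would go in the training configuration posted at the top of this thread: the lines below are the continuous-model ('.cont.') branch of that file with only $CFG_FEATURE changed. Placing the change in this branch and leaving the other values untouched are assumptions, not something confirmed in the reply above.

    ```perl
    # Continuous-model ('.cont.') branch of the training configuration.
    # Assumption: only the feature type changes; every other value is
    # copied unchanged from the configuration posted earlier.
    $CFG_DIRLABEL = 'cont';
    $CFG_FEATURE = "1s_3c";          # was "1s_c"
    $CFG_NUM_STREAMS = 1;
    $CFG_INITIAL_NUM_DENSITIES = 1;
    $CFG_FINAL_NUM_DENSITIES = 8;
    ```

    $CFG_FEAT_WINDOW stays as set near the top of that file; the thread does not say whether its value needs to change.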

     
  • saad

    saad - 2011-08-19

    Thanks a lot for your reply. After making the above change,
    I get the following error:

    E:\Thesis\sphinx\sphinx\tutorial\an4\bin\init_gau.exe \
    -ctlfn E:/Thesis/sphinx/sphinx/tutorial/an4/etc/an4_train.fileids \
    -part 1 \
    -npart 1 \
    -cepdir E:/Thesis/sphinx/sphinx/tutorial/an4/feat \
    -cepext mfc \
    -accumdir E:/Thesis/sphinx/sphinx/tutorial/an4/bwaccumdir/an4_buff_1 \
    -agc none \
    -cmn current \
    -varnorm no \
    -feat 1s_3c \
    -ceplen 13 \
    -cepwin 3

    zus
    -help no zus
    -example no zus
    -moddeffn zus
    -ts2cbfn zus
    -accumdir zus
    -meanfn zus
    -fullvar no zus
    -ctlfn zus
    -nskip zus
    -runlen zus
    -part zus
    -npart zus
    -lsnfn zus
    -dictfn zus
    -fdictfn zus
    -segdir zus
    -segext v8_seg zus
    -scaleseg no zus
    -cepdir zus
    -cepext mfc zus
    -silcomp none zus
    -cmn current zus
    -varnorm no zus
    -agc max zus
    -feat 1s_c_d_dd zus
    -svspec zus
    -ceplen 13 zus
    -cepwin 0 zus
    -ldafn zus
    -ldadim 29 zus
    ERROR: "........\src\libs\libcep_feat\feat.c", line 228: Unimplemented
    feature 1s_3c
    ERROR: "........\src\libs\libcep_feat\feat.c", line 229: Implemented
    features are:
    c/1..L-1/,d/1..L-1/,c/0/d/0/dd/0/,dd/1..L-1/
    c/1..L-1/d/1..L-1/c/0/d/0/dd/0/dd/1..L-1/
    c/0..L-1/d/0..L-1/dd/0..L-1/
    c/0..L-1/d/0..L-1/
    c/0..L-1/
    c/0..L-1/dd/0..L-1/
    d/0..L-1/
    dd/0..L-1/
    FATAL_ERROR: "........\src\libs\libcep_feat\feat.c", line 251: feat module
    must be configured w/ a valid ID
    Fri Aug 19 06:39:45 2011

    I have the nightly releases of sphinxbase, sphinxtrain and sphinx3. Please help.
    Thanks a lot and best regards

     
  • saad

    saad - 2011-08-19

    Should I change the version/release of sphinx3 and choose one in which 1s_3c
    features are implemented? Could you please tell me which release/version?
    Thanks and best regards

     
  • Nickolay V. Shmyrev

    Feature 1s_3c is supported in the latest version. It doesn't sound like you
    are using it.

    You can learn how to download the latest version on our website:

    http://cmusphinx.sourceforge.net/wiki/download

     
  • saad

    saad - 2011-08-22

    Thanks a lot for your help.
    I have downloaded and installed sphinx3-snapshot and sphinxbase-snapshot, which
    I think are the latest versions, but I am facing the same error. I would be
    very thankful if you could tell me whether I have selected the wrong version.

    Thanks and best regards

     
  • saad

    saad - 2011-08-22

    This is the sphinxdecode.cfg file; it has no option for the feature "1s_3c":

    # Configuration script for sphinx decoder -*-mode:Perl-*-

    # Variables starting with $DEC_CFG_ refer to decoder specific
    # arguments, those starting with $CFG_ refer to trainer arguments,
    # some of them also used by the decoder.

    $DEC_CFG_VERBOSE = 1; # Determines how much goes to the screen.

    # These are filled in at configuration time
    $DEC_CFG_DB_NAME = 'an4';
    $DEC_CFG_BASE_DIR = '/root/Desktop/sphinx/an4';
    $DEC_CFG_SPHINXDECODER_DIR = '/root/Desktop/sphinx/sphinx3';
    $DEC_CFG_SPHINXTRAIN_CFG = "$DEC_CFG_BASE_DIR/etc/sphinx_train.cfg";

    # Name of the decoding script to use (psdecode.pl or s3decode.pl, probably)
    $DEC_CFG_SCRIPT = 's3decode.pl';

    require $DEC_CFG_SPHINXTRAIN_CFG;

    $DEC_CFG_BIN_DIR = "$DEC_CFG_BASE_DIR/bin";
    $DEC_CFG_GIF_DIR = "$DEC_CFG_BASE_DIR/gifs";
    $DEC_CFG_SCRIPT_DIR = "$DEC_CFG_BASE_DIR/scripts_pl";

    $DEC_CFG_EXPTNAME = "$CFG_EXPTNAME";
    $DEC_CFG_JOBNAME = "$CFG_EXPTNAME"."_job";

    # Models to use.
    $DEC_CFG_MODEL_NAME = "$CFG_EXPTNAME.cd_${CFG_DIRLABEL}_${CFG_N_TIED_STATES}";

    $DEC_CFG_FEATFILES_DIR = "$DEC_CFG_BASE_DIR/feat";
    $DEC_CFG_FEATFILE_EXTENSION = '.mfc';
    $DEC_CFG_VECTOR_LENGTH = $CFG_VECTOR_LENGTH;
    $DEC_CFG_AGC = $CFG_AGC;
    $DEC_CFG_CMN = $CFG_CMN;
    $DEC_CFG_VARNORM = $CFG_VARNORM;

    $DEC_CFG_QMGR_DIR = "$DEC_CFG_BASE_DIR/qmanager";
    $DEC_CFG_LOG_DIR = "$DEC_CFG_BASE_DIR/logdir";
    $DEC_CFG_MODEL_DIR = "$CFG_MODEL_DIR";

    # Variables used in decoding of wave files
    $DEC_CFG_DICTIONARY = "$DEC_CFG_BASE_DIR/etc/$DEC_CFG_DB_NAME.dic";
    $DEC_CFG_FILLERDICT = "$DEC_CFG_BASE_DIR/etc/$DEC_CFG_DB_NAME.filler";
    $DEC_CFG_LISTOFFILES =
        "$DEC_CFG_BASE_DIR/etc/${DEC_CFG_DB_NAME}_test.fileids";
    $DEC_CFG_TRANSCRIPTFILE =
        "$DEC_CFG_BASE_DIR/etc/${DEC_CFG_DB_NAME}_test.transcription";
    $DEC_CFG_RESULT_DIR = "$DEC_CFG_BASE_DIR/result";

    # These variables, used by the decoder, have to be user defined, and
    # may affect the decoder output
    $DEC_CFG_LANGUAGEMODEL_DIR = "$DEC_CFG_BASE_DIR/etc";
    $DEC_CFG_LANGUAGEMODEL = "$DEC_CFG_LANGUAGEMODEL_DIR/an4.ug.lm.DMP";
    $DEC_CFG_LANGUAGEWEIGHT = "23";
    $DEC_CFG_BEAMWIDTH = "1e-120";
    $DEC_CFG_WORDBEAM = "1e-80";

    $DEC_CFG_ALIGN = "builtin";

    # Variables used in characterizing models
    $DEC_CFG_HMM_TYPE = $CFG_HMM_TYPE;

    if (($DEC_CFG_HMM_TYPE ne ".semi.") and ($DEC_CFG_HMM_TYPE ne ".cont.")) {
        die "Please choose one CFG_HMM_TYPE out of '.cont.' or '.semi.', " .
            "currently $DEC_CFG_HMM_TYPE\n";
    }

    # This comes directly from reading the code. The feature definitions
    # aren't represented exactly by the same string in the trainer and
    # the decoder. Therefore, we need to map between them.
    %feature_type = (
        'c/1..L-1/,d/1..L-1/,c/0/d/0/dd/0/,dd/1..L-1/' => 's2_4x',
        'c/1..L-1/d/1..L-1/c/0/d/0/dd/0/dd/1..L-1/' => 's3_1x39',
        'c/0..L-1/d/0..L-1/dd/0..L-1/' => '1s_c_d_dd',
        'c/0..L-1/d/0..L-1/' => 'cep_dcep',
        'c/0..L-1/' => 'cep',
        'c/0..L-1/dd/0..L-1/' => 'INVALID',
        '4s_12c_24d_3p_12dd' => 's2_4x',
        '1s_12c_12d_3p_12dd' => 's3_1x39',
        's2_4x' => 's2_4x',
        's3_1x39' => 's3_1x39',
        '1s_c_d_dd' => '1s_c_d_dd',
        '1s_c_d_ld_dd' => '1s_c_d_ld_dd',
        '1s_c_d' => 'cep_dcep',
        '1s_c' => 'cep',
        '1s_c_dd' => 'INVALID',
        '1s_d' => 'INVALID',
        '1s_dd' => 'INVALID',
    );

    $DEC_CFG_FEATURE = "INVALID"
        unless ((exists $feature_type{$CFG_FEATURE})
                and ($DEC_CFG_FEATURE = $feature_type{$CFG_FEATURE}));

    if ($DEC_CFG_FEATURE eq "INVALID") {
        die "Feature type used for training, $CFG_FEATURE, cannot be used for decoding.\n" .
            "Please use one of 1s_c_d_dd, 1s_c_d, 1s_c, s2_4x, s3_1x39, 1s_c_d_ld_dd\n";
    }
    $CFG_FEAT_WINDOW ||= 0;

    # Undocumented decoder magic since SphinxBase may not support -cepwin yet
    if ($CFG_FEAT_WINDOW) {
        $DEC_CFG_FEATURE = "$CFG_VECTOR_LENGTH:$CFG_FEAT_WINDOW";
    }

    $DEC_CFG_NPART = 1; # Define how many pieces to split decode in

    $DEC_CFG_OKAY_COLOR = '00D000';
    $DEC_CFG_WARNING_COLOR = '555500';
    $DEC_CFG_ERROR_COLOR = 'DD0000';

    return 1;
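
    The lookup near the end of this file can be reproduced in isolation to see why a training feature of "1s_3c" is rejected by it. The sketch below copies a subset of the %feature_type table as posted; map_feature is a hypothetical helper name introduced only for illustration, and whether the snapshot's own template handles "1s_3c" differently is not something this thread confirms.

    ```perl
    use strict;
    use warnings;

    # Subset of the %feature_type table from the decoder configuration above.
    my %feature_type = (
        's2_4x'        => 's2_4x',
        's3_1x39'      => 's3_1x39',
        '1s_c_d_dd'    => '1s_c_d_dd',
        '1s_c_d_ld_dd' => '1s_c_d_ld_dd',
        '1s_c_d'       => 'cep_dcep',
        '1s_c'         => 'cep',
        '1s_c_dd'      => 'INVALID',
    );

    # Hypothetical helper: mirrors the "exists ... and ..." test that sets
    # $DEC_CFG_FEATURE, returning 'INVALID' for any key missing from the table.
    sub map_feature {
        my ($train_feature) = @_;
        return 'INVALID' unless exists $feature_type{$train_feature};
        return $feature_type{$train_feature};
    }

    # Print the decoder feature name chosen for a few training feature names.
    for my $feat ('1s_c', '1s_c_d_dd', '1s_3c') {
        printf "%-10s => %s\n", $feat, map_feature($feat);
    }
    ```

    With this table, "1s_c" and "1s_c_d_dd" map to decoder names, while "1s_3c" has no entry and comes back as INVALID. In the file as posted, the die check runs before the $CFG_FEAT_WINDOW override, so a non-zero window does not bypass it.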

     
  • saad

    saad - 2011-08-22

    Thanks for your reply.
    Is the feature "1s_3c" implemented in sphinx3-snapshot? I downloaded it
    from the link you mentioned in the post above,
    and I used this command to set up sphinx3:
    perl scripts/setup_tutorial.pl an4

    Please help.
    Thanks a lot and best regards

     
  • Nickolay V. Shmyrev

    Is the feature "1s_3c" implemented in sphinx3-snapshot?

    Yes

     
  • saad

    saad - 2011-08-23

    Thanks a lot for your help. With your kind help I finally found the problem.
    Can I perform
    1-Force alignment
    2-LDA/MLLT transform
    3-VTLN
    in the sphinx-3 snapshot? I am now using sphinxtrain-snapshot as well, so all
    three packages are the latest ones.
    Thanks a lot
    Best regards

     
  • saad

    saad - 2011-08-23

    When I trained and decoded my data with the nightly release of sphinxtrain and
    sphinx-3, the error rate was 5.6%.
    When I trained and decoded my data with sphinxtrain-snapshot and the sphinx-3
    snapshot, the error rate increased drastically to 21.1%, and when I enabled
    force_align, the error rate increased to 87%.
    Should I try to fix the warnings during training? Can you refer me to a guide
    for better understanding of the force alignment parameters?
    Thanks a lot and best regards

     
  • Nickolay V. Shmyrev

    Can I perform
    1-Force alignment
    2-LDA/MLLT transform
    3-VTLN
    in sphinx-3 snapshot?

    I'm not sure what you mean by this question.

    should I try to fix the warnings during training?

    The issue is unlikely caused by warnings. Most likely it's some
    misconfiguration.

    Can you refer me to a guide for better understanding of the force alignment
    parameters?

    http://www.amazon.com/Spoken-Language-Processing-Algorithm-Development/dp/0130226165

     
