Anonymous - 2003-03-04

The SphinxTrain script in xxx/scripts_pl/03.makeuntiedmdef contains a bug that affects training with force-aligned transcripts.  Every script that uses the .transcription file must be pointed at either your original transcript or the force-aligned one, depending on which kind of processing you're doing at the time.  In the SphinxTrain scripts, this choice is conditioned on the variable $CFG_FORCEDALIGN.

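The problem is that the script 03.makeuntiedmdef/make_untied_mdef.pl fails to do this, so it always uses the non-aligned transcript ($CFG_TRANSCRIPTFILE), even when $CFG_FORCEDALIGN is set.  You should add a few lines so that it picks the transcript based on $CFG_FORCEDALIGN and passes that file to -lsnfn.  Here's the diff; note that the "<" lines are the fixed script and the ">" line is the original: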

94,101d91
< # Use proper transcript file, depending on $CFG_FORCEDALIGN
< if ( $CFG_FORCEDALIGN eq "no" ) {
<     $transcriptfile = $CFG_TRANSCRIPTFILE;
< } else {
<     $transcriptfile  = "$CFG_BASE_DIR/generated/${CFG_EXPTNAME}.alignedtranscripts";
< }
< &ST_Log("    Using $transcriptfile\n");
<
103c93
< system ("$MAKE_MDEF -phnlstfn $CFG_RAWPHONEFILE -dictfn $CFG_DICTIONARY -fdictfn $CFG_FILLERDICT -lsnfn $transcriptfile -ountiedmdef  $untiedmdef -n_state_pm  $CFG_STATESPERHMM -maxtriphones 10000 2>$logfile");
---
> system ("$MAKE_MDEF -phnlstfn $CFG_RAWPHONEFILE -dictfn $CFG_DICTIONARY -fdictfn $CFG_FILLERDICT -lsnfn $CFG_TRANSCRIPTFILE -ountiedmdef  $untiedmdef -n_state_pm  $CFG_STATESPERHMM -maxtriphones 10000 2>$logfile");