The SphinxTrain script in xxx/scripts_pl/03.makeuntiedmdef contains a bug that affects training with force-aligned transcripts. Every script that reads the .transcription file must be pointed at either your original transcript or the force-aligned one, depending on which kind of processing you're doing at the time. In the SphinxTrain scripts, this choice is conditioned on the variable $CFG_FORCEDALIGN.
The problem is that 03.makeuntiedmdef/make_untied_mdef.pl never checks $CFG_FORCEDALIGN, so it always uses the non-aligned transcript. You should add a few lines so that it picks the right file. Here's the diff of the corrected script against the original (lines marked "<" are the corrected version, lines marked ">" are the original):
94,101d91
< # Use proper transcript file, depending on $CFG_FORCEDALIGN
< if ( $CFG_FORCEDALIGN eq "no" ) {
< $transcriptfile = $CFG_TRANSCRIPTFILE;
< } else {
< $transcriptfile = "$CFG_BASE_DIR/generated/${CFG_EXPTNAME}.alignedtranscripts";
< }
< &ST_Log(" Using $transcriptfile\n");
<
103c93
< system ("$MAKE_MDEF -phnlstfn $CFG_RAWPHONEFILE -dictfn $CFG_DICTIONARY -fdictfn $CFG_FILLERDICT -lsnfn $transcriptfile -ountiedmdef $untiedmdef -n_state_pm $CFG_STATESPERHMM -maxtriphones 10000 2>$logfile");
---
> system ("$MAKE_MDEF -phnlstfn $CFG_RAWPHONEFILE -dictfn $CFG_DICTIONARY -fdictfn $CFG_FILLERDICT -lsnfn $CFG_TRANSCRIPTFILE -ountiedmdef $untiedmdef -n_state_pm $CFG_STATESPERHMM -maxtriphones 10000 2>$logfile");
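
To sanity-check the selection rule outside of SphinxTrain, it can be pulled into a small standalone Perl sketch. This is only an illustration of the logic in the diff above: the subroutine name select_transcript and the example paths are made up here and are not part of SphinxTrain.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of SphinxTrain): returns the transcript path
# that make_untied_mdef.pl should hand to the -lsnfn option.
sub select_transcript {
    my ($forcedalign, $transcriptfile, $base_dir, $exptname) = @_;

    # "no" means forced alignment is off, so use the original transcript.
    return $transcriptfile if $forcedalign eq "no";

    # Any other value selects the force-aligned transcript, using the same
    # path layout as the corrected script in the diff.
    return "$base_dir/generated/${exptname}.alignedtranscripts";
}

# Example values, invented for illustration only.
for my $mode ("no", "yes") {
    my $path = select_transcript(
        $mode,
        "/data/an4/etc/an4_train.transcription",
        "/data/an4",
        "an4",
    );
    printf "CFG_FORCEDALIGN=%s uses %s\n", $mode, $path;
}
```

Note that, as in the diff, any value of $CFG_FORCEDALIGN other than "no" selects the aligned transcript, so a typo in the config would quietly switch files.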