
Aligning tri3b with si284 training data on the WSJ


    aanchan - 2014-07-02

    Hello,

    I had another question related to the WSJ s5 recipe. I understand that training the tri2b (LDA+MLLT) system involves estimating an LDA transform, which is written to the file full.mat in the tri2b folder (generated by train_lda_mllt.sh).

    The tri3b recipe then picks up where tri2b left off, with a call to train_sat.sh, which aims to give an LDA+MLLT+SAT system. The problem I encounter next is with the call to align_fmllr.sh in run.sh:

    --Comment-- From 3b system, align all si284 data.

    steps/align_fmllr.sh --nj 20 --cmd "$train_cmd" data/train_si284 data/lang exp/tri3b exp/tri3b_ali_si284 || exit 1;

    align_fmllr.sh complains that full.mat does not exist in tri3b. It would not exist, since there was no direct call to train_lda_mllt.sh while training the models in the tri3b folder. Is that right? Is there a bug somewhere, or did I make a misstep in running the s5 recipe? Otherwise, should I just link to the full.mat from the tri2b folder?

     


      Daniel Povey - 2014-07-02

      You may be using an older version of the scripts. In the up-to-date trunk,
      train_sat.sh copies full.mat into its own directory (e.g. tri3b). Some
      older versions may not have done this. Either way, the warnings are
      harmless.
      Dan
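
For anyone on an older checkout who would rather have the file in place than ignore the warning, the copy step can be sketched as below. This is only a minimal sketch, not the code from train_sat.sh: the helper name ensure_full_mat is hypothetical, and the paths exp/tri2b and exp/tri3b are assumed from the directory names mentioned in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): copy full.mat from the LDA+MLLT
# directory into the SAT directory when the SAT directory lacks it.
ensure_full_mat() {
  local src_dir=$1 dst_dir=$2
  # Nothing to do if the destination already has the transform.
  if [ -f "$dst_dir/full.mat" ]; then
    echo "full.mat already present in $dst_dir"
    return 0
  fi
  # Refuse to continue if the source has no transform to copy.
  if [ ! -f "$src_dir/full.mat" ]; then
    echo "no full.mat in $src_dir; nothing to copy" >&2
    return 1
  fi
  cp "$src_dir/full.mat" "$dst_dir/full.mat"
  echo "copied full.mat from $src_dir to $dst_dir"
}

# Example call (paths are assumptions based on the s5 recipe layout):
# ensure_full_mat exp/tri2b exp/tri3b
```

A copy is used here rather than a symlink so that the tri3b directory stays self-contained if tri2b is later moved or deleted.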




       