An error after Pre-training finished

Developers
Bruce Lee
2015-07-15
2015-07-15
  • Bruce Lee

    Bruce Lee - 2015-07-15

    Hello everyone, I am new to Kaldi. While using Kaldi on my own corpus (Chinese audio), I encountered a problem. I hope someone can help me solve it.

    I reused most of the shell scripts in egs/timit/s5 and modified some files to make them work with my own data. Everything ran without problems until I trained the DNN on a GPU. I use local/nnet/run_dnn.sh to build the DNN model. The only change I made there was to the --nj values of the three steps/nnet/make_fmllr_feats.sh calls (--nj 2, --nj 4 and --nj 44), to fit my corpus. The error was reported right after pre-training of 5 layers of the DNN model finished.

    The following is the console output:

    Pre-training finished.
    Removing features tmpdir /tmp/tmp.bIzfDxCTVc @ lijian-cca
    train.ark

    Accounting: time=17989 threads=1

    Ended (code 0) at Wed Jul 15 13:39:38 CST 2015, elapsed time 17989 seconds

    steps/nnet/train.sh --feature-transform exp/dnn4_pretrain-dbn/final.feature_transform --dbn exp/dnn4_pretrain-dbn/6.dbn --hid-layers 0 --learn-rate 0.008 data-fmllr-tri3/train_tr90 data-fmllr-tri3/train_cv10 data/lang exp/tri3_ali exp/tri3_ali exp/dnn4_pretrain-dbn_dnn

    Started at Wed Jul 15 13:39:38 CST 2015

    steps/nnet/train.sh --feature-transform exp/dnn4_pretrain-dbn/final.feature_transform --dbn exp/dnn4_pretrain-dbn/6.dbn --hid-layers 0 --learn-rate 0.008 data-fmllr-tri3/train_tr90 data-fmllr-tri3/train_cv10 data/lang exp/tri3_ali exp/tri3_ali exp/dnn4_pretrain-dbn_dnn

    INFO

    steps/nnet/train.sh : Training Neural Network
    dir : exp/dnn4_pretrain-dbn_dnn
    Train-set : data-fmllr-tri3/train_tr90 exp/tri3_ali
    CV-set : data-fmllr-tri3/train_cv10 exp/tri3_ali

    IS CUDA GPU AVAILABLE? 'lijian-cca'

    LOG (SelectGpuIdAuto():cu-device.cc:280) Selecting from 1 GPUs
    LOG (SelectGpuIdAuto():cu-device.cc:295) cudaSetDevice(0): Quadro K2200 free:3665M, used:429M, total:4095M, free/total:0.895065
    LOG (SelectGpuIdAuto():cu-device.cc:344) Trying to select device: 0 (automatically), mem_ratio: 0.895065
    LOG (SelectGpuIdAuto():cu-device.cc:363) Success selecting device 0 free mem ratio: 0.895065
    LOG (FinalizeActiveGpu():cu-device.cc:202) The active GPU is [0]: Quadro K2200 free:3649M, used:445M, total:4095M, free/total:0.891158 version 5.0
    LOG (PrintMemoryUsage():cu-device.cc:379) Memory used: 0 bytes.

    HURRAY, WE GOT A CUDA GPU FOR COMPUTATION!!!

    PREPARING ALIGNMENTS

    Using PDF targets from dirs 'exp/tri3_ali' 'exp/tri3_ali'
    copy-transition-model --binary=false exp/tri3_ali/final.mdl exp/dnn4_pretrain-dbn_dnn/final.mdl
    LOG (copy-transition-model:main():copy-transition-model.cc:62) Copied transition model.

    PREPARING FEATURES

    Preparing train/cv lists :
    4680 exp/dnn4_pretrain-dbn_dnn/train.scp
    600 exp/dnn4_pretrain-dbn_dnn/cv.scp
    5280 total
    copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp_non_local ark,scp:/tmp/tmp.fhJQQiJXhs/train.ark,exp/dnn4_pretrain-dbn_dnn/train.scp
    LOG (copy-feats:main():copy-feats.cc:100) Copied 4680 feature matrices.
    copy-feats scp:exp/dnn4_pretrain-dbn_dnn/cv.scp_non_local ark,scp:/tmp/tmp.fhJQQiJXhs/cv.ark,exp/dnn4_pretrain-dbn_dnn/cv.scp
    LOG (copy-feats:main():copy-feats.cc:100) Copied 600 feature matrices.
    Imported config : cmvn_opts='' delta_opts=''
    apply-cmvn is not used
    Getting feature dim :
    copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:-
    WARNING (feat-to-dim:Close():kaldi-io.cc:446) Pipe copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:- | had nonzero return status 13
    Feature dim is : 40
    Using pre-computed feature-transform : 'exp/dnn4_pretrain-dbn/final.feature_transform'

    NN-INITIALIZATION

    Getting input/output dims :
    feat-to-dim 'ark:copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:- | nnet-forward exp/dnn4_pretrain-dbn_dnn/final.feature_transform ark:- ark:- |' -
    copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:-
    nnet-forward exp/dnn4_pretrain-dbn_dnn/final.feature_transform ark:- ark:-
    LOG (nnet-forward:SelectGpuId():cu-device.cc:83) Manually selected to compute on CPU.
    WARNING (feat-to-dim:Close():kaldi-io.cc:446) Pipe copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:- | nnet-forward exp/dnn4_pretrain-dbn_dnn/final.feature_transform ark:- ark:- | had nonzero return status 36096
    nnet-forward 'nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform exp/dnn4_pretrain-dbn/6.dbn -|' 'ark:copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:- |' ark:-
    LOG (nnet-forward:SelectGpuId():cu-device.cc:83) Manually selected to compute on CPU.
    feat-to-dim ark:- -
    run.pl: job failed, log is in exp/dnn4_pretrain-dbn_dnn/log/train_nnet.log
    lijian@lijian-cca:~/kaldi-trunk/egs/zhongwen2$ nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform exp/dnn4_pretrain-dbn/6.dbn -
    LOG (nnet-concat:main():nnet-concat.cc:53) Reading exp/dnn4_pretrain-dbn_dnn/final.feature_transform
    LOG (nnet-concat:main():nnet-concat.cc:65) Concatenating exp/dnn4_pretrain-dbn/6.dbn
    ERROR (nnet-concat:Input():kaldi-io.cc:672) Error opening input stream exp/dnn4_pretrain-dbn/6.dbn
    ERROR (nnet-concat:Input():kaldi-io.cc:672) Error opening input stream exp/dnn4_pretrain-dbn/6.dbn

    [stack trace: ]
    kaldi::KaldiGetStackTrace()
    kaldi::KaldiErrorMessage::~KaldiErrorMessage()
    kaldi::Input::Input(std::string const&, bool*)
    nnet-concat(main+0x2ef) [0x4899a1]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f404fd7eec5]
    nnet-concat() [0x489619]

    WARNING (nnet-forward:Close():kaldi-io.cc:446) Pipe nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform exp/dnn4_pretrain-dbn/6.dbn -| had nonzero return status 65280
    WARNING (nnet-forward:Read():nnet-nnet.cc:396) The network 'nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform exp/dnn4_pretrain-dbn/6.dbn -|' is empty.
    KALDI_ASSERT: at nnet-forward:GetComponent:nnet-nnet.cc:167, failed: static_cast<size_t>(component) < components_.size()
    Stack trace is:
    kaldi::KaldiGetStackTrace()
    kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
    kaldi::nnet1::Nnet::GetComponent(int)
    nnet-forward(main+0x549) [0x491c6b]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f79c2687ec5]
    nnet-forward() [0x491689]
    KALDI_ASSERT: at nnet-forward:GetComponent:nnet-nnet.cc:167, failed: static_cast<size_t>(component) < components_.size()
    Stack trace is:
    kaldi::KaldiGetStackTrace()
    kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
    kaldi::nnet1::Nnet::GetComponent(int)
    nnet-forward(main+0x549) [0x491c6b]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f79c2687ec5]
    nnet-forward() [0x491689]
    ERROR (feat-to-dim:main():feat-to-dim.cc:58) Could not read any features (empty archive?)
    ERROR (feat-to-dim:main():feat-to-dim.cc:58) Could not read any features (empty archive?)

    [stack trace: ]
    kaldi::KaldiGetStackTrace()
    kaldi::KaldiErrorMessage::~KaldiErrorMessage()
    feat-to-dim(main+0x1ea) [0x44fc2c]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fe7d6931ec5]
    feat-to-dim() [0x44f9a9]

    Getting nnet input dimension failed!!
    Removing features tmpdir /tmp/tmp.fhJQQiJXhs @ lijian-cca
    cv.ark
    train.ark

    Accounting: time=5 threads=1

    Ended (code 1) at Wed Jul 15 13:39:43 CST 2015, elapsed time 5 seconds

    Can anyone tell me how to handle this? Thanks a lot and hope for someone's reply.

     
    • Karel Vesely

      Karel Vesely - 2015-07-15

      Hi,
      the problem is that you pre-trained only 5 layers, while the DNN
      training script is trying to read 6 pre-trained layers from file:
      'nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform
      exp/dnn4_pretrain-dbn/6.dbn'

      You need to select a '*.dbn' file which exists.

      Best!
      Karel.
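
      The advice above (point --dbn at a '*.dbn' file that actually exists) could be automated along these lines. This is only a sketch: the function name latest_dbn is hypothetical and not part of Kaldi, and it assumes the pre-training directory holds files named <N>.dbn, as the path exp/dnn4_pretrain-dbn/6.dbn in the log suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): print the path of the
# highest-numbered <N>.dbn file in a directory, or fail if there is none.
latest_dbn() {
  local dir=$1 best=-1 f n
  for f in "$dir"/*.dbn; do
    [ -e "$f" ] || continue            # the glob matched nothing
    n=$(basename "$f" .dbn)
    case $n in
      ''|*[!0-9]*) continue ;;         # skip names that are not plain integers
    esac
    if [ "$n" -gt "$best" ]; then
      best=$n
    fi
  done
  if [ "$best" -lt 0 ]; then
    echo "no <N>.dbn file found in $dir" >&2
    return 1
  fi
  echo "$dir/$best.dbn"
}
```

      It might be used as dbn=$(latest_dbn exp/dnn4_pretrain-dbn) || exit 1, with "$dbn" then passed to the --dbn option of steps/nnet/train.sh shown in the log. Comparing numerically rather than sorting file names keeps 10.dbn ahead of 9.dbn.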

       
  • Bruce Lee

    Bruce Lee - 2015-07-15

    The first error is "ERROR (nnet-concat:Input():kaldi-io.cc:672) Error opening input stream exp/dnn4_pretrain-dbn/6.dbn". I checked the directory exp/dnn4_pretrain-dbn/ and found that there is no 6.dbn file in it. The pre-training step never pre-trained a sixth layer.

    How can I fix this?

     
  • Bruce Lee

    Bruce Lee - 2015-07-15

    Thanks a lot, I have found the cause. I had earlier changed the configuration to "nn_depth=5", so only 5 layers were pre-trained.
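
    One way to keep the two settings from drifting apart again is to define the depth once and build the DBN path from it. The following is a minimal sketch, not the actual contents of local/nnet/run_dnn.sh: the variable names dbn_dir and dbn and the function check_dbn are assumptions made for illustration, while nn_depth=5 and the directory name come from this thread.

```shell
#!/usr/bin/env bash
# Sketch: define the depth once and derive the DBN path from it,
# so the path can never name a layer that was not pre-trained.
nn_depth=5                        # number of pre-trained layers (value from this thread)
dbn_dir=exp/dnn4_pretrain-dbn     # pre-training output directory seen in the log
dbn="$dbn_dir/$nn_depth.dbn"      # expands to exp/dnn4_pretrain-dbn/5.dbn

# Hypothetical guard: fail early with a clear message if the file is missing.
check_dbn() {
  if [ ! -f "$1" ]; then
    echo "missing DBN file: $1 (does nn_depth match the pre-training run?)" >&2
    return 1
  fi
}
```

    Calling check_dbn "$dbn" before steps/nnet/train.sh, and passing "$dbn" to its --dbn option, would report the mismatch directly instead of surfacing it as an nnet-concat input error.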