Kaldi / Discussion / Developers: An error after Pre-training finished

Bruce Lee - 2015-07-15

Hello everyone, I am a fisher. While using Kaldi on my own corpus (Chinese audio), I encountered some problems. I hope someone can help me solve them.

I use most of the shell scripts in egs/timit/s5, with some files modified to work with my own data. Everything ran without problems until I trained the DNN on the GPU. I use local/nnet/run_dnn.sh to build the DNN model. The only changes I made there were to the job counts, according to my corpus: steps/nnet/make_fmllr_feats.sh --nj 2, steps/nnet/make_fmllr_feats.sh --nj 4, steps/nnet/make_fmllr_feats.sh --nj 44. An error was reported right after pre-training of 5 layers of the DNN model finished.

The following is the command-line output:

Pre-training finished.
Removing features tmpdir /tmp/tmp.bIzfDxCTVc @ lijian-cca
train.ark

Accounting: time=17989 threads=1

Ended (code 0) at Wed Jul 15 13:39:38 CST 2015, elapsed time 17989 seconds

steps/nnet/train.sh --feature-transform exp/dnn4_pretrain-dbn/final.feature_transform --dbn exp/dnn4_pretrain-dbn/6.dbn --hid-layers 0 --learn-rate 0.008 data-fmllr-tri3/train_tr90 data-fmllr-tri3/train_cv10 data/lang exp/tri3_ali exp/tri3_ali exp/dnn4_pretrain-dbn_dnn

Started at Wed Jul 15 13:39:38 CST 2015

steps/nnet/train.sh --feature-transform exp/dnn4_pretrain-dbn/final.feature_transform --dbn exp/dnn4_pretrain-dbn/6.dbn --hid-layers 0 --learn-rate 0.008 data-fmllr-tri3/train_tr90 data-fmllr-tri3/train_cv10 data/lang exp/tri3_ali exp/tri3_ali exp/dnn4_pretrain-dbn_dnn

INFO

steps/nnet/train.sh : Training Neural Network
dir : exp/dnn4_pretrain-dbn_dnn
Train-set : data-fmllr-tri3/train_tr90 exp/tri3_ali
CV-set : data-fmllr-tri3/train_cv10 exp/tri3_ali

IS CUDA GPU AVAILABLE? 'lijian-cca'

LOG (SelectGpuIdAuto():cu-device.cc:280) Selecting from 1 GPUs
LOG (SelectGpuIdAuto():cu-device.cc:295) cudaSetDevice(0): Quadro K2200 free:3665M, used:429M, total:4095M, free/total:0.895065
LOG (SelectGpuIdAuto():cu-device.cc:344) Trying to select device: 0 (automatically), mem_ratio: 0.895065
LOG (SelectGpuIdAuto():cu-device.cc:363) Success selecting device 0 free mem ratio: 0.895065
LOG (FinalizeActiveGpu():cu-device.cc:202) The active GPU is [0]: Quadro K2200 free:3649M, used:445M, total:4095M, free/total:0.891158 version 5.0
LOG (PrintMemoryUsage():cu-device.cc:379) Memory used: 0 bytes.

HURRAY, WE GOT A CUDA GPU FOR COMPUTATION!!!

PREPARING ALIGNMENTS

Using PDF targets from dirs 'exp/tri3_ali' 'exp/tri3_ali'
copy-transition-model --binary=false exp/tri3_ali/final.mdl exp/dnn4_pretrain-dbn_dnn/final.mdl
LOG (copy-transition-model:main():copy-transition-model.cc:62) Copied transition model.

PREPARING FEATURES

Preparing train/cv lists :
4680 exp/dnn4_pretrain-dbn_dnn/train.scp
600 exp/dnn4_pretrain-dbn_dnn/cv.scp
5280 total
copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp_non_local ark,scp:/tmp/tmp.fhJQQiJXhs/train.ark,exp/dnn4_pretrain-dbn_dnn/train.scp
LOG (copy-feats:main():copy-feats.cc:100) Copied 4680 feature matrices.
copy-feats scp:exp/dnn4_pretrain-dbn_dnn/cv.scp_non_local ark,scp:/tmp/tmp.fhJQQiJXhs/cv.ark,exp/dnn4_pretrain-dbn_dnn/cv.scp
LOG (copy-feats:main():copy-feats.cc:100) Copied 600 feature matrices.
Imported config : cmvn_opts='' delta_opts=''
apply-cmvn is not used
Getting feature dim :
copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:-
WARNING (feat-to-dim:Close():kaldi-io.cc:446) Pipe copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:- | had nonzero return status 13
Feature dim is : 40
Using pre-computed feature-transform : 'exp/dnn4_pretrain-dbn/final.feature_transform'

NN-INITIALIZATION

Getting input/output dims :
feat-to-dim 'ark:copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:- | nnet-forward exp/dnn4_pretrain-dbn_dnn/final.feature_transform ark:- ark:- |' -
copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:-
nnet-forward exp/dnn4_pretrain-dbn_dnn/final.feature_transform ark:- ark:-
LOG (nnet-forward:SelectGpuId():cu-device.cc:83) Manually selected to compute on CPU.
WARNING (feat-to-dim:Close():kaldi-io.cc:446) Pipe copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:- | nnet-forward exp/dnn4_pretrain-dbn_dnn/final.feature_transform ark:- ark:- | had nonzero return status 36096
nnet-forward 'nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform exp/dnn4_pretrain-dbn/6.dbn -|' 'ark:copy-feats scp:exp/dnn4_pretrain-dbn_dnn/train.scp ark:- |' ark:-
LOG (nnet-forward:SelectGpuId():cu-device.cc:83) Manually selected to compute on CPU.
feat-to-dim ark:- -
run.pl: job failed, log is in exp/dnn4_pretrain-dbn_dnn/log/train_nnet.log
lijian@lijian-cca:~/kaldi-trunk/egs/zhongwen2$ nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform exp/dnn4_pretrain-dbn/6.dbn -
LOG (nnet-concat:main():nnet-concat.cc:53) Reading exp/dnn4_pretrain-dbn_dnn/final.feature_transform
LOG (nnet-concat:main():nnet-concat.cc:65) Concatenating exp/dnn4_pretrain-dbn/6.dbn
ERROR (nnet-concat:Input():kaldi-io.cc:672) Error opening input stream exp/dnn4_pretrain-dbn/6.dbn
ERROR (nnet-concat:Input():kaldi-io.cc:672) Error opening input stream exp/dnn4_pretrain-dbn/6.dbn

[stack trace: ]
kaldi::KaldiGetStackTrace()
kaldi::KaldiErrorMessage::~KaldiErrorMessage()
kaldi::Input::Input(std::string const&, bool*)
nnet-concat(main+0x2ef) [0x4899a1]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f404fd7eec5]
nnet-concat() [0x489619]

WARNING (nnet-forward:Close():kaldi-io.cc:446) Pipe nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform exp/dnn4_pretrain-dbn/6.dbn -| had nonzero return status 65280
WARNING (nnet-forward:Read():nnet-nnet.cc:396) The network 'nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform exp/dnn4_pretrain-dbn/6.dbn -|' is empty.
KALDI_ASSERT: at nnet-forward:GetComponent:nnet-nnet.cc:167, failed: static_cast<size_t>(component) < components_.size()
Stack trace is:
kaldi::KaldiGetStackTrace()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::nnet1::Nnet::GetComponent(int)
nnet-forward(main+0x549) [0x491c6b]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f79c2687ec5]
nnet-forward() [0x491689]
KALDI_ASSERT: at nnet-forward:GetComponent:nnet-nnet.cc:167, failed: static_cast<size_t>(component) < components_.size()
Stack trace is:
kaldi::KaldiGetStackTrace()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::nnet1::Nnet::GetComponent(int)
nnet-forward(main+0x549) [0x491c6b]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f79c2687ec5]
nnet-forward() [0x491689]
ERROR (feat-to-dim:main():feat-to-dim.cc:58) Could not read any features (empty archive?)
ERROR (feat-to-dim:main():feat-to-dim.cc:58) Could not read any features (empty archive?)

[stack trace: ]
kaldi::KaldiGetStackTrace()
kaldi::KaldiErrorMessage::~KaldiErrorMessage()
feat-to-dim(main+0x1ea) [0x44fc2c]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fe7d6931ec5]
feat-to-dim() [0x44f9a9]

Getting nnet input dimension failed!!
Removing features tmpdir /tmp/tmp.fhJQQiJXhs @ lijian-cca
cv.ark
train.ark

Accounting: time=5 threads=1

Ended (code 1) at Wed Jul 15 13:39:43 CST 2015, elapsed time 5 seconds

Can anyone tell me how to fix this? Thanks a lot; I look forward to a reply.

- Karel Vesely - 2015-07-15
  
  Hi,
  the problem is that you pre-trained only 5 layers, while the DNN
  training script is trying to read 6 pre-trained layers from file:
  'nnet-concat exp/dnn4_pretrain-dbn_dnn/final.feature_transform
  exp/dnn4_pretrain-dbn/6.dbn'
  
  You need to select a '*.dbn' file which exists.
  
  Best!
  Karel.
  

Bruce Lee - 2015-07-15

The first error is "ERROR (nnet-concat:Input():kaldi-io.cc:672) Error opening input stream exp/dnn4_pretrain-dbn/6.dbn". I checked the directory exp/dnn4_pretrain-dbn/ and found that there is no 6.dbn file. The pre-training step did not pre-train a sixth layer.

How can I fix this?


Bruce Lee - 2015-07-15

Thanks a lot, I have found the cause. I had earlier changed the configuration to "nn_depth=5", so only 5 layers were pre-trained.
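
One way to keep the two settings in sync is to derive the --dbn path from the same depth value used for pre-training, and to stop early if the file is missing. The following is only a minimal sketch: pick_dbn is a hypothetical helper, not part of Kaldi, and it assumes that pre-training with depth N leaves a file named N.dbn in the pre-training directory, as the 6.dbn path in this thread suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): print the path of the DBN file
# for a given pre-training depth, or fail and list the files that do exist.
# Assumption: pre-training with depth N writes <dbn_dir>/N.dbn.
pick_dbn() {
  local dbn_dir=$1 nn_depth=$2
  local dbn="$dbn_dir/$nn_depth.dbn"
  if [ ! -f "$dbn" ]; then
    echo "Missing $dbn; DBN files present in $dbn_dir:" >&2
    # Listing goes to stderr; errors from ls itself are discarded.
    ls "$dbn_dir"/*.dbn >&2 2>/dev/null
    return 1
  fi
  echo "$dbn"
}

# Example use with the paths from this thread (commented out so that the
# snippet can be sourced without running anything):
#   nn_depth=5   # must match the depth used for pre-training
#   dbn=$(pick_dbn exp/dnn4_pretrain-dbn "$nn_depth") || exit 1
#   then pass --dbn "$dbn" to steps/nnet/train.sh instead of a fixed 6.dbn
```

With a check like this, a mismatch between the pre-training depth and the --dbn argument is reported before steps/nnet/train.sh starts, instead of surfacing as the nnet-concat and feat-to-dim errors shown in the log.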
