From: Mailing l. u. f. U. C. a. U. <kal...@li...> - 2013-06-17 17:51:56

Hi all,

I'm trying to run the s3 recipe for WSJ and I'm running into a problem I was wondering whether you could shed any light upon:

The script run.sh works fine up to steps/align_lda_mllt.sh. However, when I am running:

steps/make_denlats_lda_etc.sh --num-jobs 4 --cmd "$train_cmd" \
  data/train_si84 data/lang exp/tri2b_ali_si84 exp/tri2b_denlats_si84

I get a KALDI_ASSERT error. I updated to the most recent version of the trunk and retried but with no effect.

More specifically, I get the following output in one of the log files (I'm running the recipe on a cluster and I'm submitting to a queue using 20 jobs):

=====================================
>> cat wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log
Running on ro
Started at Tue Jun 18 05:27:00 EEST 2013
gmm-latgen-faster --beam=13.0 --lattice-beam=7.0 --acoustic-scale=0.1 --max-mem=20000000 --max-active=5000 --word-symbol-table=data/lang/words.txt exp/tri2b_ali_si84/final.mdl exp/tri2b_denlats_si84/dengraph/HCLG.fst 'ark:apply-cmvn --norm-vars=false --utt2spk=ark:data/train_si84/split20/24/utt2spk ark:exp/tri2b_ali_si84/24.cmvn scp:data/train_si84/split20/24/feats.scp ark:- | splice-feats ark:- ark:- | transform-feats exp/tri2b_ali_si84/final.mat ark:- ark:- |' 'ark:|gzip -c >exp/tri2b_denlats_si84/lat.24.gz'
splice-feats ark:- ark:-
apply-cmvn --norm-vars=false --utt2spk=ark:data/train_si84/split20/24/utt2spk ark:exp/tri2b_ali_si84/24.cmvn scp:data/train_si84/split20/24/feats.scp ark:-
transform-feats exp/tri2b_ali_si84/final.mat ark:- ark:-
KALDI_ASSERT: at gmm-latgen-faster:TransitionIdToPdf:hmm/transition-model.h:309, failed: static_cast<size_t>(trans_id) < id2state_.size()
Stack trace is:
kaldi::KaldiGetStackTrace()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::TransitionModel::TransitionIdToPdf(int) const
kaldi::DecodableAmDiagGmmScaled::LogLikelihood(int, int)
kaldi::LatticeFasterDecoder::ProcessEmitting(kaldi::DecodableInterface*, int)
kaldi::LatticeFasterDecoder::Decode(kaldi::DecodableInterface*)
kaldi::DecodeUtteranceLatticeFaster(kaldi::LatticeFasterDecoder&, kaldi::DecodableInterface&, fst::SymbolTable const*, std::string, double, bool, bool, kaldi::TableWriter<kaldi::BasicVectorHolder<int> >*, kaldi::TableWriter<kaldi::BasicVectorHolder<int> >*, kaldi::TableWriter<kaldi::CompactLatticeHolder>*, kaldi::TableWriter<kaldi::LatticeHolder>*, double*)
gmm-latgen-faster(main+0xc3b) [0x58dad6]
/lib64/libc.so.6(__libc_start_main+0xe6) [0x2ba2f7d9cc16]
gmm-latgen-faster() [0x58cd11]
/rmt/programs/gridengine_new/default/spool/ro/job_scripts/10778: line 6: 26822 Aborted (core dumped) ( gmm-latgen-faster --beam=13.0 --lattice-beam=7.0 --acoustic-scale=0.1 --max-mem=20000000 --max-active=5000 --word-symbol-table=data/lang/words.txt exp/tri2b_ali_si84/final.mdl exp/tri2b_denlats_si84/dengraph/HCLG.fst "ark:apply-cmvn --norm-vars=false --utt2spk=ark:data/train_si84/split20/24/utt2spk ark:exp/tri2b_ali_si84/24.cmvn scp:data/train_si84/split20/24/feats.scp ark:- | splice-feats ark:- ark:- | transform-feats exp/tri2b_ali_si84/final.mat ark:- ark:- |" "ark:|gzip -c >exp/tri2b_denlats_si84/lat.24.gz" ) 2>> /rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log >> /rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log
=====================================

I've started looking into the code in further detail but I guess debugging in this way will take a while since I have very little experience with kaldi. So, any ideas or suggestions will be greatly appreciated.

Thank you,
nassos

PS: The decode_den.24.sh script:

=====================================
#!/bin/bash
cd /rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3
. path.sh
echo Running on `hostname` >/rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log
echo Started at `date` >>/rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log
( gmm-latgen-faster --beam=13.0 --lattice-beam=7.0 --acoustic-scale=0.1 --max-mem=20000000 --max-active=5000 --word-symbol-table=data/lang/words.txt exp/tri2b_ali_si84/final.mdl exp/tri2b_denlats_si84/dengraph/HCLG.fst "ark:apply-cmvn --norm-vars=false --utt2spk=ark:data/train_si84/split20/24/utt2spk ark:exp/tri2b_ali_si84/24.cmvn scp:data/train_si84/split20/24/feats.scp ark:- | splice-feats ark:- ark:- | transform-feats exp/tri2b_ali_si84/final.mat ark:- ark:- |" "ark:|gzip -c >exp/tri2b_denlats_si84/lat.24.gz" ) 2>>/rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log >>/rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log
ret=$?
echo >>/rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log
echo Finished at `date` >>/rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log
exit $ret
## submitted with:
# qsub -S /bin/bash -sync y -j y -o /rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/decode_den.24.log -l mem_free=700M /rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/q/decode_den.24.sh >>/rmt/work/audio_asr/kaldi/kaldi-trunk/egs/wsj/s3/exp/tri2b_denlats_si84/q/queue.log 2>&1
=====================================