From: Mailing list used for User Communication and Updates <kal...@li...> - 2013-09-24 17:29:46
Thanks.
Karel-- you might want to modify the script to check that the train
and cv sets have disjoint utterance-ids.
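A minimal sketch of such a check, assuming Kaldi-style data directories where each dir has an utt2spk file whose first field is the utterance-id (the function name and directory paths are hypothetical, not part of the existing scripts):

```shell
#!/usr/bin/env bash
# Sketch: fail early if the train and cv data dirs share any utterance-ids.
# Assumes each dir contains an utt2spk file ("<utt-id> <spk-id>" per line).
check_disjoint_utts() {
  local train_dir=$1 cv_dir=$2
  local overlap
  # comm -12 prints only lines common to both sorted id lists.
  overlap=$(comm -12 \
    <(cut -d' ' -f1 "$train_dir/utt2spk" | sort) \
    <(cut -d' ' -f1 "$cv_dir/utt2spk" | sort) | wc -l | tr -d '[:space:]')
  if [ "$overlap" -gt 0 ]; then
    echo "ERROR: train and cv sets share $overlap utterance-id(s)" >&2
    return 1
  fi
  return 0
}
```

Running this before feature preparation would have caught the duplicate ids in Valentin's setup before they were silently merged downstream.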
Dan
On Tue, Sep 24, 2013 at 1:27 PM, Mailing list used for User
Communication and Updates <kal...@li...> wrote:
> Hi,
>
> In case someone gets the same errors:
> I had identical utterance-ids in the train and cv sets, while the sound
> files were in different directories. That caused confusion in some
> feature preparation steps where the two sets were processed together.
>
> Regards,
> Valentin
>
> On Sat, Sep 21, 2013 at 12:37 AM, Valentin Mendelev <vm...@gm...> wrote:
>> Hi.
>>
>> That really was a partial message. I pressed shift+enter accidentally
>> and then realized how to solve my problem.
>> But another problem has emerged.
>>
>> I’m trying to train a DNN on my own small corpus (1 speaker, about
>> 10 hours, split into 7-word utterances of less than 10s duration
>> each) using egs/swbd/s5b/local/run_dnn.sh with appropriate alterations
>> (no feature-transform, paths).
>>
>> I run this
>>
>> $cuda_cmd $dir/_pretrain_dbn.log \
>> steps/pretrain_dbn.sh --hid_dim 2048 --train_utts 15000 --cmvn_utts
>> 1000 $t $dir || exit 1
>> <set proper paths>
>>
>> and this
>>
>> $cuda_cmd $dir/_train_nnet.log \
>> steps/train_nnet.sh --dbn $dbn --hid-layers 0 --learn-rate 0.008 \
>> $t $cv $lang $ali $ali_cv $dir || exit 1;
>>
>> Pre-training is OK now, but MLP training fails.
>> In prerun.log there are a lot of messages like this:
>>
>> WARNING (nnet-train-xent-hardlab-frmshuff:main():nnet-train-xent-hardlab-frmshuf
>> f.cc:148) Alignment has wrong length, ali 258 vs. feats 334, utt 101-11
>> and finally
>> KALDI_ASSERT: at
>> nnet-train-xent-hardlab-frmshuff:CloseInternal:util/kaldi-table-inl.h:1546,
>> failed: holder_ == NULL
>> Stack trace is:
>> kaldi::KaldiGetStackTrace()
>> kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
>> kaldi::RandomAccessTableReaderArchiveImplBase<kaldi::BasicVectorHolder<int>
>>>::CloseInternal()
>>
>> In _train_nnet.log (last stage):
>>
>> # RUNNING THE NN-TRAINING SCHEDULER
>> steps/train_nnet_scheduler.sh --feature-transform
>> exp/tri3b2_pretrain-dbn73_dnn/tr_splice5-1_cmvn-g.nnet --learn-rate
>> 0.008 --seed 777 exp/tri3b2_pretrain-dbn73_dnn/nnet_6.dbn_dnn.init
>> ark:copy-feats scp:exp/tri3b2_pretrain-dbn73_dnn/train.scp ark:- |
>> ark:copy-feats scp:exp/tri3b2_pretrain-dbn73_dnn/cv.scp ark:- |
>> ark:ali-to-pdf exp/tri3b2_ali/final.mdl "ark:gunzip -c
>> exp/tri3b2_ali/ali.*.gz exp/tri3b2_ali_cvseg/ali.*.gz |" ark:- |
>> exp/tri3b2_pretrain-dbn73_dnn
>> steps/train_nnet_scheduler.sh: line 78: 5525 Aborted
>> (core dumped) $train_tool --cross-validate=true
>> --bunchsize=$bunch_size --cachesize=$cache_size --verbose=$verbose
>> ${feature_transform:+ --feature-transform=$feature_transform}
>> ${use_gpu_id:+ --use-gpu-id=$use_gpu_id} $mlp_best "$feats_cv"
>> "$labels" 2> $dir/log/prerun.log
>>
>> It’s not a list-sorting problem, because I can train simple triphone
>> models on the same alignments and decode the cv set.
>>
>> I’m using the default feature settings, so I suppose it should be
>> plain MFCC with a 5-frame context.
>> Could you tell me where to look to make this work?
>>
>> I run Ubuntu 12.10 64-bit and my video card is a GTX 580 with 1.5 GB RAM.
>>
>> Regards,
>> Valentin
>
> _______________________________________________
> Kaldi-users mailing list
> Kal...@li...
> https://lists.sourceforge.net/lists/listinfo/kaldi-users