From: Mate A. <ele...@gm...> - 2015-07-06 15:58:43
I noticed that the egs dumped for the Blizzard corpus contain 300
utterances for each of the *valid* and *train* subsets, out of a total of
9733 utterances. Does the *train_more2.sh* script train on all 9733
utterances of the dataset, or does it ignore the utterances included in
the *valid* and *train* subsets when training?

On Fri, Jul 3, 2015 at 2:59 PM, Daniel Povey <dp...@gm...> wrote:
> It definitely supports that; if you set num-threads to 1 it will train
> with GPU, but read this page:
> http://kaldi.sourceforge.net/dnn2.html
>
> Dan
>
> On Fri, Jul 3, 2015 at 6:48 AM, Mate Andre <ele...@gm...> wrote:
> > train_more2.sh has been running for 19 hours and is currently at pass
> > 42/60. Since the script is training on the 19-hour subset of Blizzard,
> > I imagine it'll take quite a while longer to train on the full 300
> > hours.
> >
> > Is there an option to run the train_more2.sh script on GPU?
> >
> > On Thu, Jul 2, 2015 at 2:25 PM, Daniel Povey <dp...@gm...> wrote:
> >> > Back to training on the Blizzard dataset, I was able to dump the
> >> > iVectors for Blizzard's 19-hour subset. Where are they needed,
> >> > though? Neither train_more2.sh nor get_egs2.sh seems to accept
> >> > dumped iVectors as input.
> >>
> >> It's the --online-ivector-dir option.
> >>
> >> > Regardless, I ran the train_more2.sh script on Blizzard's data/ and
> >> > egs/ folders (generated with get_egs2.sh), and I get the following
> >> > errors in train.*.*.log:
> >> >
> >> > KALDI_ASSERT: at nnet-train-parallel:FormatNnetInput:nnet-update.cc:212,
> >> > failed: data[0].input_frames.NumRows() >= num_splice
> >> > [...]
> >> > LOG (nnet-train-parallel:DoBackprop():nnet-update.cc:275) Error doing
> >> > backprop, nnet info is: num-components 17
> >> > num-updatable-components 5
> >> > left-context 7
> >> > right-context 7
> >> > input-dim 140
> >> > output-dim 5816
> >> > parameter-dim 10351000
> >> > [...]
> >> >
> >> > The logs tell me that the left and right contexts were set to 7.
> >> > However, I specified them both as 3 when running get_egs2.sh. The
> >> > egs/info/{left,right}_context files even confirm that they are set
> >> > to 3. Is it possible that train_more2.sh is using the contexts from
> >> > another directory?
> >>
> >> The problem is that 3 < 7. The neural net requires a certain amount
> >> of temporal context (7 frames left and right, here), and if you dump
> >> less than that in the egs it will crash. So you need to set them to 7
> >> when dumping egs.
> >>
> >> Dan
> >>
> >> > On Tue, Jun 30, 2015 at 2:07 PM, Daniel Povey <dp...@gm...> wrote:
> >> >> Check the script that generated it; probably the graph directory
> >> >> was in a different location, e.g. in tri6 or something like that.
> >> >> Hopefully we would have uploaded that too.
> >> >> We only need to regenerate the graph when the tree changes.
> >> >> Dan
> >> >>
> >> >> On Tue, Jun 30, 2015 at 2:05 PM, Mate Andre <ele...@gm...> wrote:
> >> >> > To ensure that the nnet_a_online model is performing well on the
> >> >> > 19-hour Blizzard dataset and that it is producing correct
> >> >> > alignments, I want to run the decoding script on the Blizzard
> >> >> > data. However, the nnet_a_online model on kaldi-asr.org doesn't
> >> >> > seem to have the graph directory needed for decoding. Is there
> >> >> > any way I can get hold of this directory without training the
> >> >> > entire model?
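For reference, the context requirement that train_more2.sh crashed on
comes from the model itself, and can be read before dumping egs; a
minimal check, assuming the model sits at exp/nnet_a/final.mdl (that
path is a guess, not one from the thread):

    # Print the model's info, including left-context and right-context;
    # the dumped egs must have at least this much context on each side.
    nnet-am-info exp/nnet_a/final.mdl | grep context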
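A sketch of an egs dump that satisfies both points raised in the thread,
the 7-frame contexts and the --online-ivector-dir option; the data,
alignment, iVector, and output directories below are placeholders for
your own paths:

    # Dump examples with 7 frames of context on each side and attach the
    # previously extracted iVectors (all directory names are hypothetical).
    steps/nnet2/get_egs2.sh --left-context 7 --right-context 7 \
      --online-ivector-dir exp/nnet_a_online/ivectors_blizzard \
      data/blizzard exp/blizzard_ali exp/blizzard_egs

    # Sanity-check what was actually dumped, via the info files the
    # thread mentions:
    cat exp/blizzard_egs/info/left_context exp/blizzard_egs/info/right_context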
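And a sketch of the single-GPU invocation Dan points to: one training
thread plus a GPU request passed through to the queue. The argument
order for train_more2.sh and the "-l gpu=1" resource string are
assumptions here (the latter is GridEngine-specific), so check the
script's usage message and your cluster configuration:

    # num-threads 1 switches training to the GPU code path; minibatch
    # sizes around 512 are typical on GPU (per the dnn2 docs linked above).
    steps/nnet2/train_more2.sh --num-threads 1 --parallel-opts "-l gpu=1" \
      --minibatch-size 512 \
      exp/nnet_a/final.mdl exp/blizzard_egs exp/nnet_a_blizzard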
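Finally, on the missing graph directory: since the graph depends only on
the tree and the decoding LM, it can be rebuilt with utils/mkgraph.sh
from whichever directory supplied the tree; exp/tri6 and data/lang_test
below are guesses based on Dan's reply, not confirmed paths:

    # Build HCLG.fst from the decoding lang directory and the tree/model
    # directory; the resulting graph can then be passed to the decode
    # script for the nnet_a_online model, which shares the same tree.
    utils/mkgraph.sh data/lang_test exp/tri6 exp/tri6/graph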