From: Daniel P. <dp...@gm...> - 2015-07-02 18:25:53
>
> Back to training on the Blizzard dataset, I was able to dump the iVectors
> for Blizzard's 19-hour subset. Where are they needed, though? Neither
> train_more2.sh nor get_egs2.sh seem to accept dumped iVectors as input.
It's the --online-ivector-dir option.
> Regardless, I ran the train_more2.sh script on Blizzard's data/ and egs/
> folder (generated with get_egs2.sh), and I get the following errors in
> train.*.*.log:
>
> KALDI_ASSERT: at nnet-train-parallel:FormatNnetInput:nnet-update.cc:212,
> failed:
> data[0].input_frames.NumRows() >= num_splice
> [...]
> LOG (nnet-train-parallel:DoBackprop():nnet-update.cc:275) Error doing
> backprop, nnet info is: num-components 17
> num-updatable-components 5
> left-context 7
> right-context 7
> input-dim 140
> output-dim 5816
> parameter-dim 10351000
> [...]
>
> The logs tell me that the left and right contexts were set to 7. However, I
> specified them both to 3 when running get_egs2.sh. The
> egs/info/{left,right}_context files even confirm that they are set to 3. Is
> it possible that train_more2.sh is using the contexts from another
> directory?
The problem is that 3 < 7. The neural net requires a certain amount
of temporal context (7 frames on each side, here), and if you dump less
than that in the egs, training will crash. So you need to set both
contexts to 7 when dumping the egs.
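Putting both fixes together, the invocation might look something like
this (the script paths, positional arguments, and directory names below
are assumptions based on the standard nnet2 recipes, not taken from your
setup, so adjust them to match your experiment layout):

```shell
# Re-dump the egs with enough temporal context for the net:
# --left-context/--right-context must be at least what the nnet
# needs (7 on each side, per the log above), and the dumped
# iVectors go in via --online-ivector-dir.
steps/nnet2/get_egs2.sh --left-context 7 --right-context 7 \
  --online-ivector-dir exp/nnet2_online/ivectors_blizzard \
  data/blizzard exp/nnet_a_ali exp/nnet_a_egs

# Then pass the same iVector directory to the training script.
steps/nnet2/train_more2.sh \
  --online-ivector-dir exp/nnet2_online/ivectors_blizzard \
  exp/nnet_a/final.mdl exp/nnet_a_egs exp/nnet_a_blizzard
```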
Dan
> On Tue, Jun 30, 2015 at 2:07 PM, Daniel Povey <dp...@gm...> wrote:
>>
>> Check the script that generated it; probably the graph directory was
>> in a different location e.g. in tri6 or something like that.
>> Hopefully we would have uploaded that too.
>> We only need to regenerate the graph when the tree changes.
>> Dan
>>
>>
>> On Tue, Jun 30, 2015 at 2:05 PM, Mate Andre <ele...@gm...> wrote:
>> > To ensure that the nnet_a_online model is performing well on the 19-hour
>> > Blizzard dataset and that it is producing correct alignments, I want to
>> > run
>> > the decoding script on the Blizzard data. However, the nnet_a_online
>> > model
>> > on kaldi-asr.org doesn't seem to have a graph directory needed for
>> > decoding.
>> > Is there any way I can get a hold of this directory without training the
>> > entire model?
>
>