|
From: Mate A. <ele...@gm...> - 2015-07-02 18:22:57
|
The graph was indeed in tri6b.
Back to training on the Blizzard dataset, I was able to dump the iVectors
for Blizzard's 19-hour subset. Where are they needed, though? Neither
*train_more2.sh* nor *get_egs2.sh* seems to accept dumped iVectors as input.
Regardless, I ran the *train_more2.sh* script on Blizzard's data/ and egs/
folders (generated with *get_egs2.sh*), and I get the following errors in
train.*.*.log:
KALDI_ASSERT: at nnet-train-parallel:FormatNnetInput:nnet-update.cc:212,
failed:
data[0].input_frames.NumRows() >= num_splice
[...]
LOG (nnet-train-parallel:DoBackprop():nnet-update.cc:275) Error doing
backprop, nnet info is: num-components 17
num-updatable-components 5
left-context 7
right-context 7
input-dim 140
output-dim 5816
parameter-dim 10351000
[...]
The logs tell me that the left and right contexts were set to 7. However, I
set them both to 3 when running *get_egs2.sh*. The
*egs/info/{left,right}_context* files even confirm that they are set to 3.
Is it possible that train_more2.sh is using the contexts from another
directory?
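For what it's worth, here is a small sketch of the arithmetic I believe is behind
the assertion (my reading of the failing check, not the actual Kaldi code): a model
with left-context L and right-context R needs at least L + R + 1 input rows per
example, so egs dumped with context 3 would be too small for a model built with
context 7.

```python
# Hypothetical mirror of the check at nnet-update.cc:212:
# data[0].input_frames.NumRows() >= num_splice

def min_rows_needed(left_context, right_context):
    """Minimum input rows (num_splice) for a model with this context."""
    return left_context + right_context + 1

# Model from the log: left-context 7, right-context 7 -> needs 15 rows.
model_splice = min_rows_needed(7, 7)

# Egs dumped with left/right context 3 carry only 3 + 1 + 3 = 7 rows
# per (single-frame) example, which would trip the assertion.
egs_rows = min_rows_needed(3, 3)

assert egs_rows < model_splice  # 7 < 15: reproduces the mismatch
```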
On Tue, Jun 30, 2015 at 2:07 PM, Daniel Povey <dp...@gm...> wrote:
> Check the script that generated it; probably the graph directory was
> in a different location, e.g. in tri6 or something like that.
> Hopefully we would have uploaded that too.
> We only need to regenerate the graph when the tree changes.
> Dan
>
>
> On Tue, Jun 30, 2015 at 2:05 PM, Mate Andre <ele...@gm...> wrote:
> > To ensure that the nnet_a_online model is performing well on the 19-hour
> > Blizzard dataset and that it is producing correct alignments, I want to
> run
> > the decoding script on the Blizzard data. However, the nnet_a_online
> model
> > on kaldi-asr.org doesn't seem to have a graph directory needed for
> decoding.
> > Is there any way I can get a hold of this directory without training the
> > entire model?
>
|