From: Daniel P. <dp...@gm...> - 2015-06-29 18:42:52
Actually, you should probably be using train_more2.sh; it looks like the update_nnet.sh script is deprecated. train_more2.sh requires egs dumped by get_egs2.sh (the "2" format of the egs is more compact). In your scenario you would be dumping egs for the Blizzard data, which means you would need alignments for the Blizzard data.

Be careful with the get_egs2.sh script: like the other get_egs scripts, it will dump egs with whatever left-context, right-context, and features you give it, but it cannot check that they are correct. If you are using one of the "online" models that uses iVectors, you would also have to provide dumped iVectors, and these need to be computed with the same iVector extractor as the model you are starting from.

You might want to run on a small subset first; make sure that the training objective (e.g. in compute_prob_train.*.log) is in the normal range, otherwise it may mean that you did something wrong.

To get the alignments you would need to align using the same model that was used to align the data for training the original nnet; you can download that from kaldi-asr.org.

Dan

On Mon, Jun 29, 2015 at 7:03 AM, Mate Andre <ele...@gm...> wrote:
> The train_more.sh script requires an egs directory, which seems to be
> created by update_nnet.sh. However, update_nnet.sh requires an alignments
> directory.
>
> If I'm planning to run update_nnet.sh with data/train_960, does that mean I
> have to find alignments for train_960 before running update_nnet.sh? Is
> there a faster way to generate the egs directory without having to update
> the neural net?
>
> On Thu, Jun 25, 2015 at 2:26 PM, Daniel Povey <dp...@gm...> wrote:
>>
>> I think the script train_more.sh might be useful here.
>> If you only have 1 GPU it might take as long as a week, but
>> downloading the trained models might be a better idea.
>> Dan
>>
>> > I am going to train a deep neural net model with "multi-splice" using
>> > the LibriSpeech dataset with the local/online/run_nnet2_ms.sh script
>> > included in Kaldi's repository, which I think will give the best
>> > resulting WER. The end goal is to use the trained model in this phase
>> > for initializing a next model to train and do forced alignment on the
>> > Blizzard2013 dataset, specifically the 2013-EH2 subset, which includes
>> > 1 female speaker, 19 hours of speech, and sentence-level alignments.
>> > I don't have much experience with Kaldi, and my questions are:
>> >
>> > 1. How long does it take to train on all (960 hrs) of LibriSpeech on a
>> > GPU (say a GTX TITAN X or K6000)? Even a rough estimate would be useful.
>> > 2. Is there anything to take into account before training on
>> > LibriSpeech?
>> > 3. And more importantly, how should I initialize/train the next model
>> > for the Blizzard2013 dataset? I managed to go through data preparation
>> > for that and created the necessary files.
>> >
>> > _______________________________________________
>> > Kaldi-users mailing list
>> > Kal...@li...
>> > https://lists.sourceforge.net/lists/listinfo/kaldi-users
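[Editor's note] The workflow Dan describes above can be sketched roughly as follows. This is a sketch only: the directory names (data/blizzard, exp/tri4b, exp/nnet2_online/...) are illustrative assumptions, not paths from the thread, and the argument order is from memory of the 2015-era nnet2 scripts; check each script's usage message in your Kaldi checkout before running.

```shell
#!/bin/bash
# Sketch of the adaptation workflow under the assumptions stated above.

# 1. Align the Blizzard data with the same GMM model that was used to
#    align the original nnet's training data (downloadable from kaldi-asr.org).
steps/align_fmllr.sh --nj 8 data/blizzard data/lang \
  exp/tri4b exp/tri4b_ali_blizzard

# 2. Dump egs in the "2" format. The left/right context and the features
#    must match the model you are starting from; the script cannot check this.
#    (If the model is an "online" iVector model, iVectors computed with the
#    same extractor must also be supplied -- see the script's options.)
steps/nnet2/get_egs2.sh --left-context 14 --right-context 14 \
  data/blizzard exp/tri4b_ali_blizzard exp/nnet2_online/egs_blizzard

# 3. Continue training the existing model on the new egs.
steps/nnet2/train_more2.sh exp/nnet2_online/nnet_ms_a/final.mdl \
  exp/nnet2_online/egs_blizzard exp/nnet2_online/nnet_ms_a_blizzard

# 4. Sanity check: the training objective on the new data should be in the
#    normal range; an abnormal value suggests mismatched context/features.
grep Objf exp/nnet2_online/nnet_ms_a_blizzard/log/compute_prob_train.*.log
```

Running on a small subset of data/blizzard first, as suggested above, is a cheap way to catch a context or feature mismatch before committing to a full training run.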