From: Daniel P. <dp...@gm...> - 2015-06-29 20:28:19
> I am using the nnet_a_online model. Does this model require iVectors?

Yes, it does. You would have to extract them for your data; see the
commands used in the script that trained the nnet_a_online model. You'd
have to download the iVector extractor. You may also have to dump 40-dim
features. That will be in the script too; also see the _common.sh script,
which it sources near the beginning. Some parts are in there.

> Also, regarding the left and right contexts for get_egs2.sh, would I have to
> use the values found in nnet_a_online/conf/splice.conf (--left-context=3,
> --right-context=3) ?

No, I don't think so; you have to use the left and right contexts of the
nnet model itself. These are printed out by nnet-am-info.

Dan

>
> Thank you for the advice. I am tackling a 19-hour subset of Blizzard before
> moving on to the full, 300-hour dataset.
>
> On Mon, Jun 29, 2015 at 2:42 PM, Daniel Povey <dp...@gm...> wrote:
>>
>> Actually, you should probably be using train_more2.sh. It looks like
>> the update_nnet.sh script is deprecated.
>> train_more2.sh requires egs dumped by get_egs2.sh. [the "2" format of
>> the egs is more compact.]
>> In your scenario you would be dumping egs for the blizzard data. You
>> would need alignments for the blizzard data. Be careful with the
>> get_egs2.sh script because like other get_egs scripts, it will dump
>> egs with the left-context and right context you specify, and the
>> features you give it, but it can't check that it's correct. If you
>> are using one of the "online" models that uses ivectors you would have
>> to provide dumped ivectors, and these need to be computed with the
>> same iVector extractor as the model that you are starting from.
>>
>> You might want to run on a small subset first; make sure that the
>> training objective (e.g. in compute_train_prob.*.sh) is in the normal
>> range, otherwise it may mean that you did something wrong.
>>
>> To get the alignments you would need to align using the same model as
>> was used to align the data for training the original nnet; you can
>> download that from kaldi-asr.org.
>> Dan
>>
>>
>> On Mon, Jun 29, 2015 at 7:03 AM, Mate Andre <ele...@gm...> wrote:
>> > The train_more.sh script requires an egs directory, which seems to be
>> > created by update_nnet.sh. However, update_nnet.sh requires an
>> > alignments directory.
>> >
>> > If I'm planning to run update_nnet.sh with data/train_960, does that
>> > mean I have to find alignments for train_960 before running
>> > update_nnet.sh? Is there a faster way to generate the egs directory
>> > without having to update the neural net?
>> >
>> > On Thu, Jun 25, 2015 at 2:26 PM, Daniel Povey <dp...@gm...> wrote:
>> >>
>> >> I think the script train_more.sh might be useful here.
>> >> If you only have 1 GPU it might take as long as a week, but
>> >> downloading the trained models might be a better idea.
>> >> Dan
>> >>
>> >>
>> >> > I am going to train a deep neural net model with "multi-splice"
>> >> > using the LibriSpeech dataset with the
>> >> > local/online/run_nnet2_ms.sh script included in Kaldi's
>> >> > repository, which I think will give the best resulting WER. The
>> >> > end goal is to use the trained model in this phase for
>> >> > initializing a next model to train and do forced alignment on
>> >> > Blizzard2013 dataset, specifically the 2013-EH2 subset including
>> >> > 1 female speaker, 19 hours of speech and sentence-level
>> >> > alignments.
>> >> > I don't have much experience with Kaldi and my questions are:
>> >> >
>> >> > 1. How long does it take to train on all (960hrs) of Librispeech
>> >> > on a GPU (say GTX TITAN X or K6000)? Even a rough estimate could
>> >> > be useful.
>> >> > 2. Is there anything to take into account before training on
>> >> > Librispeech?
>> >> > 3. And more importantly, how should I initialize/train the next
>> >> > model for the Blizzard2013 dataset? I managed to go through data
>> >> > preparation for that and created the necessary files.
>> >> >
>> >> > _______________________________________________
>> >> > Kaldi-users mailing list
>> >> > Kal...@li...
>> >> > https://lists.sourceforge.net/lists/listinfo/kaldi-users
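
As a rough illustration of the advice at the top of this thread (take the left and right contexts from the model itself rather than from splice.conf), here is a minimal shell sketch. It assumes that nnet-am-info prints lines of the form "left-context N" and "right-context N" on standard output, and that get_egs2.sh accepts --left-context, --right-context and --online-ivector-dir followed by the data, alignment and egs directories; check both against your Kaldi checkout. The function names and all paths are hypothetical, and the feature type and dimensions still have to match the original model, which this sketch does not verify.

```shell
#!/usr/bin/env bash
# Sketch only: function names and paths are hypothetical placeholders.

# Print the integer that follows a key such as "left-context" in text
# read from stdin. Assumes a line of the form "left-context 7".
context_from_info() {
  local key="$1"
  awk -v k="$key" '$1 == k { print $2; exit }'
}

# Dump egs for new data using the context of the starting model.
# Arguments: <model.mdl> <data-dir> <ali-dir> <ivector-dir> <egs-dir>
dump_egs_for_model() {
  local mdl="$1" data="$2" ali="$3" ivecs="$4" egs="$5"
  local info left right
  info=$(nnet-am-info "$mdl") || return 1
  left=$(printf '%s\n' "$info" | context_from_info left-context)
  right=$(printf '%s\n' "$info" | context_from_info right-context)
  if [ -z "$left" ] || [ -z "$right" ]; then
    echo "could not read context from nnet-am-info output" >&2
    return 1
  fi
  # The iVectors in $ivecs must come from the same extractor as the model.
  steps/nnet2/get_egs2.sh --left-context "$left" --right-context "$right" \
    --online-ivector-dir "$ivecs" "$data" "$ali" "$egs"
}
```

A call might look like `dump_egs_for_model exp/nnet_a_online/final.mdl data/blizzard_hires exp/blizzard_ali exp/blizzard_ivectors exp/blizzard_egs`, where every path is a placeholder. Running it on a small subset first, as suggested earlier in the thread, makes a context or feature mismatch cheaper to catch.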