From: Daniel P. <dp...@gm...> - 2015-06-29 19:04:14
That's OK. Those warnings appear when words in the "text" file are not covered by words.txt (they get replaced with the designated OOV word). All of those words are either super-rare words, misspellings, or normalization failures, so it's not a problem that they are not in the vocabulary.
Dan

On Mon, Jun 29, 2015 at 6:45 AM, Mate Andre <ele...@gm...> wrote:
> The alignments script has been running for about a day and I've found these
> warnings in align.*.log:
>
> sym2int.pl: replacing HIGGINSES with 2
> sym2int.pl: replacing MEASTERS with 2
> sym2int.pl: replacing YO'RS with 2
> sym2int.pl: replacing HIGGINSES with 2
> sym2int.pl: replacing THEVENOT with 2
> sym2int.pl: replacing PASQUA with 2
> sym2int.pl: replacing COCHINEALS with 2
> sym2int.pl: replacing HAMPER'S with 2
> sym2int.pl: replacing HUNDRED'LL with 2
> sym2int.pl: replacing CLEMMING with 2
> sym2int.pl: replacing CLEMMING with 2
> sym2int.pl: replacing HOU'D with 2
> sym2int.pl: replacing OURSEL with 2
> sym2int.pl: replacing SWOUNDING with 2
> sym2int.pl: replacing DID' with 2
> sym2int.pl: replacing INSTINCTLY with 2
> sym2int.pl: replacing DEFYINGLY with 2
> sym2int.pl: replacing BELIEVE' with 2
> sym2int.pl: replacing BROSSEN with 2
> sym2int.pl: replacing CLEAVINGS with 2
> sym2int.pl: not warning for OOVs any more times
>
> Can these warnings be safely ignored, or am I possibly using the wrong lang
> directory? I'm currently using data/lang_nosp.
>
>
> On Fri, Jun 26, 2015 at 6:05 PM, Daniel Povey <dp...@gm...> wrote:
>>
>> Use the tree from the regular nnet_a directory; the system has the same
>> tree.
>> Dan
>>
>>
>> On Fri, Jun 26, 2015 at 5:55 PM, Mate Andre <ele...@gm...> wrote:
>> > The "tree" file is missing from the nnet_a_online directory in the
>> > Kaldi-ASR build. Is it possible to create it without retraining the
>> > entire model?
>> >
>> > On Fri, Jun 26, 2015 at 5:02 PM, Daniel Povey <dp...@gm...> wrote:
>> >>
>> >> You need to point it to the nnet_a_online directory instead.
>> >> Dan
>> >>
>> >>
>> >> On Fri, Jun 26, 2015 at 4:59 PM, Mate Andre <ele...@gm...> wrote:
>> >> > Thanks for the prompt reply.
>> >> >
>> >> > When using steps/online/nnet2/align.sh, I get the following error:
>> >> > "no such file exp/nnet2_online/nnet_a/conf/online_nnet2_decoding.conf".
>> >> > Do I need to generate "online_nnet2_decoding.conf" and the "conf"
>> >> > directory with another script, since they aren't included in the
>> >> > Kaldi-ASR build?
>> >> >
>> >> > On Fri, Jun 26, 2015 at 4:44 PM, Daniel Povey <dp...@gm...> wrote:
>> >> >>
>> >> >> It expects 140 because 140 = 40 + 100: the 40 is the "hires" MFCC
>> >> >> features (the Librispeech scripts create these from the wav data),
>> >> >> and the 100 is the iVector features. You would have to get these
>> >> >> from the iVector extractor.
>> >> >> However, you may find your life is easier if you use
>> >> >> steps/online/nnet2/align.sh, which will start from the wav data and
>> >> >> do the feature extraction itself.
>> >> >> Dan
>> >> >>
>> >> >>
>> >> >> On Fri, Jun 26, 2015 at 4:41 PM, Mate Andre <ele...@gm...> wrote:
>> >> >> > My goal is to find alignments for the 960-hour LibriSpeech
>> >> >> > dataset. I am using the nnet2_online/nnet_a LibriSpeech model from
>> >> >> > the Kaldi-ASR site, and I am running the steps/nnet2/align.sh
>> >> >> > script in Kaldi's LibriSpeech folder using the following command:
>> >> >> >
>> >> >> > steps/nnet2/align.sh --nj 10 --cmd 'run.pl' data/train_960 data/lang_nosp exp/nnet2_online/nnet_a exp/nnet2_online/nnet_a_ali
>> >> >> >
>> >> >> > where exp/nnet2_online/nnet_a contains the files in
>> >> >> > nnet2_online/nnet_a and exp/nnet2_online/nnet_a_ali is an empty
>> >> >> > directory.
>> >> >> >
>> >> >> > I'm getting the following error in the log files:
>> >> >> >
>> >> >> > ERROR (nnet-align-compiled:NnetComputer():nnet-compute.cc:70) Feature dimension is 13 but network expects 140
>> >> >> >
>> >> >> > Am I using the correct script to generate the alignments, or is
>> >> >> > there another reason I am getting this error?
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > Kaldi-users mailing list
>> >> >> > Kal...@li...
>> >> >> > https://lists.sourceforge.net/lists/listinfo/kaldi-users
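
For anyone who wants to check how many transcript words fall outside the vocabulary before deciding the warnings are harmless, here is a rough sketch of the kind of word-to-integer mapping with OOV substitution that the sym2int.pl messages describe. It is not Kaldi's code. The toy vocabulary, the toy transcripts, and the helper names load_symbol_table and map_transcript are all made up for illustration, and reading the "2" in the log as the integer ID of the OOV symbol is an inference from the messages above, not something stated in the thread.

```python
from collections import Counter


def load_symbol_table(lines):
    """Parse words.txt-style lines ("WORD ID") into a dict of word -> int."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            table[parts[0]] = int(parts[1])
    return table


def map_transcript(line, table, oov_id):
    """Map one "utt-id w1 w2 ..." line to integer IDs.

    Words missing from the table are replaced with oov_id.
    Returns (utt_id, ids, oov_words).
    """
    utt_id, *words = line.split()
    ids, oov_words = [], []
    for word in words:
        if word in table:
            ids.append(table[word])
        else:
            ids.append(oov_id)
            oov_words.append(word)
    return utt_id, ids, oov_words


# Hypothetical toy data; a real setup would read words.txt and the text file.
words_txt = ["<eps> 0", "!SIL 1", "<UNK> 2", "THE 3", "HOUSE 4"]
text = ["utt1 THE HIGGINSES HOUSE", "utt2 THE CLEMMING HOUSE HIGGINSES"]

table = load_symbol_table(words_txt)
oov_id = table["<UNK>"]  # assumed OOV symbol for this toy vocabulary

oov_counts = Counter()
total_words = 0
for line in text:
    utt_id, ids, oov_words = map_transcript(line, table, oov_id)
    oov_counts.update(oov_words)
    total_words += len(ids)
    print(utt_id, ids)

# Fraction of transcript tokens that were replaced by the OOV symbol.
oov_rate = sum(oov_counts.values()) / total_words
print(oov_counts.most_common(), round(oov_rate, 3))
```

If the OOV rate on real data is small and the most frequent replaced words look like rare words or misspellings, that is consistent with the explanation at the top of this message. A high rate, or common words showing up as OOV, would instead suggest a mismatched lang directory.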