From: Daniel P. <dp...@gm...> - 2015-06-23 06:57:36
It could still be about insertion errors. Typically you want insertion rates about 1/3 to 1/2 as big as deletion rates. If your setup is getting too many insertions, it could be using the LM scale to compensate. Playing with an insertion penalty may help (see the more recent scoring scripts).

Dan

On Tue, Jun 23, 2015 at 1:04 AM, Kirill Katsnelson <kir...@sm...> wrote:
> Yes, I am using the pretty standard nnet2_online model with the librispeech data, with an 8 kHz conversion and a squished frequency range of the high-res features, as I am finding there is a lot of rather useless variance in the very low range, given the data are coming mostly from cell phones. But nothing fancy there overall.
>
> -kkm
>
>> -----Original Message-----
>> From: Daniel Povey [mailto:dp...@gm...]
>> Sent: 2015-06-22 2131
>> To: Kirill Katsnelson
>> Cc: Nagendra Goel; kal...@li...
>> Subject: Re: [Kaldi-users] LM weight
>>
>> By a lot of context I mean left-context and right-context, in the splicing. But I guess you are using one of the standard types of model.
>> Dan
>>
>> On Tue, Jun 23, 2015 at 12:24 AM, Kirill Katsnelson <kir...@sm...> wrote:
>> > The majority of the WER comes from subs, so this part looks pretty normal.
>> >
>> > A lot of acoustic context--probably, depending on the definition of "a lot." :-) Not sure I understand this part. How can I tell? It makes sense, looking at the base dev set figures that I got training the model from the first 500 hr of the librispeech corpus (best range of 16-17). Which are still higher than the reference in the RESULTS for the full 1Khr corpus, which is rather in the 12-15 range.
>> >
>> > -kkm
>> >
>> >> -----Original Message-----
>> >> From: Daniel Povey [mailto:dp...@gm...]
>> >> Sent: 2015-06-22 2059
>> >> To: Kirill Katsnelson
>> >> Cc: Nagendra Goel; kal...@li...
>> >> Subject: Re: [Kaldi-users] LM weight
>> >>
>> >> Usually if there is a lot of acoustic context in your model you will require a larger LM weight.
>> >> Also, if for some reason there tend to be a lot of insertions in decoding (e.g. something weird went wrong in training, or there is some kind of normalization problem), a large LM weight can help reduce insertions and so improve the WER.
>> >>
>> >> Dan
>> >>
>> >> On Mon, Jun 22, 2015 at 11:36 PM, Kirill Katsnelson <kir...@sm...> wrote:
>> >> > I am getting the same ratio on both small and more targeted, and a quite large general LM. I do not understand what to make out of it!
>> >> >
>> >> > -kkm
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Nagendra Goel [mailto:nag...@go...]
>> >> >> Sent: 2015-06-22 2032
>> >> >> To: Kirill Katsnelson; kal...@li...
>> >> >> Subject: RE: [Kaldi-users] LM weight
>> >> >>
>> >> >> Or maybe your domain is limited and LM very nicely matched to the task at hand?
>> >> >>
>> >> >> -----Original Message-----
>> >> >> From: Kirill Katsnelson [mailto:kir...@sm...]
>> >> >> Sent: Monday, June 22, 2015 11:29 PM
>> >> >> To: kal...@li...
>> >> >> Subject: [Kaldi-users] LM weight
>> >> >>
>> >> >> In my test sets I am getting the best WER at LM/acoustic weight in the range of 18-19, with multiple LMs of different size and origin. I was usually thinking the usual ballpark figure is about 10, give or take. From your experience, does this larger LM weight mean anything, and what if it does? I am guessing an inadequate acoustic model, requiring more LM "pull"--am I making sense?
>> >> >>
>> >> >> -kkm
>> >> >>
>> >> >> _______________________________________________
>> >> >> Kaldi-users mailing list
>> >> >> Kal...@li...
>> >> >> https://lists.sourceforge.net/lists/listinfo/kaldi-users
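
To check the rule of thumb from the top reply (insertion rate roughly 1/3 to 1/2 of the deletion rate), it helps to look at insertions and deletions separately rather than at the total WER. Below is a minimal Python sketch of that check. It is not part of Kaldi: `count_errors` and `ins_del_ratio` are hypothetical helper names, and the alignment is a plain Levenshtein alignment that may break ties differently from Kaldi's own scoring tools. Kaldi's WER output normally lists ins/del/sub counts already, so in practice only the ratio step may be needed.

```python
from typing import List, NamedTuple, Optional


class ErrorCounts(NamedTuple):
    sub: int      # substitutions
    ins: int      # insertions (extra words in the hypothesis)
    dele: int     # deletions (reference words missing from the hypothesis)
    ref_len: int  # number of reference words


def count_errors(ref: List[str], hyp: List[str]) -> ErrorCounts:
    """Count sub/ins/del along one minimum-edit-distance alignment."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = (total_errors, sub, ins, del) for ref[:i] versus hyp[:j]
    dp = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, 0, i)      # everything deleted
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, j, 0)      # everything inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, a, d = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, a, d)              # match, no cost
            else:
                diag = (t + 1, s + 1, a, d)      # substitution
            t, s, a, d = dp[i][j - 1]
            ins = (t + 1, s, a + 1, d)           # insertion
            t, s, a, d = dp[i - 1][j]
            dele = (t + 1, s, a, d + 1)          # deletion
            # Ties are broken in the order diag, ins, dele (an arbitrary choice).
            dp[i][j] = min(diag, ins, dele, key=lambda c: c[0])
    _, s, a, d = dp[n][m]
    return ErrorCounts(sub=s, ins=a, dele=d, ref_len=n)


def ins_del_ratio(counts: List[ErrorCounts]) -> Optional[float]:
    """Insertions divided by deletions over a test set; None if no deletions."""
    total_ins = sum(c.ins for c in counts)
    total_del = sum(c.dele for c in counts)
    if total_del == 0:
        return None
    return total_ins / total_del


if __name__ == "__main__":
    pairs = [
        ("a b c d", "a b x c d"),  # one insertion
        ("a b c d", "a c d"),      # one deletion
        ("a b c d", "a c d"),      # one deletion
    ]
    counts = [count_errors(r.split(), h.split()) for r, h in pairs]
    print("ins/del ratio:", ins_del_ratio(counts))
```

A ratio well above 1/2 at the best LM weight would be consistent with the suggestion that the LM scale is compensating for insertions; in that case, sweeping an insertion penalty together with the LM weight is the next thing to try.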