From: Kirill K. <kir...@sm...> - 2015-06-23 03:37:14
Yes, it's been fixed. Thanks again Guoguo!
-kkm
> -----Original Message-----
> From: Guoguo Chen [mailto:che...@gm...]
> Sent: 2015-06-22 1920
> To: Daniel Povey
> Cc: Kirill Katsnelson; kal...@li...
> Subject: Re: [Kaldi-developers] spaces in ngram declarations in the
> \data\ section
>
> I checked in a fix for this. Kirill, could you have a try and see if
> this fixes your problem?
>
> Guoguo
> --
>
> On Mon, Jun 22, 2015 at 3:56 PM, Daniel Povey <dp...@gm...> wrote:
>
>
> I don't know if there is a formal definition of the ARPA format;
> things like this come up occasionally.
> The easiest thing is to just allow the format
> ngram= 12344
> as well as
> ngram = 12345,
> and also print a warning for any lines after the \data\ marker that
> are not interpretable.
> Guoguo, could you do this?
>
>
> Dan
>
>
>
> On Mon, Jun 22, 2015 at 3:45 PM, Kirill Katsnelson
> <kir...@sm...> wrote:
> > Some LM files have spaces in ngram declarations in the \data\ section:
> >
> >
> > \data\
> > ngram 1=150000
> > ngram 2= 9774628
> > ngram 3= 44845299
> >
> >
> > \1-grams:
> > -7.89095 <s> -2.06214
> > -2.92635 don't -1.85988
> >
> > arpa-to-const-arpa does not like them in a peculiar way. Namely, it
> > bombs out on the 1st unigram with a backoff weight, because it decided
> > 1 is the final order. Looking at the code, the library code in
> > src/lm/const-arpa-lm.cc does not expect any space except between
> > "ngram" and the rest of the line, silently skipping any lines that
> > begin with "ngram" and tokenize on space into fewer or more than 2
> > tokens. See line 316:
> >   if (keyword_found && col.size() == 2 && col[0] == "ngram") {
> > -- not even a warning if "col.size() == 2" is false.
> >
> > Are these spaces legit? Should the tool be fixed, or the grammar?
> > I never saw a formal spec of ARPA LM in my life.
> >
> > -kkm
> >
> >
>
>
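The behavior Dan asks for in the thread (accept optional spaces around
"=" in the ngram declarations, and warn about lines in the \data\ section
that cannot be interpreted) can be sketched roughly as below. This is an
illustration only, not the fix that was checked into Kaldi. The names
Trim, IsAllDigits, SplitOnSpaces, ParseNgramCountLine and ReadDataSection
are hypothetical and are not part of Kaldi's API. The sketch also assumes
that "ngram= 12344" in Dan's message is shorthand for a declaration of
the form "ngram <order>= <count>", and the length limits on the numbers
are arbitrary sanity checks.

```cpp
#include <cctype>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Strips leading and trailing whitespace.
std::string Trim(const std::string &s) {
  const char *ws = " \t\r\n";
  size_t b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  size_t e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// True if s is non-empty and consists only of decimal digits.
bool IsAllDigits(const std::string &s) {
  if (s.empty()) return false;
  for (unsigned char c : s)
    if (!std::isdigit(c)) return false;
  return true;
}

// Whitespace tokenization, as a strict "exactly two tokens" check sees it.
// "ngram 2= 9774628" yields three tokens, so such a check skips the line.
std::vector<std::string> SplitOnSpaces(const std::string &line) {
  std::istringstream iss(line);
  std::vector<std::string> tokens;
  std::string tok;
  while (iss >> tok) tokens.push_back(tok);
  return tokens;
}

// Hypothetical helper: parses "ngram <order>=<count>", tolerating
// whitespace on either side of '='. Returns false on any other input.
bool ParseNgramCountLine(const std::string &line, int *order,
                         long long *count) {
  const std::string keyword = "ngram";
  std::string t = Trim(line);
  if (t.compare(0, keyword.size(), keyword) != 0) return false;
  // Require whitespace right after the keyword.
  if (t.size() <= keyword.size() ||
      !std::isspace(static_cast<unsigned char>(t[keyword.size()])))
    return false;
  std::string rest = t.substr(keyword.size());
  size_t eq = rest.find('=');
  if (eq == std::string::npos) return false;
  std::string order_str = Trim(rest.substr(0, eq));
  std::string count_str = Trim(rest.substr(eq + 1));
  if (!IsAllDigits(order_str) || !IsAllDigits(count_str)) return false;
  // Arbitrary sanity limits; they also keep stoi/stoll from overflowing.
  if (order_str.size() > 3 || count_str.size() > 18) return false;
  *order = std::stoi(order_str);
  *count = std::stoll(count_str);
  return true;
}

// Reads the \data\ header from `in` and stores each count at
// (*counts)[order - 1]. Every non-blank line after \data\ that is not a
// valid declaration is reported on `warn`. Reading stops at the next line
// that starts with a backslash, e.g. "\1-grams:". Returns the number of
// warnings printed.
int ReadDataSection(std::istream &in, std::vector<long long> *counts,
                    std::ostream &warn) {
  std::string line;
  bool in_data = false;
  int num_warnings = 0;
  while (std::getline(in, line)) {
    std::string t = Trim(line);
    if (!in_data) {
      if (t == "\\data\\") in_data = true;
      continue;
    }
    if (t.empty()) continue;
    if (t[0] == '\\') break;  // start of the next section
    int order = 0;
    long long count = 0;
    if (ParseNgramCountLine(t, &order, &count) && order >= 1) {
      if (counts->size() < static_cast<size_t>(order))
        counts->resize(order, 0);
      (*counts)[order - 1] = count;
    } else {
      warn << "WARNING: uninterpretable line in \\data\\ section: "
           << line << "\n";
      ++num_warnings;
    }
  }
  return num_warnings;
}
```

A caller could pass an std::ifstream opened on the ARPA file together
with std::cerr as the warning stream; the highest order is then the size
of the resulting counts vector. With Kirill's example header, all three
declarations are accepted by this sketch, so the order would be 3 rather
than 1.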