From: Jan T. <jt...@gm...> - 2015-06-16 01:22:01
Does your LM or text contain any characters that one could call "special"? arpa2fst is probably one of the more horrible parts of Kaldi. I remember having similar issues a year ago; it turned out that some character was "special" for arpa2fst. Dan fixed it, but I don't recall how, or whether the same problem could appear again with a different character.
y.

On Mon, Jun 15, 2015 at 9:14 PM, Kirill Katsnelson <kir...@sm...> wrote:
> I have a small set of sentences with repeat counts, and I am generating an LM
> out of it. One LM is generated by a horrible local tool, and I have trouble
> tracing exactly how. For this one, L*G composition takes about 20 seconds on my CPU.
> Another LM I just generated from the same files with srilm 1.7.1
> ngram-count. This one has been sitting in mkgraphs.sh on the L_disambig*G
> composition step for about 30 minutes, and is still churning.
> fstdeterminizestar --use-log=true is running at 100%. L_disambig.fst is the
> same file in both cases. It looks like G is making it non-determinizable,
> although I have no idea how that came to be.
>
> Could anyone share advice on tracking down the problem? Thanks.
>
> -kkm
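One way to act on the suggestion above is to scan the ARPA file for words that look unusual before handing it to arpa2fst. The following is only a minimal sketch: the thread does not say which character caused the earlier problem, so the checks here (the `<eps>` symbol, words shaped like disambiguation symbols such as `#0`, and control, format, or whitespace-like characters) are assumptions about what might be worth a look, and the names `suspicious_reason` and `scan_arpa` are hypothetical.

```python
import re
import unicodedata

# Assumption: an illustrative list of risky cases, not an official Kaldi list.
RESERVED = {"<eps>"}               # epsilon symbol in OpenFst symbol tables
DISAMBIG = re.compile(r"^#\d+$")   # shaped like Kaldi disambiguation symbols (#0, #1, ...)
NGRAM_HEADER = re.compile(r"^\\(\d+)-grams:\s*$")


def suspicious_reason(word):
    """Return a short reason if `word` looks risky, otherwise None."""
    if word in RESERVED:
        return "reserved symbol"
    if DISAMBIG.match(word):
        return "looks like a disambiguation symbol"
    for ch in word:
        category = unicodedata.category(ch)
        if category.startswith("C"):
            return "control or format character"
        if category.startswith("Z"):
            return "whitespace-like character"
    return None


def scan_arpa(lines):
    """Yield (line_number, word, reason) for suspicious words in the n-gram sections."""
    order = 0  # 0 means we have not reached an "\N-grams:" section yet
    for lineno, line in enumerate(lines, 1):
        line = line.strip(" \t\r\n")
        header = NGRAM_HEADER.match(line)
        if header:
            order = int(header.group(1))
            continue
        if line.startswith("\\end\\"):
            break
        if order == 0 or not line:
            continue
        # ARPA layout: log-probability, then `order` words, then an optional backoff weight.
        # Split on spaces and tabs only, so odd whitespace stays inside the word.
        fields = re.split(r"[ \t]+", line)
        for word in fields[1:1 + order]:
            reason = suspicious_reason(word)
            if reason:
                yield lineno, word, reason
```

To use it, pass an open file, for example `scan_arpa(open(path, encoding="utf-8"))`, and print each hit with `repr(word)` so that invisible characters show up. A clean report does not prove that G is determinizable; it only rules out this one class of problem.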