From: Kirill K. <kir...@sm...> - 2015-06-16 01:14:31
I have a small set of sentences with repeat counts, and I am generating an LM from it. One LM was generated by a horrible local tool whose workings I have trouble tracing exactly. For this one, the L*G composition takes about 20 seconds on my CPU.

Another LM I just generated from the same files with srilm 1.7.1 ngram-count. This one has been sitting in mkgraphs.sh at the L_disambig*G composition step for about 30 minutes and is still churning; fstdeterminizestar --use-log=true is running at 100% CPU. L_disambig.fst is the same file in both cases.

It looks like this G is making the composition non-determinizable, although I have no idea how it came to be that way. Could anyone share advice on tracking down the problem?

Thanks.

-kkm