From: Mailing l. u. f. U. C. a. U. <kal...@li...> - 2013-08-01 18:59:49
Following the formal s5 scripts in wsj and the directions below, I was able to get word timings that ROUGHLY matched the decoding values I was getting.

In the decode file (20.txt, decoding with a beam of 20):

02.cut1 YOU ARE STANDING ON A SANDY WHITE BEACH OF THE ASSISTED AND END A THAN ANY OTHER KIND CONSISTED . . .

In the timings.all.txt file:

02.cut1 1 2.64 0.03 <UNK>
02.cut1 1 3.06 0.11 YOU
02.cut1 1 3.17 0.22 ARE
02.cut1 1 3.39 0.40 STANDING
02.cut1 1 4.06 0.23 ON
02.cut1 1 4.29 0.06 A
02.cut1 1 4.35 0.12 IS
02.cut1 1 4.57 0.75 INVIOLATE
02.cut1 1 5.43 1.24 ECOSYSTEM
02.cut1 1 6.88 0.35 NESTS
02.cut1 1 7.64 0.65 ASSISTED(2)
02.cut1 1 8.29 0.19 OUT
02.cut1 1 9.76 1.06 ENLISTED(2)
02.cut1 1 10.82 0.41 WOULD
02.cut1 1 11.23 0.59 AN
02.cut1 1 11.82 0.79 ALAN'S
02.cut1 1 12.67 1.03 NETTLESOME
02.cut1 1 13.84 1.00 INSISTED
02.cut1 1 14.84 0.21 AND

Here are the commands:

%utils/mkgraph.sh data/local/g300_test/lang exp/tri1 exp/tri1/graph

%steps/decode.sh --nj 10 --model exp/tri1/final.mdl --num-threads 1 --acwt 0.1 --cmd "$decode_cmd" --config conf/decode.config exp/tri1/graph data/local/g300_test exp/tri1/decode_g300_test

%lattice-1best "ark:gunzip -c exp/tri1/decode_g300_test/lat.*.gz|" ark:- | \
    lattice-align-words ./data/local/g300_test/lang/phones/word_boundary.int exp/tri1/final.mdl ark:- ark:- | \
    nbest-to-ctm ark:- - | \
    ./utils/int2sym.pl -f 5 ./data/local/g300_test/lang/words.txt > exp/tri1/decode_g300_test/timings.all.txt

LOG (lattice-1best:main():lattice-1best.cc:88) Done converting 339 to best path, 0 had errors.
LOG (lattice-align-words:main():lattice-align-words.cc:117) Successfully aligned 339 lattices; 0 had errors.
LOG (nbest-to-ctm:main():nbest-to-ctm.cc:95) Converted 339 linear lattices to ctm format; 0 had errors.

This pipeline should use the lattices generated by the decoding, together with the word_boundary.int and words.txt files. I can apply weights for the language model and the acoustic model, but I doubt that will have a great effect.
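To see exactly where the two outputs diverge, the CTM can be collapsed back into one line per utterance and diffed against the decode file. This is only a minimal sketch: it assumes the five-column CTM layout shown above (utterance, channel, start, duration, word), and the function name ctm_to_text is made up for illustration, not part of Kaldi.

```shell
#!/bin/sh
# ctm_to_text (hypothetical helper, not a Kaldi script):
# reads CTM lines "utt channel start dur word" on stdin and prints one
# "utt word word ..." line per utterance, in order of first appearance.
ctm_to_text() {
    awk '
        NF >= 5 {
            # First time we see this utterance: remember its position
            # and start its output line with the utterance id.
            if (!($1 in text)) { order[++n] = $1; text[$1] = $1 }
            # Append the word (5th column) to that utterance line.
            text[$1] = text[$1] " " $5
        }
        END { for (i = 1; i <= n; i++) print text[order[i]] }
    '
}
```

For example, running ctm_to_text < exp/tri1/decode_g300_test/timings.all.txt and diffing the result against 20.txt would put the two word sequences for 02.cut1 side by side. Lines with fewer than five fields are skipped rather than reported.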
The words.txt file must be correct, since I am getting "similar" results. Anyway, any help is appreciated.

Nathan