Hi,
we're trying to improve the recognition task with decode_biglm.sh. We've created two FST language models (G with 32000 words and G' with 137000 words) and built the HCLG graph with the tri1 acoustic model. We then ran decode_biglm.sh (a sketch of a typical invocation follows this message) and obtained worse results than with decode.sh. We were wondering why we obtained these results:
- decode.sh -> WER = 37.84%
- decode_biglm.sh -> WER = 60.81%
Thanks
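For reference, a typical decode_biglm.sh invocation follows the pattern below. The directory names (exp/tri1/graph_small, data/lang_test_small, data/lang_test_big, data/test) are placeholders for this setup, and the argument order (graph dir, old G.fst, new G.fst, data dir, decode dir) is the one used in the standard Kaldi example recipes; check the script's usage message if your version differs.

    # Sketch only; all paths here are placeholders for this setup.
    # HCLG in exp/tri1/graph_small is built from the small LM; the big G.fst is
    # applied on the fly during decoding, with the small LM's scores subtracted.
    # Note: both G.fst files must share the same words.txt (see the replies below).
    steps/decode_biglm.sh --nj 8 --cmd run.pl \
      exp/tri1/graph_small \
      data/lang_test_small/G.fst data/lang_test_big/G.fst \
      data/test exp/tri1/decode_test_biglm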
You could try increasing the beam or max tokens for the biglm decoder.
On Friday, June 13, 2014, sprieto sprieto@users.sf.net wrote:
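To try the suggestion above, wider search settings can usually be passed straight to the script. This assumes your copy of steps/decode_biglm.sh exposes --beam and --max-active through parse_options.sh, as steps/decode.sh does; if not, the defaults can be edited inside the script. A wider search mainly tells you whether the accuracy gap is just search error.

    # Sketch only; option names and paths are assumptions about this setup.
    steps/decode_biglm.sh --nj 8 --cmd run.pl \
      --beam 16.0 --max-active 14000 \
      exp/tri1/graph_small \
      data/lang_test_small/G.fst data/lang_test_big/G.fst \
      data/test exp/tri1/decode_test_biglm_wide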
The "biglm" decoder is not intended for situations where the vocabulary
differs. If the two G.fst's were built with "words.txt" files that are
different, you will get nonsense results.
Dan
On Fri, Jun 13, 2014 at 8:05 AM, Paul Dixon edobashira@users.sf.net wrote:
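A quick way to check the condition Dan describes is to compare the word symbol tables the two G.fst's were compiled against. The lang-directory paths below are placeholders for whichever directories hold the small and the big LM in this setup.

    # The biglm decoder assumes both G.fst's use identical word IDs,
    # i.e. both were compiled with the same words.txt.
    if ! cmp -s data/lang_test_small/words.txt data/lang_test_big/words.txt; then
      echo "words.txt differs between the two LMs;" \
           "rebuild the big LM on the same word list as the small one" >&2
      exit 1
    fi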
Hi,
We understand that we have to create a small G (32000 words) and build the HCLG.fst graph with it, then build a new, larger G' (137000 words, which include the 32000 words in G). Is this the way to use decode_biglm.sh with G.fst and G'.fst? (The graph-building step is sketched after this message.)
We have done this but did not get better results than with decode.sh using HCLG.fst.
Thanks
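For what it's worth, the graph-building half of that workflow would look roughly as below (utils/mkgraph.sh is the standard Kaldi graph-building script; the directory names are placeholders), with the caveat from the reply above that the second LM has to be built over the same words.txt rather than over a larger vocabulary.

    # Build HCLG from the small LM only; the big LM never goes into the graph.
    utils/mkgraph.sh data/lang_test_small exp/tri1 exp/tri1/graph_small
    # Decoding then uses this graph plus both G.fst's, as sketched earlier
    # in the thread with steps/decode_biglm.sh.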
Sorry, we did not understand your reply. We think we've got the point now: the main idea of the decode_biglm.sh script is to use two LMs, the small one to create the lattices and the big one for rescoring. Is this correct? For that purpose, as you said, we have to create the two LMs with the same vocabulary. But also with the same amount of text? If so, what is the difference between the two LMs? Could it be the n-gram order (for instance, a 2-gram model for the small one and a 5-gram model for the big one)? Otherwise, there should be a way to create FSTs of different sizes from the same n-gram model (both being 5-gram, for example), correct? Is that possible with the arpa2fst script? (One possible workflow is sketched after this message.)
Thank you so much in advance,
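One common way to get two LMs of different sizes over the same vocabulary is to keep one big ARPA model and derive the small one from it by entropy pruning (or by training a lower-order model on the same word list), then compile both against the same words.txt. The sketch below uses SRILM's ngram tool and assumes an arpa2fst that accepts --read-symbol-table; older versions used an fstcompile pipeline (or the recipe's LM formatting script) instead. Every file name here is a placeholder.

    # 1) Derive a smaller LM from the big one by pruning, keeping the vocabulary.
    ngram -order 5 -lm lm_big.arpa.gz -prune 1e-7 -write-lm lm_small.arpa.gz

    # 2) Compile both ARPA files against the *same* words.txt.
    for lm in small big; do
      cp -r data/lang data/lang_test_${lm}
      gunzip -c lm_${lm}.arpa.gz | \
        arpa2fst --disambig-symbol='#0' \
          --read-symbol-table=data/lang/words.txt \
          - data/lang_test_${lm}/G.fst
    done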