The CMUCLMTK toolkit includes a mergeidngram executable, but it merges only idngram files. How can I merge the standard generic en-us language model from CMU Sphinx with our domain-specific language model? The downloaded model is in .dmp format, so how do I convert it to idngram?
The CMUCLMTK toolkit includes a mergeidngram executable, but it merges only idngram files. How can I merge the standard generic en-us language model from CMU Sphinx with our domain-specific language model?
CMUCLMTK has the lm_combine tool for combining language models.
You can also use SRILM (a more functional and modern toolkit); the command to mix language models is ''ngram -mix-lm''.
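To give a feel for what mixing two models means, here is a minimal Python sketch of linear interpolation on toy unigram tables. It is only an illustration, not how lm_combine or SRILM is implemented (real toolkits also handle higher-order n-grams and backoff weights). The function name interpolate, the weight lam, and the probabilities are all made up for the example.

```python
def interpolate(p_main, p_mix, lam=0.5):
    """Linearly interpolate two word-probability tables.

    lam is the weight given to p_main; (1 - lam) goes to p_mix.
    Words missing from one table are treated as probability 0 there.
    """
    vocab = set(p_main) | set(p_mix)
    return {
        w: lam * p_main.get(w, 0.0) + (1.0 - lam) * p_mix.get(w, 0.0)
        for w in vocab
    }


# Toy unigram distributions (hypothetical numbers).
generic = {"the": 0.6, "cat": 0.4}
domain = {"the": 0.5, "sphinx": 0.5}

mixed = interpolate(generic, domain, lam=0.7)
print(mixed)
```

Because both inputs sum to 1 and the weights sum to 1, the mixed table is still a valid distribution. A larger lam keeps the result closer to the generic model; a smaller one favors the domain model.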
The downloaded model is in .dmp format, so how do I convert it to idngram?
It is not possible: a .dmp file stores the model's probabilities, not the n-gram counts that an idngram file holds.
How do I use lm_combine?
What is the -weight argument?
When I tried merging the default CMU Sphinx LM with a custom language model, I got the following:
Reading in a 3-gram language model.
Number of 1-grams = 72354.
Number of 2-grams = 6581523.
Number of 3-grams = 7704188.
Reading unigrams...
Reading 2-grams...
Error - Repeated 2-gram in ARPA format language model.
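The error message suggests that the same word pair is listed more than once in the 2-grams section of one of the ARPA files. A rough Python sketch for locating such entries is below. It assumes a plain-text ARPA file with the usual \N-grams: section headers and whitespace-separated fields (log probability first, then the words); the function name find_repeated_ngrams and the sample data are hypothetical.

```python
import re
from collections import Counter

# Matches section headers such as "\2-grams:"
SECTION = re.compile(r"^\\(\d+)-grams:$")


def find_repeated_ngrams(lines):
    """Return {ngram_tuple: count} for n-grams listed more than once."""
    counts = Counter()
    order = 0  # 0 means we are outside any \N-grams: section
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = SECTION.match(line)
        if match:
            order = int(match.group(1))
            continue
        if line.startswith("\\"):  # other markers: \data\ or \end\
            order = 0
            continue
        if order == 0:
            continue  # header lines such as "ngram 1=2"
        fields = line.split()
        # fields[0] is the log10 probability; the next `order` fields are the words
        counts[tuple(fields[1:1 + order])] += 1
    return {ngram: c for ngram, c in counts.items() if c > 1}


# Tiny made-up ARPA file with one deliberately repeated 2-gram.
sample = """\\data\\
ngram 1=2
ngram 2=2

\\1-grams:
-0.3 hello -0.2
-0.3 world -0.2

\\2-grams:
-0.1 hello world
-0.4 hello world

\\end\\
""".splitlines()

repeated = find_repeated_ngrams(sample)
print(repeated)
```

To check a real model, pass an open file instead of the sample lines, for example find_repeated_ngrams(open("model.lm", encoding="utf-8")), where model.lm is a placeholder name. Note that the sketch keeps every n-gram in memory, so a model with millions of entries will need a fair amount of RAM.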