
Merging language models

  • CK

    CK - 2014-09-16

    The CMUCLMTK toolkit includes a mergeidngram executable, but it merges only idngram files. If I need to merge the standard generic en-us language model from CMU Sphinx with our domain-specific language model, how can I do that? Also, the downloaded model is in .dmp format; how can I convert it into an idngram file?

     
    • Nickolay V. Shmyrev

      > The CMUCLMTK toolkit includes a mergeidngram executable, but it merges only idngram files. If I need to merge the standard generic en-us language model from CMU Sphinx with our domain-specific language model, how can I do that?

      CMUCLMTK has the lm_combine tool for combining language models.

      You can also use SRILM (a more functional and modern toolkit); the command to mix language models is "ngram -mix-lm".
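For readers unsure what "mixing" two language models means: the usual technique is linear interpolation, where each probability is a weighted sum of the two models' probabilities (in SRILM the weight of the main model is set with -lambda). The toy sketch below shows only the arithmetic on unigram tables. It is not how SRILM or lm_combine are implemented; the function name `interpolate`, the parameter `lam`, and all the numbers are made up for illustration, and the back-off weights that real toolkits recompute are left out.

```python
def interpolate(p_generic, p_domain, lam=0.5):
    """Linearly interpolate two probability tables.

    lam is the weight of the first model; (1 - lam) goes to the second.
    A word missing from one table is treated as probability 0 there
    (an assumption of this toy example; real models back off instead).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    vocab = set(p_generic) | set(p_domain)
    return {
        w: lam * p_generic.get(w, 0.0) + (1.0 - lam) * p_domain.get(w, 0.0)
        for w in vocab
    }


# Toy unigram distributions (hypothetical numbers).
generic = {"hello": 0.5, "world": 0.5}
domain = {"hello": 0.2, "sphinx": 0.8}

# Give the generic model 70% of the weight and the domain model 30%.
mixed = interpolate(generic, domain, lam=0.7)
```

Because both inputs sum to 1 and the weights sum to 1, the mixed table is still a valid distribution. In practice the weight is usually tuned on held-out domain text rather than picked by hand.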

      > Also, the downloaded model is in .dmp format; how can I convert it into an idngram file?

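      That is not possible. An idngram file stores n-gram counts, whereas the .dmp file is a binary dump of the finished model (probabilities and back-off weights), so the counts cannot be recovered from it.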

       
      • Manoj Gaonkar

        Manoj Gaonkar - 2017-06-30

        How do I use lm_combine?

        What is the -weight argument?
        When I tried merging the default CMU Sphinx LM with a custom language model, I got the following:

        Reading in a 3-gram language model.
        Number of 1-grams = 72354.
        Number of 2-grams = 6581523.
        Number of 3-grams = 7704188.
        Reading unigrams...

        Reading 2-grams...
        Error - Repeated 2-gram in ARPA format language model.
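One way to investigate an error like this is to check whether the ARPA file really lists the same n-gram twice. The sketch below is a small diagnostic, not part of CMUCLMTK; the function name `find_repeated_ngrams` and the sample model text are hypothetical. It assumes the standard ARPA layout: a `\N-grams:` header per section, then one entry per line with the log probability first, followed by the N words and an optional back-off weight. It only detects literal duplicates, so an empty result means the cause lies elsewhere.

```python
import re
from collections import Counter

# Matches section headers such as "\2-grams:".
SECTION = re.compile(r"^\\(\d+)-grams:\s*$")


def find_repeated_ngrams(lines):
    """Return {order: [n-gram tuples listed more than once]} for ARPA text."""
    order = None
    counts = {}
    for raw in lines:
        line = raw.strip()
        m = SECTION.match(line)
        if m:
            order = int(m.group(1))
            counts[order] = Counter()
            continue
        if line == "\\end\\":
            break
        if order is None or not line:
            continue  # header lines before the first section, or blanks
        fields = line.split()
        # fields[0] is the log probability; the next `order` fields are words.
        ngram = tuple(fields[1:1 + order])
        if len(ngram) == order:
            counts[order][ngram] += 1
    return {n: [g for g, c in ctr.items() if c > 1] for n, ctr in counts.items()}


# Tiny made-up model with one bigram listed twice.
sample = """\\data\\
ngram 1=2
ngram 2=2

\\1-grams:
-1.0 hello -0.3
-1.0 world -0.3

\\2-grams:
-0.5 hello world
-0.7 hello world

\\end\\
""".splitlines()

repeated = find_repeated_ngrams(sample)
```

For a real model, pass an open file object instead of `sample`. With several million n-grams per section the counters hold every entry in memory, so expect the check to need a few gigabytes of RAM.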

         
