CMU Sphinx / Forums / Help: Using a new model in pocketsphinx

Tamirru - 2013-12-23

Hi,
I am new to CMU Sphinx.
My current pocketsphinx arguments are as follows:
-hmm ../../model/hmm/wsj
-lm ../../model/lm/en/language_model.arpaformat.DMP
-dict ../../model/lm/en/dic.dic

I use the Windows installation of pocketsphinx,
and I want to use this model:
http://cmusphinx.sourceforge.net/2013/01/a-new-english-language-model-release/
from which I've downloaded 2 files: en-70k.lm and en-70k-0.1.lm.gz.md5

What should I do to use this model with my pocketsphinx setup?
Also, do I need to use specific audio input parameters (sample rate, mono, etc.)?

Thx!

Last edit: Tamirru 2013-12-23

Nickolay V. Shmyrev - 2013-12-23

Download the new language model en-us.lm.dmp here:

https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English%20Generic%20Language%20Model/

Download the new acoustic model en-us here:

http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English%20Generic%20Acoustic%20Model/en-us.tar.gz/download

For the dictionary, use cmu07.dic from the pocketsphinx distribution.

Run it the same way as before, with -hmm, -lm and -dict.
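
As a rough illustration of the invocation described above, the sketch below assembles a pocketsphinx_continuous command line with the three model options. The paths are assumptions (they only mirror the directory layout from the first post, with the downloaded files dropped in), and the helper name `build_pocketsphinx_args` is hypothetical, not part of pocketsphinx.

```python
import shlex


def build_pocketsphinx_args(hmm_dir, lm_file, dict_file, infile=None):
    """Build an argument list for pocketsphinx_continuous (hypothetical helper)."""
    args = [
        "pocketsphinx_continuous",
        "-hmm", hmm_dir,      # acoustic model directory (unpacked en-us)
        "-lm", lm_file,       # language model (en-us.lm.dmp)
        "-dict", dict_file,   # pronunciation dictionary (cmu07.dic)
    ]
    if infile is not None:
        # Decode a file instead of listening on the microphone
        args += ["-infile", infile]
    return args


# Assumed paths, adjust to wherever the downloads were unpacked
cmd = build_pocketsphinx_args(
    "../../model/hmm/en-us",
    "../../model/lm/en/en-us.lm.dmp",
    "../../model/lm/en/cmu07.dic",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run(cmd)` once the paths point at real files.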

Tamirru - 2013-12-23

Thx, it works with the dmp file, but I would like to use en-70k.lm.
How should I run it? I tried passing it to -lm (instead of the dmp file) but got blank output.

Nickolay V. Shmyrev - 2013-12-23

You cannot use the 70k LM with pocketsphinx; you need to limit the vocabulary to fewer than 60k words (or to 20k, as in the dmp file).
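
One possible first step toward a smaller vocabulary is to pick the most probable words from the unigram section of the ARPA file. The sketch below does only that: it assumes a standard ARPA text layout (`\1-grams:` section with `logprob word [backoff]` lines), the function name `top_unigrams` and the sample data are made up for illustration, and the language model itself would still have to be rebuilt or pruned with an LM toolkit for the selected word list.

```python
def top_unigrams(arpa_lines, limit):
    """Return the `limit` most probable words from the \\1-grams: section."""
    in_unigrams = False
    scored = []
    for raw in arpa_lines:
        line = raw.strip()
        if line.startswith("\\"):
            # Section markers look like \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = (line == "\\1-grams:")
            continue
        if not in_unigrams or not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        # fields[0] is the log10 probability, fields[1] is the word
        scored.append((float(fields[0]), fields[1]))
    # Highest log probability first; ties keep their order in the file
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored[:limit]]


# Tiny made-up ARPA fragment, not taken from en-70k.lm
sample = """\\data\\
ngram 1=4
ngram 2=1

\\1-grams:
-1.0 the -0.3
-2.5 sphinx -0.1
-1.5 model -0.2
-3.0 zebra

\\2-grams:
-0.5 the model

\\end\\
""".splitlines()

kept = top_unigrams(sample, 3)
print(kept)
```

For a real file, the same function can be fed an open file handle, for example `top_unigrams(open("en-70k.lm", encoding="utf-8"), 20000)`.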

- kunal sharma - 2017-05-18
  
  How can I limit the vocabulary in the LM file?
  
  - Nickolay V. Shmyrev - 2017-05-18
    
    This thread is not relevant anymore; pocketsphinx should work with the model above.
    
    - misha mihail - 2017-05-18
      
      Thank you for your reply, Nickolay. I only want a few words (start,
      stop, next, back) for my video player. Can I use
      android-pocketsphinx-demo with new.lm and new.dic, or do I have to
      build a dmp file?
      Please leave a message for me. Thank you!
      
      Last edit: Nickolay V. Shmyrev 2017-05-18
      
      - Nickolay V. Shmyrev - 2017-05-18
        
        You already asked your question in another thread:
        
        https://sourceforge.net/p/cmusphinx/discussion/help/thread/16802cf0/
        
Tamirru - 2013-12-23

Ohh.. OK, thanks Nickolay!
Can you recommend another good model for video transcription? (I don't care about real-time performance; I need high accuracy.)

Nickolay V. Shmyrev - 2013-12-23

I already recommended good models in my first reply.

Tamirru - 2013-12-23

I thought video transcription might have a specific model.
OK, I'll try it out. Thx again!

rupy - 2014-05-21

So I tried these models, and they work better... but I still get only one or two words at a time, and it is very slow. Is there anything that could improve the speed and the length of the recognized sentences?
