CMU Sphinx / Forums / Help: CMU SLMT does not work

NeoGermi - 2005-11-04

Hello,

I've already finished my project.
The exploding likelihood I described two months ago was caused by bad sound files and could not be repaired. So I had to switch to another corpus and, lo and behold ;-), it works fine :-)

But another problem has come up now that I am building my own language model and integrating it into my system. I didn't find a solution on the net, in the documentation, or in this forum. I first tried to build the binary LM as explained in the SLMT (statistical language modelling toolkit) documentation, and that didn't work, so I tried the ARPA version. The one explained in the sphinx4 documentation doesn't work either.
I get error messages like "Bad format" or similar. lm3g2dmp doesn't help either: first, it cannot handle case sensitivity, and second, it doesn't produce the right format. So why do we have it?

Could someone who has successfully built and integrated their own LM give some helpful tips?

Thanks a lot,

Sebastian

- NeoGermi - 2005-11-05
  
  So, I decided to build an LM with only lowercase words and tried to build the DMP file with the lm3g2dmp tool. The result is that I get
  a NullPointerException when starting the speech recognition, with the following message:
  
  java.lang.NullPointerException
  at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.getInitialSearchState(LexTreeLinguist.java:461)
  
  Does anyone have an idea what this could be?
  
  - Darren Remington - 2006-01-05
    
    I'm getting the same problem - the problem is definitely with the toolkit:
    
    I put my text corpus through the online QuickLM tool (this small corpus has 660 words), and the resulting LM works for me.
    
    When I run that exact same corpus through the CMU SLMT, I get the null pointer exception, specifically:
    
    Exception in thread "main" java.lang.NullPointerException
    at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.getInitialSearchState(LexTreeLinguist.java:461)
    at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.compileGrammar(LexTreeLinguist.java:487)
    at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.allocate(LexTreeLinguist.java:406)
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.allocate(WordPruningBreadthFirstSearchManager.java:323)
    at edu.cmu.sphinx.decoder.Decoder.allocate(Decoder.java:109)
    at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:182)
    ====================================================================
    
    I noticed three things:
    1. the working LM has all units in UPPERCASE and the non-working LM has all units in lowercase
    2. the working LM has entries for silence tags - <S> and </S> and the non-working LM does not.
    3. the working LM has no entry for the unknown tag - <UNK> and the non-working LM does.
    
    I need to get the CMU SLMT working - I have a corpus of 5104 words ready to be tested and I can't build the LM using the QuickLM for that many words.
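
The three differences listed above can be checked mechanically before loading an LM into the recognizer. What follows is a minimal sketch, not part of either toolkit: the helper name `inspect_arpa_unigrams` is made up for illustration, and it assumes the LM is a text file in ARPA format, where the `\1-grams:` section lists one entry per line as log probability, word, and an optional backoff weight.

```python
def inspect_arpa_unigrams(lines):
    """Summarize the unigram section of an ARPA-format LM (hypothetical helper)."""
    words = []
    in_unigrams = False
    for raw in lines:
        line = raw.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams:
            # Any other backslash-prefixed line (\2-grams:, \end\) ends the section.
            if line.startswith("\\"):
                break
            fields = line.split()
            # Entry layout: log10 probability, word, optional backoff weight.
            if len(fields) >= 2:
                words.append(fields[1])

    markers = {"<s>", "</s>", "<unk>"}
    # Case is judged on ordinary words only, not on the special tags.
    plain = [w for w in words if w.lower() not in markers]
    return {
        "has_sentence_start": any(w.lower() == "<s>" for w in words),
        "has_sentence_end": any(w.lower() == "</s>" for w in words),
        "has_unk": any(w.lower() == "<unk>" for w in words),
        "all_uppercase": bool(plain) and all(w == w.upper() for w in plain),
        "all_lowercase": bool(plain) and all(w == w.lower() for w in plain),
    }
```

Passing an open file object works, since the function only iterates over lines. Comparing the result for the working QuickLM output against the SLMT output should show the same three differences at a glance.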
    
  - Darren Remington - 2006-01-06
    
    FOUND IT - You need to have entries for silence tags ... <s> and </s> ...
    
    I added the silence tags to my text corpus (at the beginning and end of each sentence) and the LM I created using the CMU SLMT worked just fine.
    
    Good Luck on yours.
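
For a larger corpus, adding the tags by hand gets tedious. The fix described above could be scripted roughly as follows. This is only a sketch under the assumption that the corpus holds one sentence per line; the function name `add_sentence_markers` is invented for illustration and is not part of the CMU SLMT.

```python
def add_sentence_markers(lines):
    """Wrap each non-empty line in <s> ... </s>, leaving existing markers alone."""
    marked = []
    for raw in lines:
        sentence = raw.strip()
        if not sentence:
            continue  # blank lines carry no sentence, so drop them
        if not sentence.startswith("<s>"):
            sentence = "<s> " + sentence
        if not sentence.endswith("</s>"):
            sentence = sentence + " </s>"
        marked.append(sentence)
    return marked
```

The returned lines can be written to a new corpus file, which is then fed to the toolkit in place of the original text. Lines that already carry a marker are not marked twice, so running the function again on its own output changes nothing.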
    
- NeoGermi - 2005-11-19
  
  No one has any idea? Does no one have experience with the CMU SLMT? I can't believe that!
  
- yazanj - 2005-11-26
  
  I'm having similar problems. I tried to build my own language model using the CMU SLMT, but Sphinx 4 complained about an unexpected EOF in the .lm file.
  