CMU Sphinx / Forums / Help: Word recognition problem

Anonymous - 2003-09-23

Hi

I recently started testing Sphinx2. I created a small dictionary with 5 words, and everything works fine: word detection is very accurate. The only problem comes when I say a word that is not in the dictionary. The program displays one of the words from the dictionary, even when the spoken word is very different. How can I make the program display nothing when the word is not in the dictionary?

I have these words in the dictionary:
LEFT, RIGHT, UP, DOWN, STOP

If I say MACHINE, the program displays DOWN!?!

tnx for help

Martin

- Anonymous - 2003-09-24
  
  I'm not sure what your question is. Are you asking how to set it up so that it only displays words that match 100%, and not what it "thinks" you might have said?
  
  In that case, I might suggest playing around with the Language Weights/Penalties settings in the decoder setup. Try setting -inspen to 0 and see if that does anything. I'm not sure if you can set a threshold to reject the utterance if it doesn't score high enough in the search; perhaps someone a little more knowledgeable might have that answer.
  
  Also, does anyone think that training a word-based acoustic model would solve it?
  
  Good luck,
  Steve
  
- Anonymous - 2003-10-10
  
  Martin -- you have demonstrated a fundamental problem of a "pure" HMM speech recognizer, which becomes important with smaller language models, such as in your case. The HMM recognition estimates the most probable word sequence given the audio signal presented. Most probable of what? Of the word sequences under consideration.
  -- in the case of large vocabulary recognition (for which Sphinx was designed), there's an enormous set of alternative utterances to choose from, and the most probable of those is probably what's desired; if it's incorrect, the result will be something close. Depending on the application, that's probably OK.
  -- but in the case of a small language model, the question of out-of-vocabulary utterances becomes very important, and the "most probable" result doesn't answer another important question, "How probable is it that the answer is correct?"
  
  It's tempting to use the HMM recognition score as a measure of correctness, but for various reasons, that is invalid.
  
  The topics of rejection and confidence measures are relevant to your question. The latter topic in particular is fairly popular in the recent literature, and you'll find a number of papers by searching the WWW. Here are a few examples:
  http://www.sls.lcs.mit.edu/sls/publications/2002/Hazen_CSL02.pdf
  http://www.telecom.tuc.gr/paperdb/eurospeech99/PAPERS/S1O3/C059.PDF
  http://www.clsp.jhu.edu/publications/pubs/SL980401.pdf
  
  cheers,
  jerry wolf
  soliloquy learning inc.
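
A toy Python sketch of the point made in the reply above; it is not Sphinx2 code. Picking the best of the words under consideration always returns some word, and a score normalised over the vocabulary alone looks just as confident for an out-of-vocabulary input. Every number here (the log-likelihoods, the filler score, the `min_margin` threshold) is invented for illustration. Comparing the best word against a filler (background) model is one idea from the confidence-measure literature; nothing in this thread says Sphinx2 provides it.

```python
import math

def best_word(log_likes):
    """Closed-set decoding: return the highest-scoring in-vocabulary word."""
    return max(log_likes, key=log_likes.get)

def closed_set_posterior(log_likes):
    """Normalise the scores over the vocabulary only (softmax of log-likelihoods)."""
    top = max(log_likes.values())
    exps = {w: math.exp(s - top) for w, s in log_likes.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

def decode_with_rejection(log_likes, filler_log_like, min_margin=5.0):
    """Return the best word, or None if it fails to beat the filler model by min_margin."""
    word = best_word(log_likes)
    if log_likes[word] - filler_log_like < min_margin:
        return None  # rejected as probably out of vocabulary
    return word

# Hypothetical per-word log-likelihoods for two utterances (made-up values).
said_down = {"LEFT": -420.0, "RIGHT": -415.0, "UP": -430.0, "DOWN": -380.0, "STOP": -425.0}
said_machine = {"LEFT": -520.0, "RIGHT": -515.0, "UP": -530.0, "DOWN": -480.0, "STOP": -525.0}

# Closed-set decoding answers DOWN in both cases, and the vocabulary-only
# posterior is identical because only score differences matter.
print(best_word(said_down), best_word(said_machine))
print(closed_set_posterior(said_down)["DOWN"], closed_set_posterior(said_machine)["DOWN"])

# A filler model that fits the OOV utterance better than any word exposes the mismatch.
print(decode_with_rejection(said_down, filler_log_like=-400.0))
print(decode_with_rejection(said_machine, filler_log_like=-470.0))
```

In this sketch the raw scores differ between the two utterances, but they also shift with utterance length, speaker, and noise, which is one reason a fixed cutoff on the raw score alone is unreliable.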
  
- Christoffer Andersson - 2003-10-13
  
  That is a great reply, Jerry, but is there a way to get around it, or to reject out-of-vocabulary words in Sphinx2, without digging into the theory?
  
  I am very interested in using Sphinx as a command and control proggy, and then it is very important that a command is not guessed wrong, especially if the word is not even close, or out of vocabulary.
  
  Chris.
  
  - Anonymous - 2003-10-14
    
    AFAIK Sphinx doesn't come with any way of rejecting OOV words; it'll simply give you the best matching sequence of the words it knows about -- the ones in your language model. I wish I knew an easy answer, but I don't!
    
    jerry
    
- Anonymous - 2003-10-31
  
  Martin,
  I'm not sure if you're still looking for an answer to this problem. I'm beginning to evaluate a new commercial VR product called LumenVox. It's released as an SDK, primarily developed for telephone interaction systems, but it would also be applicable to command & control type applications. That product has the feature you're looking for here: it returns confidence scores for your utterances, and your code can easily reject presumed OOV words based on these scores. The site is www.lumenvox.com, and they have evaluation s/w you can download.
  Hope it helps,
  Steve
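
For a command-and-control application, rejecting on a confidence score as described above might look like the following Python sketch. The `Hypothesis` type, the assumption that confidence lies between 0 and 1, and the 0.8 threshold are all hypothetical; this is not the LumenVox or Sphinx2 API, and a real threshold would need tuning on recorded test utterances.

```python
from dataclasses import dataclass
from typing import Optional

COMMANDS = {"LEFT", "RIGHT", "UP", "DOWN", "STOP"}

@dataclass
class Hypothesis:
    """Hypothetical recogniser result: the best word plus a confidence score."""
    word: str
    confidence: float  # assumed to lie in [0, 1]

def accept_command(hyp: Hypothesis, threshold: float = 0.8) -> Optional[str]:
    """Return the command only if it is known and confident enough, else None."""
    if hyp.word in COMMANDS and hyp.confidence >= threshold:
        return hyp.word
    return None  # do nothing rather than act on a guess

# A confident result is acted on; a low-confidence one (e.g. MACHINE heard as DOWN) is dropped.
print(accept_command(Hypothesis("DOWN", 0.93)))
print(accept_command(Hypothesis("DOWN", 0.31)))
```

Raising the threshold trades more rejected valid commands for fewer wrong actions, which is usually the safer side to err on when a wrong command is costly.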
  
- Manohar Chapalamadugu - 2003-11-17
  
  Hi
  
  I downloaded Sphinx2 on my Windows XP machine and tried testing the batch files. I compiled SphinxExamples.dsw using VC 6.0. I also compiled the continuous classes so that the batch file would run.
  
  I ran simple.bat and the program doesn't even come close: the success rate is less than 1%. Is this what I should expect without training? Do I need to do something more to increase the success rate?
  
  I am looking for a speech recognition engine for embedded devices (low memory and little processing power) for my school project. Can somebody suggest an alternative speaker-independent recognition engine that needs no training? CSLU?
  
  Thanks.
  