Word recognition problem

Help
Anonymous
2003-09-23
2012-09-22
  • Anonymous

    Anonymous - 2003-09-23

    Hi

    I recently started testing Sphinx2. I created a small dictionary with 5 words. Everything works fine and word detection is very accurate. The only problem is when I say a word that is not in the dictionary: the program displays one of the words from the dictionary, even though the spoken word is very different. How can I make the program display nothing when the word is not in the dictionary?

    I have these words in the dictionary:
    LEFT, RIGHT, UP, DOWN, STOP

    If I say MACHINE, the program displays DOWN!?!

    tnx for help

    Martin

     
    • Anonymous

      Anonymous - 2003-09-24

      I'm not sure what your question is.  Are you asking how to make it display only words that match 100%, rather than what it "thinks" you might have said?

      In that case, I'd suggest playing around with the Language Weights/Penalties settings in the decoder setup.  Try setting -inspen to 0 and see if that does anything.  I'm not sure whether you can set a threshold to reject the utterance if it doesn't score high enough in the search; perhaps someone a little more knowledgeable has that answer.

      Also, does anyone think that training a word-based acoustic model would solve it?

      Good luck,
      Steve

       
    • Anonymous

      Anonymous - 2003-10-10

      Martin -- you have demonstrated a fundamental problem of a "pure" HMM speech recognizer, which becomes important with smaller language models, such as in your case.  The HMM recognition estimates the most probable word sequence given the audio signal presented.  Most probable of what?  Of the word sequences under consideration.
        -- in the case of large vocabulary recognition (for which Sphinx was designed), there's an enormous set of alternative utterances to choose from, and the most probable of those is probably what's desired; if it's incorrect, the result will be something close.  Depending on the application, that's probably OK.
        -- but in the case of a small language model, the question of out-of-vocabulary utterances becomes very important, and the "most probable" result doesn't answer another important question, "How probable is it that the answer is correct?"
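      The point above can be shown with a toy example. This is only a sketch: the log-likelihood numbers are invented for illustration, and `recognize` and `posterior_within_vocabulary` are hypothetical helpers, not Sphinx2 functions. It shows that picking the best of five words always yields one of those five words, and that normalizing the scores over the same closed vocabulary can look very "confident" even when the spoken word was never a candidate.

```python
import math

# Hypothetical acoustic log-likelihoods for ONE utterance in which the
# speaker actually said "MACHINE". The numbers are invented for illustration.
log_likelihoods = {
    "LEFT": -4210.0,
    "RIGHT": -4195.0,
    "UP": -4302.0,
    "DOWN": -4150.0,
    "STOP": -4188.0,
}


def recognize(scores):
    """Return the best-scoring vocabulary word; 'none of these' is not an option."""
    return max(scores, key=scores.get)


def posterior_within_vocabulary(scores):
    """Normalize scores over the closed vocabulary (sums to 1 by construction)."""
    top = max(scores.values())
    weights = {word: math.exp(s - top) for word, s in scores.items()}
    total = sum(weights.values())
    return {word: w / total for word, w in weights.items()}


best = recognize(log_likelihoods)
posterior = posterior_within_vocabulary(log_likelihoods)
print(best, posterior[best])
```

      With these made-up numbers the winner is DOWN, and its share of the probability mass among the five words is close to 1, which says nothing about whether the utterance was one of the five words at all.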

      It's tempting to use the HMM recognition score as a measure of correctness, but for various reasons, that is invalid.

      The topics of rejection and confidence measures are relevant to your question.  The latter topic in particular is fairly popular in the recent literature, and you'll find a number of papers by searching the WWW.  Here are a few examples:
        http://www.sls.lcs.mit.edu/sls/publications/2002/Hazen_CSL02.pdf
        http://www.telecom.tuc.gr/paperdb/eurospeech99/PAPERS/S1O3/C059.PDF
        http://www.clsp.jhu.edu/publications/pubs/SL980401.pdf
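
      To make the idea of a confidence measure a little more concrete, here is a minimal sketch of one generic approach: compare the best word's score with the score of a background ("filler" or "garbage") model on the same audio, normalized by utterance length. This is an assumption-laden illustration, not the method of any particular paper linked above and not a Sphinx2 feature; the function names, the example scores, and the threshold are all hypothetical, and a real threshold would have to be tuned on recorded data.

```python
def confidence(best_word_loglik, filler_loglik, num_frames):
    """Per-frame log-likelihood ratio of the best word vs. a background model.

    Positive values mean the vocabulary word explains the audio better than
    the background model; negative values suggest an out-of-vocabulary input.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (best_word_loglik - filler_loglik) / num_frames


def accept(best_word_loglik, filler_loglik, num_frames, threshold=0.0):
    """Accept the hypothesis only if its confidence reaches the threshold."""
    return confidence(best_word_loglik, filler_loglik, num_frames) >= threshold


# Invented scores: an in-vocabulary utterance and an out-of-vocabulary one.
in_vocab = accept(best_word_loglik=-4000.0, filler_loglik=-4100.0, num_frames=100)
out_of_vocab = accept(best_word_loglik=-4150.0, filler_loglik=-4100.0, num_frames=100)
print(in_vocab, out_of_vocab)
```

      Unlike the raw recognition score, the ratio is measured against an alternative explanation of the same audio, which is why it is more usable for rejection.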

      cheers,
        jerry wolf
        soliloquy learning inc.

       
    • Christoffer Andersson

      That is a great reply Jerry, but is there a way to get around it, or reject words that are out of vocabulary in Sphinx2 without digging into the theory?

      I am very interested in using Sphinx as a command and control proggy, and then it is very important that a command is not guessed wrong, especially if the word is not even close, or out of vocabulary.

      Chris.

       
      • Anonymous

        Anonymous - 2003-10-14

        AFAIK Sphinx doesn't come with any way of rejecting OOV words; it'll simply give you the best matching sequence of the words it knows about -- the ones in your language model.  I wish I knew an easy answer, but I don't!

        jerry

         
    • Anonymous

      Anonymous - 2003-10-31

      Martin,
      I'm not sure if you're still looking for an answer to this problem.  I'm beginning to evaluate a new commercial VR product called LumenVox.  It's released as an SDK, primarily developed for telephone interaction systems, but it would also be applicable to command & control type applications.  That product has the feature you're looking for here: it returns confidence scores for your utterances, and your code can easily reject presumed OOV words based on these scores.  The site is www.lumenvox.com, and they have evaluation s/w you can download.
      Hope it helps,
      Steve
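
      For a command & control application, the "your code can reject" step might look roughly like the sketch below. It assumes only that the engine hands back a best word plus a confidence value; the `Hypothesis` type, the [0, 1] confidence scale, and the 0.6 threshold are hypothetical and do not reflect the API of LumenVox or any other product.

```python
from dataclasses import dataclass
from typing import Optional

COMMANDS = {"LEFT", "RIGHT", "UP", "DOWN", "STOP"}


@dataclass
class Hypothesis:
    text: str          # best-matching word reported by the engine
    confidence: float  # assumed to lie in [0, 1]; the real scale is engine-specific


def command_or_none(hyp: Hypothesis, threshold: float = 0.6) -> Optional[str]:
    """Return a command to execute, or None if the result should be ignored."""
    word = hyp.text.strip().upper()
    if word not in COMMANDS:
        return None  # not a known command at all
    if hyp.confidence < threshold:
        return None  # known word, but the engine is not sure enough
    return word


print(command_or_none(Hypothesis("DOWN", 0.92)))  # accepted
print(command_or_none(Hypothesis("DOWN", 0.21)))  # rejected: low confidence
```

      Returning None instead of a guess lets the application do nothing, or ask the user to repeat, when the engine is unsure.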

       
    • Manohar Chapalamadugu

      Hi

      I downloaded Sphinx2 on my Windows XP machine and tried testing the batch files. I compiled SphinxExamples.dsw using VC 6.0. I also compiled the continuous classes so that the batch file would run.

      I ran simple.bat and the program doesn't even come close: the success rate is lower than 1%. Is this what I should expect without training? Do I need to do something more to increase the success rate?

      I am looking for a speech recognition engine for embedded devices (low memory and little processing power) for my school project. Can somebody suggest an alternative speaker-independent recognition engine that needs no training?  CSLU?

      Thanks.

       
