Getting rid of false positives

Help
2009-09-15
2012-09-22
  • Mike Medved

    Mike Medved - 2009-09-15

    Hi - So thanks in part to numerous posts here and help from CMU folks, I've
    got a nice pocketsphinx solution working on a Virtex5 PowerPC system running
    QNX. It is doing recognition on a small grammar (12 commands), using a custom
    trained acoustic model.

    The recognition is excellent - like 98.5% accurate, but it gives false
    positives for basically everything. By false positive, I mean that I can speak
    words that aren't in the grammar at all and it will recognize them as words in
    the grammar (even if they're not even close).

    I'm sure there is something that can be done here by tweaking parameters or
    manipulating the language model. I guess I'm looking for the obvious stuff. I
    can post the LM if need be... it is pretty small.

     
  • Nickolay V. Shmyrev

    > The recognition is excellent - like 98.5% accurate

    Congratulations

    > is something that can be done here by tweaking parameters

    In your acoustic model you need to introduce a "garbage" phone
    that will represent everything else. You will probably need a few phones for
    common specific types of sounds. Then you need to include those "garbage"
    words in the grammar with a low probability. You also need to get the posterior
    probability from pocketsphinx with ps_get_prob and compare it with some
    threshold. Unfortunately, everything here needs tuning.
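
The thresholding step might look roughly like the sketch below. It assumes the score from ps_get_prob is an integer log-domain value in the decoder's log base (taken here to be 1.0001, which should be checked against your configuration). The helper names and the 0.5 threshold are hypothetical and would need tuning on real data.

```python
import math

LOG_BASE = 1.0001  # assumption: the decoder's log base; check your configuration


def log_score_to_prob(score: int, log_base: float = LOG_BASE) -> float:
    """Convert an integer log-domain score to a linear probability."""
    return math.exp(score * math.log(log_base))


def accept_hypothesis(score: int, threshold: float = 0.5) -> bool:
    """Return True if the posterior probability clears the (hypothetical) threshold."""
    return log_score_to_prob(score) >= threshold
```

A hypothesis whose score fails the check would be treated as out-of-grammar speech and ignored.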

     
  • Mike Medved

    Mike Medved - 2009-09-21

    So maybe you can help me with how this works. My .phone file would have to
    look like:

    ...
    T
    V
    W
    Z
    SIL
    XXX

    And my .dic file would look like:

    SLEEP S L IY P
    STEP S T EH P
    TO T AH
    WAKEUP W EY K AH P
    GARBAGE XXX

    I'll paste my LM in the next post, cuz I have no clue how to manipulate it.

    But then what? Once I've made these three changes, do I have to rebuild my
    acoustic models?

    M

     
  • Mike Medved

    Mike Medved - 2009-09-21

    Turns out posting the LM here is kind of a trainwreck. It doesn't respect new
    lines and it applies strikethrough to the <s> </s> tags, which are all over the
    LM. I can email it or post it someplace else if need be.

    M

     
  • Nickolay V. Shmyrev

    Hi Mike. Well, yes, after you add a garbage phone to the phoneset you need to
    retrain the model. Then you can either add garbage words to the dictionary and
    model them with the LM, or add them to a filler dictionary and model them as
    fillers that are automatically inserted after other words. You still need to
    carefully select the weight in order to get stable recognition results.
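
For the filler-dictionary option, a minimal sketch of what the file could contain is shown below. It assumes the usual Sphinx filler dictionary layout of one "WORD PHONE" entry per line. The filler word name ++GARBAGE++ is hypothetical, and XXX is the garbage phone proposed earlier in this thread.

```python
# Silence entries commonly found in a Sphinx filler dictionary.
BASE_FILLERS = [("<s>", "SIL"), ("</s>", "SIL"), ("<sil>", "SIL")]


def build_filler_dict(garbage_entries):
    """Return filler-dictionary text: one 'WORD PHONE...' entry per line."""
    entries = BASE_FILLERS + list(garbage_entries)
    lines = [f"{word} {phones}" for word, phones in entries]
    return "\n".join(lines) + "\n"


# Hypothetical filler word mapped to the XXX garbage phone from this thread.
filler_text = build_filler_dict([("++GARBAGE++", "XXX")])
print(filler_text, end="")
```

With this option the garbage word stays out of the LM entirely, so the only thing left to tune is the filler weight mentioned above.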

    I think we really need to prepare a demo on this and train a production model
    that could be used with pocketsphinx. It will take some time for me though.

     
  • Mike Medved

    Mike Medved - 2009-11-19

    Hi Nickolay-

    So after a couple of months hiatus, I'm back into looking at this. Just wanted
    to check and see if you guys have done any work on a demo for this in the mean
    time, or written some documentation about how to do it?

    Manipulating the phoneset and dictionary will be easy for me, but changing the
    LM is going to be a pain. I used your online tool to make the LM file and
    haven't touched it since and can't seem to find any documentation about what
    the format of the data in the file actually is.
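
On the file format: assuming the LM is in the common ARPA text format (not confirmed in this thread), each n-gram line holds a log10 probability, the word or words, and an optional log10 backoff weight. A rough sketch for reading the unigram section, with made-up numbers:

```python
def parse_unigrams(arpa_text):
    """Return {word: (log10_prob, log10_backoff_or_None)} from an ARPA-format LM."""
    unigrams = {}
    in_section = False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line == "\\1-grams:":
            in_section = True
            continue
        if in_section:
            if not line:
                continue
            if line.startswith("\\"):
                break  # next section (e.g. \2-grams:) or \end\
            fields = line.split()
            logprob, word = float(fields[0]), fields[1]
            backoff = float(fields[2]) if len(fields) > 2 else None
            unigrams[word] = (logprob, backoff)
    return unigrams


# Toy model with made-up numbers, for illustration only.
SAMPLE = """\\data\\
ngram 1=4

\\1-grams:
-0.7782 </s>
-99.0000 <s> -0.3010
-0.7782 SLEEP -0.3010
-0.7782 WAKEUP -0.3010

\\end\\
"""

unigrams = parse_unigrams(SAMPLE)
print(unigrams["SLEEP"])
```

Adding a garbage word by hand would mean adding a line to the unigram section and increasing the "ngram 1=" count in the header. Strictly, the probabilities should then be renormalized, so regenerating the LM from a corpus that includes the garbage word is probably less error-prone.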

    Another question - right now in my training data all I have is recordings of
    valid phrases, so do I now have to go off and record a bunch of stuff I don't
    really care about to train as garbage? Essentially everything else that can be
    said is garbage as far as I'm concerned...

     
  • Nickolay V. Shmyrev

    > I'm back into looking at this. Just wanted to check and see if you guys have
    > done any work on a demo for this in the mean time, or written some
    > documentation about how to do it?

    No, still pending in todo

    > Another question - right now in my training data all I have is recordings of
    > valid phrases, so do I now have to go off and record a bunch of stuff I don't
    > really care about to train as garbage? Essentially everything else that can be
    > said is garbage as far as I'm concerned.

    Yes, it would be nice to record garbage at least for a testing part of the
    database. For training it also makes sense to record typical noises.
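
One way to use garbage recordings in the test set is to measure how a candidate confidence threshold trades false accepts against false rejects. The sketch below uses made-up confidence values in [0, 1], and the function name is hypothetical. In practice the scores would come from decoding the valid and garbage test utterances.

```python
def error_rates(valid_scores, garbage_scores, threshold):
    """Return (false_reject_rate, false_accept_rate) at a given confidence threshold."""
    false_rejects = sum(1 for s in valid_scores if s < threshold)
    false_accepts = sum(1 for s in garbage_scores if s >= threshold)
    return false_rejects / len(valid_scores), false_accepts / len(garbage_scores)


# Made-up confidence values in [0, 1], for illustration only.
valid = [0.95, 0.90, 0.85, 0.60, 0.40]
garbage = [0.70, 0.30, 0.20, 0.10]

for threshold in (0.25, 0.50, 0.75):
    frr, far = error_rates(valid, garbage, threshold)
    print(f"threshold={threshold:.2f}  false rejects={frr:.2f}  false accepts={far:.2f}")
```

Sweeping the threshold this way shows where rejecting more garbage starts to cost too many valid commands.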

     
