
pocketsphinx utter words by alphabets

Jim Hu
2011-10-18
2012-09-22
  • Jim Hu

    Jim Hu - 2011-10-18

    Hi, I'm currently trying to use pocketsphinx to create a dictionary that lets
    users input words by spelling them.
    The ideal goal is to be able to put all the words in, which is about 16,000
    words.

    I've looked through a lot of related posts on this forum and tried the
    following:
    1. Kind of a hack: using an FSG with
    dict files looking like this:
    W-O-R-D   //pronunciation omitted
    W-O-R-L-D
    ...
    and grammar files looking like this:
    <testgrammar> = W-O-R-D | W-O-R-L-D | ...;

    result:
    Putting all the words in results in a crash at load time.
    With fewer words it loads, but the user may utter the letters too slowly to
    match a "read" word (as I understand it, the pronunciation is designed for
    reading a word, not spelling it).

    2. Use a language model created with a corpus.txt like this:
      W O R D
      W O R L D
      ...
      with dict files looking like this (the full one has all 26 letters):
      W //pronunciation omitted
      O
      R
      L
      D

    result:
    The recognition error rate is very high, and using n-best doesn't help very
    much. Most of the time it hears words that are a few letters off, and often
    it is way off. Implementing a spell checker didn't help much either...
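
    For reference, the kind of post-processing described above (n-best plus a
    spell-checker-style correction) can be sketched as follows. This is only a
    minimal illustration, not the code actually used: it assumes the n-best
    hypotheses are already available as plain strings of space-separated
    letters, and the names edit_distance and snap_to_lexicon are hypothetical
    helpers, not part of pocketsphinx.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert/delete/substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def snap_to_lexicon(nbest, lexicon):
    """Pick the lexicon word closest to any hypothesis in the n-best list.

    nbest   -- hypothetical list of letter strings such as "W O R L T",
               ordered best first
    lexicon -- iterable of allowed words, e.g. ["WORD", "WORLD"]
    Ties on distance are broken by n-best rank, then alphabetically.
    """
    best = None
    for rank, hyp in enumerate(nbest):
        spelled = hyp.replace(" ", "")
        for word in lexicon:
            key = (edit_distance(spelled, word), rank, word)
            if best is None or key < best:
                best = key
    return best[2] if best is not None else None
```

    With a 16,000-word list this brute-force scan is slow but simple; it only
    helps when the hypotheses are a few letters off, which matches the
    limitation described above.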

    And I hope I don't have to implement it the way some posts mentioned, using
    "alpha", "beta", etc. to improve the recognition rate.
    Can anyone please point me in a better direction? I'm quite stuck...

    I'm a newbie at this, and really appreciate the efforts! This is a great
    project!

    Thanks a lot!

     
  • Nickolay V. Shmyrev

    Approach 2 makes more sense, but it also requires attention and probably
    reimplementation of some algorithms. Pocketsphinx is not really suitable for
    short-phone recognition. The algorithm itself requires modification.

    So if you really want to do this, you are on the right track, but you need to
    spend more time identifying the accuracy issues and analyzing the
    ways to fix them.

    The first step would be to collect a speech database and measure the accuracy
    you have with the current model.
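
    One way to put a number on that accuracy is a letter error rate over the
    collected test set. The sketch below is an assumption about how the
    measurement could be done, not an existing pocketsphinx tool: it expects
    (reference, hypothesis) pairs of space-separated letters that you have
    already produced by decoding your recordings, and letter_error_rate is a
    hypothetical helper name.

```python
def token_edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def letter_error_rate(pairs):
    """Total letter errors divided by total reference letters.

    pairs -- iterable of (reference, hypothesis) strings, each a sequence
             of space-separated letters such as "W O R L D"
    """
    errors = 0
    total = 0
    for ref, hyp in pairs:
        ref_tokens = ref.split()
        errors += token_edit_distance(ref_tokens, hyp.split())
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("no reference letters to score")
    return errors / total
```

    Tracking this number per speaker and per letter pair should make it easier
    to see where the confusions come from before changing anything.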

     
