
Sphinx4 or PocketSphinx for a smart home?

2011-11-22
2012-09-22
  • Thomas Cherryhomes

    I have spent the last decade doing smart home research and currently work on
    the LinuxMCE project (http://www.linuxmce.org/). I am curious which engine
    would be better suited to ALL of the following use cases:

    • Simple declarative commands, with or without qualifiers and extraneous vocabulary: "Computer, turn on the lights, please. Only halfway...more...more...thank you."
    • Dealing with names of media: "Play 2001 a Space Odyssey", "Play everything by The Beatles"

    These are the simplest examples of the extremes I would ideally like to
    research. Would I use PocketSphinx, Sphinx4, a hybrid of both running in
    tandem, or something else?

    It is worth noting that LinuxMCE already has a heavily distributed
    message-passing architecture that everything sits on top of.

    Thanks,
    -Thom

     
  • Nickolay V. Shmyrev

    Since LinuxMCE is mostly written in C++ and is supposed to work in
    low-resource environments, you need to use pocketsphinx. It has most of the
    functionality required for remote control.
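
    As a rough illustration of the "remote control" case, the sketch below
    assumes the recognizer has already returned a plain-text hypothesis and shows
    one way the wake word and extraneous vocabulary could be stripped before
    mapping the rest to a command. Every name in it (WAKE_WORD, FILLER, ACTIONS,
    DEVICES, parse_command) is hypothetical; none of it is part of pocketsphinx
    or LinuxMCE.

```python
import re

# All names below are hypothetical, chosen only for this illustration.
WAKE_WORD = "computer"
FILLER = {"please", "thank", "you", "the", "only"}
ACTIONS = {("turn", "on"): "ON", ("turn", "off"): "OFF"}
DEVICES = {"lights", "tv"}


def parse_command(hypothesis):
    """Map a recognized text hypothesis to an (action, target) pair, or None."""
    words = re.findall(r"[a-z0-9']+", hypothesis.lower())
    # Ignore anything that is not addressed to the system.
    if not words or words[0] != WAKE_WORD:
        return None
    rest = words[1:]
    # Media requests keep every word, since titles may contain filler words.
    if rest and rest[0] == "play":
        return ("PLAY", " ".join(rest[1:]))
    # Device commands: drop extraneous vocabulary, then look for verb + device.
    rest = [w for w in rest if w not in FILLER]
    for i in range(len(rest) - 1):
        action = ACTIONS.get((rest[i], rest[i + 1]))
        if action:
            for w in rest[i + 2:]:
                if w in DEVICES:
                    return (action, w)
    return None
```

    For example, parse_command("Computer, turn on the lights, please.") yields
    ("ON", "lights"), while an utterance without the wake word yields None.
    Follow-up qualifiers such as "only halfway" would need dialogue state, which
    this sketch does not attempt.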

    I need to warn you that if you have to build a real working system, you will
    have issues with distant microphones. You will either have to build a
    processing module for a microphone array (not a part of CMUSphinx) or use a
    close-talking microphone.
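
    To give a sense of what such a processing module does at its very simplest,
    here is a minimal delay-and-sum sketch. It assumes the per-microphone delays
    are already known and are whole numbers of samples; estimating those delays,
    handling noise, and tracking a moving speaker are the hard parts and are not
    shown. The function name delay_and_sum is made up for this example.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average them.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   samples by which each channel lags the reference (assumed known).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    # Only the region where every shifted channel still has samples is usable.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    if usable <= 0:
        return []
    n = len(channels)
    return [
        sum(ch[d + t] for ch, d in zip(channels, delays)) / n
        for t in range(usable)
    ]
```

    Speech arriving from the assumed direction adds up coherently after
    alignment, while sound from other directions is averaged down.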

     
  • Nickolay V. Shmyrev

    Well, for microphone arrays there are open source packages too, like ManyEars.

     
  • Thomas Cherryhomes

    I understand the issues with dealing with microphone arrays. It's one of the
    reasons I am working with the microphone array present in the Kinect for my
    initial experiments.

    Is pocketsphinx still a viable option when I want to go beyond a simple
    "remote control"? Please understand that I am also trying to SELECT media,
    deal with names of people, phone numbers, and other "not so black and white"
    aspects while building a vocabulary/grammar for this thing.

    -Thom

     
  • Nickolay V. Shmyrev

    > Is pocketsphinx still a viable option when I want to go beyond a simple
    > "remote control"? Please understand that I am also trying to SELECT media,
    > deal with names of people, phone numbers, and other "not so black and white"
    > aspects while building a vocabulary/grammar for this thing.

    Please understand that software is also "not so black and white". There are
    always things you will need to implement yourself. CMUSphinx is just a tool,
    not a product. However, as I said above, it is a good starting point for your
    project.

     
  • Thomas Cherryhomes

    I understand. I worked with Sphinx2 in the 2001-2002 time frame on a
    limited-domain speech recognizer project.

    I'll stick with PocketSphinx for now.

     
