Nickolay, I'm asking ...

Help
Mark
2008-12-01
2020-01-17
  • Mark

    Mark - 2008-12-01

    Nickolay, it's probably time for me to ask.

    What specifically are the items below, taken from your past post?

    Are these coming in later releases of pocketsphinx, or are you hoping they will be?

    Has anyone outside CMU, such as other universities or researchers, implemented these features in pocketsphinx and made them freely available?

    Also, is this the cutting edge or state of the art in front-end signal processing and recognition techniques for SR in the telephony domain?

    Thanks and here is the list of your items:

    Well, there are issues in both the decoder and the interface with the
    telephony application.


    First, about the decoder: pocketsphinx right now is the most supported
    and most feature-rich decoder of the family, but in general it's still
    oriented toward embedded devices. For telephony applications you
    probably need to extend it a lot. The features that are currently
    missing are probably:

    • Out-of-the-box support for multiple recognizers (probably more a freeswitch
      issue and a model training issue; for example, we have no free
      male/female model).

    • Speaker clustering.

    • Automatic VTLN estimation from pitch (This looks simple).

    • Good endpointer.

    • Discriminative training support in SphinxTrain (Huge task).

    • Good and clean support for a garbage model to be able to filter out
      out-of-grammar words.

    • Embedded RASTA extraction and RASTA model training.

    • Advanced feature extraction.
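To make the "good endpointer" item concrete, here is a minimal sketch of the simplest kind of endpointer: a fixed energy threshold with a short hangover. This is not how pocketsphinx does it. The function names, the 160-sample frame (20 ms at an assumed 8 kHz telephony rate), the -30 dB threshold and the hangover length are all illustrative assumptions, and samples are assumed to be floats scaled to [-1, 1].

```python
import math


def frame_energies_db(samples, frame_len=160):
    """Split samples into non-overlapping frames and return log energy (dB) per frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        # Small floor avoids log10(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-10))
    return energies


def find_endpoints(samples, frame_len=160, threshold_db=-30.0,
                   min_speech_frames=3, hangover_frames=5):
    """Return a list of (start_sample, end_sample) pairs for detected speech.

    A segment opens on the first frame above the threshold and closes once
    more than `hangover_frames` consecutive frames fall below it. Segments
    shorter than `min_speech_frames` are discarded as clicks or noise.
    """
    energies = frame_energies_db(samples, frame_len)
    segments = []
    start = None
    silence_run = 0
    for i, energy in enumerate(energies):
        if energy > threshold_db:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run > hangover_frames:
                end = i - silence_run + 1  # index of the first silent frame
                if end - start >= min_speech_frames:
                    segments.append((start * frame_len, end * frame_len))
                start = None
                silence_run = 0
    if start is not None:
        # Audio ended while a segment was still open.
        end = len(energies) - silence_run
        if end - start >= min_speech_frames:
            segments.append((start * frame_len, end * frame_len))
    return segments


def demo_signal():
    """Synthetic test input: 10 silent frames, 20 frames of a 200 Hz tone, 10 silent frames."""
    silence = [0.0] * 1600
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(3200)]
    return silence + tone + silence
```

Calling `find_endpoints(demo_signal())` should mark the tone as a single segment. A fixed threshold like this breaks down on noisy telephone lines, which is presumably why a "good" endpointer is on the list: a real one would at least adapt the threshold to the background level.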

    Another issue is dialog tracking and understanding. CMU folks are doing
    work on dialog systems; for example, RavenClaw is available at:

    http://www.ravenclaw-olympus.org/systems_overview.html

    It would be worth looking at it and trying to integrate it into
    freepbx. The decoder will need to support a combined language model, and
    you'll also need a component for postprocessing. The postprocessing includes
    disfluency removal, text normalization, and text boundary detection. Integration
    with NLTK would probably be useful for sense extraction.
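As a rough illustration of what such a postprocessing component might do, here is a minimal sketch covering two of the steps named above: disfluency removal and text normalization. Everything in it is an assumption made for illustration: the filler list, the digit table and the function names are hypothetical and are not part of pocketsphinx or NLTK. Text boundary detection and sense extraction are not covered.

```python
# Hypothetical filler words to drop from a recognizer hypothesis.
FILLERS = {"uh", "um", "er", "ah", "hmm"}

# Hypothetical normalization table: spoken digits to written digits.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def remove_disfluencies(words):
    """Drop filler words and immediate word repetitions ("i i want" -> "i want").

    Repeated digit words are kept, because in telephony a repeated digit
    ("two two five") is usually meaningful rather than a stutter.
    """
    cleaned = []
    for word in words:
        if word in FILLERS:
            continue
        if cleaned and cleaned[-1] == word and word not in NUMBER_WORDS:
            continue
        cleaned.append(word)
    return cleaned


def normalize_text(words):
    """Replace spoken digit words with written digits."""
    return [NUMBER_WORDS.get(word, word) for word in words]


def postprocess(hypothesis):
    """Run the two steps on a raw hypothesis string and return cleaned text."""
    words = hypothesis.lower().split()
    words = remove_disfluencies(words)
    words = normalize_text(words)
    return " ".join(words)
```

For example, `postprocess("uh I I want um extension two two five")` yields `i want extension 2 2 5`. Rule-based cleanup like this only handles the easy cases. Repairs such as "I want to, I need to go" need something smarter.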

    If you need more details on any of the above, feel free to ask.

     
    • munvar ali

      munvar ali - 2008-12-11

      Hi Nickolay and Mark,

      Good morning, I hope you are all doing well.

      For the past 6 months I have been using Sphinx-4 for Indian-accented English and have achieved 60-70% accuracy for read speech, but not for spontaneous (free) speech.

      Now I am interested in working with pocketsphinx and contributing my knowledge. Can one of you tell me the procedure and which software needs to be installed?

      As of now I have:

      1. Microsoft Visual Studio 2005
      2. Microsoft Visual Studio 2006
      3. Microsoft Windows Mobile
      4. ActiveSync

      Thank you in advance.

       
    • Nickolay V. Shmyrev

      > Are these coming in later releases of pocketsphinx, or are you hoping they will be?

      I hope they will be available one day, but the time frame is long (years or so).

      > Has anyone outside CMU, such as other universities or researchers, implemented these features in pocketsphinx and made them freely available?

      I don't know of any other implementations.

      > Also, is this the cutting edge or state of the art in front-end signal processing and recognition techniques for SR in the telephony domain?

      I wouldn't say they are state of the art, just features that would be nice to have.

       
    • Mark

      Mark - 2008-12-02

      OK, a couple of things.

      First, what steps do I need to take, or more likely get someone else to take, to put these upgrades into pocketsphinx? This stuff is beyond my pay grade, but I can give it a try or give an outline to someone else.

      Second, what is the state of the art in ASR within telephony? You mentioned a bunch of techniques before, all of which seem to condition and normalize the audio for variance in speaker pitch, noise, volume, etc., before the signal is chopped up and pushed to the recognizer.

      Thanks.

       
  • Sumadhur Vaidyula

    Hello, good morning!
    I have trained an acoustic model following the documentation, and it was successfully trained and tested.
    But when I use the model with the pocketsphinx_continuous command, giving the model directory path in -hmm and the dictionary in -dict, the command runs but does not print anything.
    Please help.

     
  • Sumadhur Vaidyula

    This is the screenshot of the training process.

     
  • Sumadhur Vaidyula

    This is my data set.

     
