
Pocketsphinx general questions

Help
Tikiboy
2012-01-31
2012-09-22
  • Tikiboy

    Tikiboy - 2012-01-31

    Hey guys, I recently embarked on a journey to get pocketsphinx up and running,
    and I finally managed to get an example working on Android:
    http://cmusphinx.sourceforge.net/2011/05/building-pocketsphinx-on-android/

    However, after inspecting the whole process, I am confused about some things
    and I have some questions.

    My first question: in terms of the whole framework and how sphinx works, is
    pocketsphinx the same as Sphinx4? I am asking because I have tried to find
    documentation on the general concept/big picture of how pocketsphinx works,
    and all I found was API documentation. Sphinx4, on the other hand, has
    documentation on the whole CMUSphinx concept. There was a really good diagram
    in the Sphinx4 documentation, as well as a PowerPoint
    http://cmusphinx.sourceforge.net/doc/speech.ppt describing these concepts. Do
    these concepts (especially the diagram of the configuration-decoder-linguist
    relationship) apply to pocketsphinx as well?

    My second question comes from the Android example above. After compiling, I
    realized that the method calls are all in Java, whereas pocketsphinx and
    sphinxbase are in C/C++. I understand that SWIG was used to link the two
    together and that a Java wrapper class has been generated. The problem is
    that I am unclear how a developer knows which Java wrapper method made by
    SWIG corresponds to which C/C++ function of pocketsphinx. I tried to compare
    them, and it baffles me that the Java wrapper has a boolean argument that
    is mapped to an int argument in C/C++.

    Lastly, I was wondering if there is any other documentation for pocketsphinx
    lying around aside from http://cmusphinx.sourceforge.net/api/pocketsphinx/ .
    I am asking because, given only a bunch of API pages, I have no idea how to
    use pocketsphinx properly. There is no guideline on the basic initialization
    steps to take, for example that I should initialize the configuration manager
    before calling a recognizer and subsequently getting the result. So far, my
    only cue is to learn from the example I followed and copy every line of code
    it used to do my audio recognition.

    Sorry for the long post. I hope it is not confusing.

    Thanks a lot!

     
  • Nickolay V. Shmyrev

    My first question: in terms of the whole framework and how sphinx works, is
    pocketsphinx the same as Sphinx4? I am asking because I have tried to find
    documentation on the general concept/big picture of how pocketsphinx works,
    and all I found was API documentation. Sphinx4, on the other hand, has
    documentation on the whole CMUSphinx concept. There was a really good diagram
    in the Sphinx4 documentation, as well as a PowerPoint
    http://cmusphinx.sourceforge.net/doc/speech.ppt describing these concepts. Do
    these concepts (especially the diagram of the configuration-decoder-linguist
    relationship) apply to pocketsphinx as well?

    Pocketsphinx works the same way from a high-level point of view. However,
    the implementation details may differ. For example, there is no separation
    between SearchManager and Linguist in pocketsphinx; instead, a single search
    object performs the search. The acoustic model is the same, and there are
    multiple scorers.

    The problem is that I am unclear how a developer knows which Java wrapper
    method made by SWIG corresponds to which C/C++ function of pocketsphinx. I
    tried to compare them, and it baffles me that the Java wrapper has a boolean
    argument that is mapped to an int argument in C/C++.

    Yes, sometimes the mapping is not straightforward, and ideally there would be
    a separate document on the Java API generated with SWIG. Unfortunately, we
    currently lack this part of the documentation, so you need to guess the
    functions. Usually it's easy to guess. For example, a boolean is naturally
    represented as an int in most C programs. Traditional C (before C99) has no
    separate boolean type, so everyone does that. You can find more details in
    the SWIG documentation.
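
    To illustrate the boolean-to-int convention, here is a minimal sketch in C.
    Everything in it is hypothetical: set_verbose, is_verbose, the wrapper_*
    functions and the fake_jboolean typedef are made-up names for illustration
    and are not part of pocketsphinx or of any SWIG output. It only shows how a
    C flag stored as an int can be exposed through a thin layer that takes a
    one-byte boolean, which is roughly the situation a Java wrapper is in.

```c
#include <stdio.h>

/* Hypothetical C library state (not from pocketsphinx): the flag is an int,
 * where 0 means false and any non-zero value means true. */
static int verbose = 0;

/* Hypothetical C API taking a "boolean" as an int. */
void set_verbose(int enabled)
{
    verbose = (enabled != 0);   /* normalize any non-zero value to 1 */
}

int is_verbose(void)
{
    return verbose;
}

/* Stand-in for JNI's jboolean, which is a one-byte unsigned type. */
typedef unsigned char fake_jboolean;

/* Assumed shape of a wrapper layer: widen the one-byte boolean to the
 * int that the C function expects. */
void wrapper_set_verbose(fake_jboolean jflag)
{
    set_verbose(jflag ? 1 : 0);
}

/* And narrow the int result back to a one-byte boolean. */
fake_jboolean wrapper_is_verbose(void)
{
    return (fake_jboolean)(is_verbose() ? 1 : 0);
}
```

    So when a Java method takes a boolean where the C function takes an int,
    the C side is usually just treating that int as a true/false flag.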

    Lastly, I was wondering if there is any other documentation for pocketsphinx
    lying around aside from http://cmusphinx.sourceforge.net/api/pocketsphinx/ .
    I am asking because, given only a bunch of API pages, I have no idea how to
    use pocketsphinx properly. There is no guideline on the basic initialization
    steps to take, for example that I should initialize the configuration manager
    before calling a recognizer and subsequently getting the result. So far, my
    only cue is to learn from the example I followed and copy every line of code
    it used to do my audio recognition.

    There is a tutorial that you must read before you start development:

    http://cmusphinx.sourceforge.net/wiki/tutorial

    It has a part on pocketsphinx:

    http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx

     
  • Tikiboy

    Tikiboy - 2012-02-01

    Thanks once again for the clarification. It tied up the loose ends that
    instilled doubt in the learning process (never a good thing in my case).

     
