Speech Recognition Toolkit

SphinxTrain for PocketSphinx ?

Forum: Help

Creator: pegasus2000

Created: 2008-12-13

Updated: 2012-09-22

pegasus2000 - 2008-12-13

Hello, I've successfully executed pocketsphinx_tidigits
under Nanodesktop.

Now I need to create my own models to recognize other
words. I have some questions:

a) Can SphinxTrain also be used with PocketSphinx?

b) Is there an official how-to that explains, step by step,
how to use SphinxTrain and the other utilities to create
my models?

c) Where is the SphinxTrain source code?

Thanks in advance.

(If you think it would be useful, I have some advice that
could make it easier to port your software to other
platforms; for example, we had quite a few problems in our
Nanodesktop port...)

- David Huggins-Daines - 2008-12-13
  
  Yes, please do post about the issues you had in porting...
  
  - pegasus2000 - 2008-12-13
    
    Here is the video that shows your software
    working under nd.
    
    http://www.youtube.com/watch?v=Y0cqbzB6CV8
    
    Some notes:
    
    a) I have seen that you use double for
    computations. Some platforms use a 32-bit
    representation for double (the same as float)
    for performance reasons. If the source code
    is recompiled for such a platform, values
    such as -wbeam and similar parameters are
    truncated by the compiler to 0; this causes
    a secondary error in the value returned by
    logmath and, finally, the failure of the
    decoder.
    
    When a previous video showed 0 words
    recognized, the main problem was that the
    wbeam parameter was wrong.
    
    Under nd, I've modified your code so that
    all calls to mathematical functions, as well
    as to printf and scanf, are redirected to
    dedicated routines compiled to work at 64-bit
    precision and kept separate from the normal
    mathematical routines included in the NanoM
    library.
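    
    The effect described in (a) can be sketched in
    plain C. This is only an illustration: the
    function names and the sample value 1e-48 are
    assumptions chosen for the example, not code or
    settings taken from the PocketSphinx sources. A
    very small value narrowed to 32 bits becomes
    exactly 0, and the logarithm of 0 is not a
    finite number, which is presumably what then
    upsets the log computation.
    
    ```c
    #include <float.h>
    
    /* Returns 1 if `value` is still nonzero after being narrowed to a
     * 32-bit float, 0 if it underflows to zero. */
    int survives_float32(double value)
    {
        /* volatile forces the value to really be stored as a float */
        volatile float narrowed = (float)value;
        return narrowed != 0.0f;
    }
    
    /* Returns 1 if this platform's double looks like IEEE-754 binary64
     * (8 bytes, 53-bit mantissa, decimal exponent range down to -307). */
    int double_is_64bit(void)
    {
        return sizeof(double) == 8
            && DBL_MANT_DIG == 53
            && DBL_MIN_10_EXP <= -307;
    }
    ```
    
    A port could call double_is_64bit() at startup
    and, when it returns 0, either stop with a clear
    message or switch to dedicated 64-bit routines
    as done under nd.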
    
    b) Some scanf format sequences are not
    recognized by some libc implementations of
    scanf. For example, the Avr-Libc scanf does
    not handle your scanf calls. It is advisable
    to include a dedicated version of scanf in
    your source code (for example, I've added a
    copy of the Minix scanf, called psphinx_scanf,
    and modified your code to use it instead of
    the normal NanoC scanf). (In any case, this
    can perhaps be considered a bug in the
    Avr-Libc scanf rather than in Sphinx: I've
    also replaced this routine in the NanoC
    library, so nd will have a new scanf
    implementation in its next release.)
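    
    As an alternative to bundling a scanf
    replacement, numeric fields can sometimes be
    parsed without scanf at all. The sketch below
    assumes that the target libc provides a working
    strtod; parse_double_field is a hypothetical
    helper written for this example, not a function
    from Sphinx, Minix or NanoC.
    
    ```c
    #include <errno.h>
    #include <stdlib.h>
    
    /* Hypothetical helper: parse one floating-point field from `text`
     * without using scanf. Returns 0 and stores the value in *out on
     * success, or -1 if no number is found or it is out of range. */
    int parse_double_field(const char *text, double *out)
    {
        char *end;
        double value;
    
        errno = 0;
        value = strtod(text, &end);
        if (end == text || errno == ERANGE)
            return -1;          /* nothing parsed, or overflow/underflow */
    
        *out = value;
        return 0;
    }
    ```
    
    This only covers simple numeric fields; code
    that relies on richer scanf formats would still
    need a dedicated scanf such as psphinx_scanf.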
    
    c) PocketSphinx_digits expects microphone
    input sampled at 8000 Hz. Some platforms do
    not have a microphone driver that can capture
    at this rate. I've created a software layer
    that downsamples from 44100 Hz (the actual
    capture rate of the Nanodesktop ndHAL_Mic
    API) to 8000 Hz before passing the data to
    the decoder. It would be advisable for the
    decoder to include this downsampling layer
    internally and to apply it automatically.
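    
    A rate-conversion layer of this kind could look
    roughly like the sketch below. It is not the nd
    implementation: resample_linear is a
    hypothetical function that uses plain linear
    interpolation on 16-bit mono samples, and it
    omits the low-pass (anti-aliasing) filter that
    a real 44100 Hz to 8000 Hz converter should
    apply first.
    
    ```c
    #include <stddef.h>
    #include <stdint.h>
    
    /* Hypothetical sketch: convert `in` (in_rate Hz) to out_rate Hz by
     * linear interpolation. Writes at most out_cap samples to `out`
     * and returns the number written. No anti-aliasing filter. */
    size_t resample_linear(const int16_t *in, size_t in_len, unsigned in_rate,
                           int16_t *out, size_t out_cap, unsigned out_rate)
    {
        size_t n = 0;
    
        if (in_len == 0 || in_rate == 0 || out_rate == 0)
            return 0;
    
        while (n < out_cap) {
            /* Position of output sample n on the input time axis,
             * split into an integer index and a fraction frac/out_rate. */
            uint64_t pos = (uint64_t)n * in_rate;
            size_t i = (size_t)(pos / out_rate);
            int64_t frac = (int64_t)(pos % out_rate);
            int32_t a, b;
    
            if (i >= in_len)
                break;              /* ran out of input samples */
    
            a = in[i];
            b = (i + 1 < in_len) ? in[i + 1] : a;
            out[n] = (int16_t)(a + (int32_t)(((int64_t)(b - a) * frac)
                                             / (int64_t)out_rate));
            n++;
        }
        return n;
    }
    ```
    
    For example, 441 input samples captured at
    44100 Hz (10 ms of audio) yield 80 output
    samples at 8000 Hz.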
    
    I hope there will be no problems with
    SphinxTrain. I'll let you know what
    happens.
    
    Thanks for your collaboration
    
    Filippo Battaglia
    