MFCC on DSP

Help
Tomislav
2011-08-01
2012-09-22
  • Tomislav

    Tomislav - 2011-08-01

    Hello,
    I'm trying to extract MFCCs from a speech signal. The idea is to record speech
    and extract MFCCs from it on a portable DSP (Texas Instruments TMS320VC5505
    eZdsp USB Stick), and then use these MFCCs with pocketsphinx on a nearby PC
    (or server). The problem is that I don't know C++ or Java, only C. Could
    someone tell me what files from sphinxbase and pocketsphinx contain the code
    for extracting MFCCs from a speech signal, which I could just modify and then
    load into the compiler for my DSP? And where do I define the parameters
    (sampling frequency, number of MFCCs, ...) and the input wave file?
    I tried to search the forum but I don't understand much of it, and when I
    tried to examine some files (like wave2feat, fe_sigproc, ...) I just got lost.
    I also noticed that most people here use Linux. I use Windows. Could that be a
    problem? Should I change to Linux?

    I also have a problem with running PocketSphinx. I downloaded and extracted
    both SphinxBase and PocketSphinx for Windows32 and did everything the
    instructions said, but when I run the pocketsphinx_continuous exe file, the
    cmd window appears for less than a second and then just disappears. But
    when I run the exe file from cmd window, it says: "ERROR: "acmod.c", line 84:
    Acoustic model definition is not specified neither with -mdef option nor with
    -hmm".
    So I tried both: "pocketsphinx_continuous -mdef" and "pocketsphinx_continuous
    -hmm" and then I get the error:
    "ERROR: "cmd_ln.c", 779: Cannot open configuration file -mdef (-hmm) for
    reading".

    Thank you in advance.

     
  • Nickolay V. Shmyrev

    Could someone tell me what files from sphinxbase and pocketsphinx contain
    the code for extracting MFCCs from a speech signal, which I could just modify
    and then load into the compiler for my DSP?

    The feature extraction code is part of the sphinxbase library. The main
    header is

    sphinxbase/include/sphinxbase/fe.h
    

    The implementation is

    sphinxbase/src/libsphinxbase/fe/fe_interface.c
    sphinxbase/src/libsphinxbase/fe/fe_sigproc.c
    

    The main flow is:

    1. Initialize the feature extraction object with fe_init_auto_r
    2. Extract features from raw data with fe_process_frames
    3. Free resources

    You can find a sample in

    sphinxbase/test/unit/test_fe/test_fe.c

    Should I change to Linux?

    Pocketsphinx works perfectly well on Windows.

    So I tried both: "pocketsphinx_continuous -mdef" and "pocketsphinx_continuous
    -hmm" and then I get the error: "ERROR: "cmd_ln.c", 779: Cannot open
    configuration file -mdef (-hmm) for reading".

    To use pocketsphinx_continuous you need to specify three things: the
    acoustic model, the language model and the dictionary. Each option takes a
    path as its value, so passing -hmm or -mdef alone is not enough. For example:

    pocketsphinx_continuous.exe -hmm ../../model/hmm/en_US/hub4wsj_sc_8k -lm ../../model/lm/en/turtle.DMP -dict ../../model/lm/en/turtle.dic
    

    You can learn more about pocketsphinx by reading the tutorial

    http://cmusphinx.sourceforge.net/wiki/tutorial

     
  • pittbullz07

    pittbullz07 - 2011-10-31

    Hello,

    I will post my questions here since they are related to the topic:

    I would like to use the pocketsphinx decoder on a processor which has two
    heterogeneous cores, an ARM with a floating-point unit and a 32-bit
    fixed-point DSP. I was wondering if I could get a simple workflow for the
    decoder. Another question would be: what parts of the decoding consume the
    most time, and can any of these be parallelized, in order to use both
    processors at the same time?

    Also, I would greatly appreciate advice and pointers on which functions to
    try to offload to the DSP.

     
  • NGUYEN dang-khoa

    Is it possible to reimplement the sphinx feature extraction in
    JavaScript (of course the input is a raw audio buffer)?

     
  • pittbullz07

    pittbullz07 - 2011-11-04

    Hello,

    Any pointers regarding my question would be greatly appreciated.

     
  • Nickolay V. Shmyrev

    I was wondering if I could get a simple workflow for the decoder.

    http://cmusphinx.sourceforge.net/doc/speech.ppt

    Another question would be: what parts of the decoding consume the most
    time, and can any of these be parallelized, in order to use both processors
    at the same time?

    The major parts of the computation are acoustic scoring and Viterbi path
    propagation. Acoustic scoring is a good fit for a DSP; Viterbi path
    propagation is very hard to fit into current hardware architectures.

     
  • Nickolay V. Shmyrev

    Is it possible to reimplement the sphinx feature extraction in
    JavaScript (of course the input is a raw audio buffer)?

    JavaScript is a Turing-complete language, so it's possible to implement any
    sort of computation using it. The only problem is that it will be very slow.

     
  • pittbullz07

    pittbullz07 - 2011-11-04

    Thank you very much for the answers and pointers!

     
