
adding processing techniques to sphinxbase

luciano
2011-05-06
2012-09-22
  • luciano

    luciano - 2011-05-06

    Hello,
    I'd like to include some pre/post processing techniques in a pocketsphinx
    application. What I am doing now is using external programs to apply the
    techniques before training/decoding.
    Where would be the proper place to include them within the sphinxbase
    source code?
    Here are some examples I'd like to use:

    Spectral subtraction from: Yang Lu, Philipos C. Loizou. A geometric approach
    to spectral subtraction. Speech Communication 50 (2008) 453-466.
    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2516309/

    Should it be somewhere inside cont_ad_base.c?

    Spectral normalization with histogram equalization: Ángel de la Torre, et al.
    Histogram Equalization of Speech Representation for Robust Speech Recognition.
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 13, NO. 3, MAY 2005
    http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1420370&isnumber=30690

    cmn.c?

    Filtering in the cepstral domain: Chia-Ping Chen and Jeff A. Bilmes. MVA
    Processing of Speech Features. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND
    LANGUAGE PROCESSING, VOL. 15, NO. 1, JANUARY 2007
    http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4032763&isnumber=4032760

    This technique uses several frames. feat.c?
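To illustrate why a multi-frame technique needs a stage that can see a buffer of frames, here is a minimal sketch of the ARMA smoothing step of MVA for a single cepstral dimension. The function name `arma_smooth` is hypothetical and is not part of sphinxbase. The filter form (M past outputs plus the current and M future inputs, divided by 2M + 1) is written from memory of the Chen and Bilmes formulation, and leaving the edge frames unfiltered is an assumption, so check both against the paper. The mean and variance normalization steps of MVA are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of sphinxbase.
 * Smooths one cepstral dimension x[0 .. n_frames-1] in place:
 *   y[t] = (y[t-M] + ... + y[t-1] + x[t] + ... + x[t+M]) / (2M + 1)
 * Frames closer than M to either end are left unfiltered (assumption). */
void arma_smooth(float *x, size_t n_frames, size_t order)
{
    assert(x != NULL);
    if (order == 0 || n_frames < 2 * order + 1)
        return; /* not enough frames for one full window */

    for (size_t t = order; t + order < n_frames; ++t) {
        float sum = 0.0f;
        for (size_t k = 1; k <= order; ++k)
            sum += x[t - k];   /* past values, already smoothed in place */
        for (size_t k = 0; k <= order; ++k)
            sum += x[t + k];   /* current and future inputs, not yet touched */
        x[t] = sum / (float)(2 * order + 1);
    }
}
```

The function would be called once per cepstral dimension. Because each output needs `order` frames of look-ahead, it cannot run frame by frame as the samples arrive; it needs either the whole utterance or a buffer at least `order` frames deep.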

    In order to include these new options, should I modify any of the fe_parse_XX
    functions in fe_interface.c or should I do something else?

    Thank you very much in advance
    Luciano

     
  • Nickolay V. Shmyrev

    Hello Luciano

    I'd like to include some pre/post processing techniques in a pocketsphinx
    application

    That would be a great advancement. Mostly your locations are OK, but I suggest
    you create new files for some of the methods you proposed. Just as
    cmn.c is a separate file, spectral subtraction could be a separate one.
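As a rough idea of what such a separate file could hold, here is a minimal sketch of plain power spectral subtraction with an over-subtraction factor and a spectral floor. This is the basic textbook variant, not the geometric approach from the Lu and Loizou paper. The file name `spec_sub.c`, the function name, and the parameters are all hypothetical and are not part of sphinxbase. How the noise power estimate is obtained (for example from non-speech frames) is left out.

```c
/* spec_sub.c: hypothetical file name, not part of sphinxbase. */
#include <assert.h>
#include <stddef.h>

/* Subtract a noise power estimate from one frame's power spectrum, in place.
 *   alpha       over-subtraction factor (1.0 means plain subtraction)
 *   floor_ratio fraction of the noise power kept as a lower bound, so the
 *               result never goes negative */
void spec_sub_frame(float *power, const float *noise_power, size_t n_bins,
                    float alpha, float floor_ratio)
{
    assert(power != NULL && noise_power != NULL);
    for (size_t i = 0; i < n_bins; ++i) {
        float cleaned = power[i] - alpha * noise_power[i];
        float min_power = floor_ratio * noise_power[i];
        power[i] = cleaned > min_power ? cleaned : min_power;
    }
}
```

Keeping the routine behind a single per-frame function like this makes it easy to call from whichever stage ends up owning the power spectrum.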

    Honestly, I think that if we approach this seriously we need to redesign
    the whole sphinxbase frontend. It should be far more flexible in terms of
    which processing stages to include and how to perform them. For example, some
    of the approaches will require us to apply VAD not only in the first stage in
    cont_ad but also in later stages, after the FFT or even after the cepstrum is
    computed. We will also need a feedback loop.
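One way to picture such a flexible frontend is a chain of stages behind a common function-pointer interface. The sketch below is only an illustration of that idea: every type and function name in it is hypothetical, none of it is the sphinxbase API, and the feedback loop mentioned above is not modelled. The energy gate is a toy stand-in for VAD.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the sphinxbase API. */
typedef struct proc_stage {
    const char *name;
    /* Process one frame in place; return 0 to pass it on, nonzero to drop it. */
    int (*process)(void *state, float *frame, size_t n);
    void *state;
} proc_stage_t;

/* Run a frame through every stage in order.
 * Returns 1 if the frame survived all stages, 0 if some stage dropped it. */
int pipeline_run(const proc_stage_t *stages, size_t n_stages,
                 float *frame, size_t n)
{
    assert(stages != NULL || n_stages == 0);
    for (size_t i = 0; i < n_stages; ++i) {
        if (stages[i].process(stages[i].state, frame, n) != 0)
            return 0;
    }
    return 1;
}

/* Example stage: multiply every value by a gain stored in state. */
int gain_stage(void *state, float *frame, size_t n)
{
    float gain = *(const float *)state;
    for (size_t i = 0; i < n; ++i)
        frame[i] *= gain;
    return 0;
}

/* Example stage: toy energy gate standing in for VAD.
 * Drops the frame when its energy is below the threshold in state. */
int energy_gate_stage(void *state, float *frame, size_t n)
{
    float threshold = *(const float *)state;
    float energy = 0.0f;
    for (size_t i = 0; i < n; ++i)
        energy += frame[i] * frame[i];
    return energy < threshold;
}
```

Because the gate is just another entry in the stage array, it can be placed before or after any other stage, which is the kind of freedom needed to run VAD after the FFT or after the cepstrum instead of only at the input.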

    I suggest you do a few design sessions first to work it out. For that we
    need to review the frontends of other toolkits, for example RWTH-ASR and
    Julius. We need to create an advanced, state-of-the-art frontend with a
    pluggable and efficient design.

    In order to include these new options, should I modify any of the fe_parse_XX
    functions in fe_interface.c or should I do something else?

    There is no problem with creating a new function in the same file.

     
  • luciano

    luciano - 2011-05-09

    Hello Nickolay, thank you very much for your reply.
    I will take your advice and think over how best to implement and
    include the algorithms.
    I'll keep in touch about this.
    Thanks again,
    Luciano

     
