CMU Sphinx / Forums / Help: MFCC on DSP

Tomislav - 2011-08-01

Hello,
I'm trying to extract MFCCs from a speech signal. The idea is to record speech
and extract MFCCs from it on a portable DSP (Texas Instruments TMS320VC5505
eZdsp USB Stick), and then use these MFCCs with pocketsphinx on a nearby PC
(or server). The problem is that I don't know C++ or Java, only C. Could
someone tell me which files in sphinxbase and pocketsphinx contain the code
for extracting MFCCs from a speech signal, so that I could just modify them
and load them into the compiler for my DSP? And where do I define the
parameters (sampling frequency, number of MFCCs, ...) and the input wave file?
I tried to search the forum but I don't understand much of it, and when I
tried to examine some files (like wave2feat, fe_sigproc, ...) I just got lost.
I also noticed that most people here use Linux. I use Windows. Could that be a
problem? Should I change to Linux?

I also have a problem running PocketSphinx. I downloaded and extracted
both SphinxBase and PocketSphinx for Windows32 and did everything the
instructions said, but when I run the pocketsphinx_continuous exe file, the
cmd window appears for less than a second and then just disappears. When I
run the exe file from a cmd window, it says: "ERROR: "acmod.c", line 84:
Acoustic model definition is not specified neither with -mdef option nor with
-hmm".
So I tried both "pocketsphinx_continuous -mdef" and "pocketsphinx_continuous
-hmm", and then I get the error:
"ERROR: "cmd_ln.c", 779: Cannot open configuration file -mdef (-hmm) for
reading".

Thank you in advance.

Nickolay V. Shmyrev - 2011-08-01

Could someone tell me what files from sphinxbase and pocketsphinx contain
the code for extracting MFCCs from a speech signal, which I could just modify
and then load into the compiler for my DSP.

The feature extraction code is a part of sphinxbase library. The main header
is

sphinxbase/include/sphinxbase/fe.h

The implementation is

sphinxbase/src/libsphinxbase/fe/fe_interface.c
sphinxbase/src/libsphinxbase/fe/fe_sigproc.c

The main flow is:

Initialize the feature extraction object with fe_init_auto_r

Extract features from raw data with fe_process_frames

Free resources with fe_free

You can find the sample in

sphinxbase/test/unit/test_fe/test_fe.c
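
For anyone porting the front end to a DSP, it may help to see what the first stages of MFCC extraction look like in plain C. The following is only a minimal sketch and not the sphinxbase code: the names count_frames and prepare_frame are hypothetical, and the parameter values (16 kHz sampling, 25 ms window, 10 ms shift, pre-emphasis coefficient 0.97) are illustrative assumptions. It covers framing and pre-emphasis only; windowing, the FFT, the mel filter bank and the DCT would follow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative parameters (assumptions, not read from sphinxbase). */
#define FRAME_LEN     400u    /* 25 ms window at 16 kHz */
#define FRAME_SHIFT   160u    /* 10 ms hop at 16 kHz */
#define PREEMPH_ALPHA 0.97f   /* pre-emphasis coefficient */

/* Number of complete frames that fit in nsamps samples. */
size_t count_frames(size_t nsamps)
{
    if (nsamps < FRAME_LEN)
        return 0;
    return 1 + (nsamps - FRAME_LEN) / FRAME_SHIFT;
}

/*
 * Copy frame number idx out of the 16-bit PCM buffer and apply
 * pre-emphasis: y[n] = x[n] - alpha * x[n-1].
 * out must have room for FRAME_LEN floats.
 */
void prepare_frame(const int16_t *pcm, size_t nsamps, size_t idx, float *out)
{
    size_t start = idx * FRAME_SHIFT;
    size_t i;

    assert(idx < count_frames(nsamps));
    for (i = 0; i < FRAME_LEN; i++) {
        size_t pos = start + i;
        /* The very first sample of the recording has no predecessor. */
        float prev = (pos > 0) ? (float)pcm[pos - 1] : 0.0f;
        out[i] = (float)pcm[pos] - PREEMPH_ALPHA * prev;
    }
}
```

On a fixed-point DSP the float arithmetic here would normally be replaced by scaled integer arithmetic; the frame length and shift are the kind of parameters that depend on the sampling frequency.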

Should I change to Linux?

Pocketsphinx works perfectly well on Windows.

So I tried both "pocketsphinx_continuous -mdef" and "pocketsphinx_continuous
-hmm", and then I get the error: "ERROR: "cmd_ln.c", 779: Cannot open
configuration file -mdef (-hmm) for reading".

To use pocketsphinx_continuous you need to specify three things: the
acoustic model, the language model and the dictionary. Each option must be
followed by a path; passing -hmm or -mdef on its own is not enough. For example:

pocketsphinx_continuous.exe -hmm ../../model/hmm/en_US/hub4wsj_sc_8k -lm ../../model/lm/en/turtle.DMP -dict ../../model/lm/en/turtle.dic

You can learn more about pocketsphinx by reading the tutorial

http://cmusphinx.sourceforge.net/wiki/tutorial

pittbullz07 - 2011-10-31

Hello,

I will post my questions here since they are related to the topic:

I would like to use the pocketsphinx decoder on a processor which has two
heterogeneous cores, an ARM with a floating-point unit and a 32-bit fixed-point
DSP. I was wondering if I could get a simple workflow for the decoder.
Another question would be: what parts of the decoding consume the most time,
and can any of these be parallelized, in order to use both processors at the
same time?

Also, I would greatly appreciate advice and pointers on which functions to try
to offload to the DSP.

NGUYEN dang-khoa - 2011-11-01

Is it possible to reimplement the Sphinx feature extraction in
JavaScript (of course, the input would be a raw audio buffer)?

pittbullz07 - 2011-11-04

Hello,

Any pointers regarding my question would be greatly appreciated.

Nickolay V. Shmyrev - 2011-11-04

I was wondering if I could get a simple workflow for the decoder.

http://cmusphinx.sourceforge.net/doc/speech.ppt

Another question would be: what parts of the decoding consume the most time,
and can any of these be parallelized, in order to use both processors at the
same time?

The major parts of the computation are acoustic scoring and Viterbi path
propagation. Acoustic scoring is a good fit for a DSP; Viterbi path
propagation is very hard to fit into current hardware architectures.
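
To illustrate why acoustic scoring maps well onto a DSP, here is a minimal sketch of scoring one feature vector against a set of diagonal-covariance Gaussians. It is not the pocketsphinx implementation, which is organized differently; the type gaussian_t, the functions gaussian_logprob and score_frame, and the dimension of 13 are all hypothetical names and values chosen for the example. The point is that the inner loop is a regular chain of subtract, multiply and accumulate operations with no data-dependent branching.

```c
#include <assert.h>
#include <stddef.h>

#define FEAT_DIM 13  /* illustrative feature vector length (assumption) */

/* One diagonal-covariance Gaussian with precomputed terms. */
typedef struct {
    float mean[FEAT_DIM];
    float inv_2var[FEAT_DIM]; /* 1 / (2 * variance) per dimension */
    float log_norm;           /* precomputed log normalization constant */
} gaussian_t;

/* Log-likelihood of one feature vector under one Gaussian. */
float gaussian_logprob(const gaussian_t *g, const float *feat)
{
    float acc = g->log_norm;
    size_t d;

    for (d = 0; d < FEAT_DIM; d++) {
        float diff = feat[d] - g->mean[d];
        acc -= diff * diff * g->inv_2var[d];  /* multiply-accumulate */
    }
    return acc;
}

/*
 * Score one frame against n Gaussians. Fills scores[0..n-1] and
 * returns the index of the best-scoring Gaussian.
 */
size_t score_frame(const gaussian_t *gs, size_t n,
                   const float *feat, float *scores)
{
    size_t best = 0, k;

    assert(n > 0);
    for (k = 0; k < n; k++) {
        scores[k] = gaussian_logprob(&gs[k], feat);
        if (scores[k] > scores[best])
            best = k;
    }
    return best;
}
```

Each Gaussian is scored independently of the others, so the work can be split across cores or vector lanes. Viterbi propagation, by contrast, follows the structure of the search graph, and its memory accesses are irregular.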

Nickolay V. Shmyrev - 2011-11-04

Is it possible to reimplement the Sphinx feature extraction in
JavaScript (of course, the input would be a raw audio buffer)?

JavaScript is a Turing-complete language, so any sort of computation can be
implemented in it. The only problem is that it will be very slow.

pittbullz07 - 2011-11-04

Thank you very much for the answers and pointers!
