Hello,
I'm trying to extract MFCCs from a speech signal. The idea is to record speech
and extract MFCCs from it on a portable DSP (Texas Instruments TMS320VC5505
eZdsp USB Stick), and then use these MFCCs with pocketsphinx on a nearby PC
(or server). The problem is that I don't know C++ or Java, only C. Could
someone tell me which files from sphinxbase and pocketsphinx contain the code
for extracting MFCCs from a speech signal, so that I could just modify them
and load them into the compiler for my DSP? And where do I define the
parameters (sampling frequency, number of MFCCs, ...) and the input wave file?
I tried to search the forum but I don't understand much of it, and when I
tried to examine some files (like wave2feat, fe_sigproc, ...) I just got lost.
I also noticed that most people here use Linux. I use Windows. Could that be a
problem? Should I switch to Linux?
I also have a problem with running PocketSphinx. I downloaded and extracted
both SphinxBase and PocketSphinx for 32-bit Windows and did everything the
instructions said, but when I run the pocketsphinx_continuous exe file, the
cmd window appears for less than a second and then just disappears. When I
run the exe file from a cmd window, it says: "ERROR: "acmod.c", line 84:
Acoustic model definition is not specified neither with -mdef option nor with
-hmm".
So I tried both "pocketsphinx_continuous -mdef" and "pocketsphinx_continuous
-hmm", and then I get the error:
"ERROR: "cmd_ln.c", 779: Cannot open configuration file -mdef (-hmm) for
reading".
Thank you in advance.
Could someone tell me which files from sphinxbase and pocketsphinx contain
the code for extracting MFCCs from a speech signal, so that I could just
modify them and load them into the compiler for my DSP?
The feature extraction code is part of the sphinxbase library. The main
header is fe.h (include/sphinxbase/fe.h); the implementation lives under
src/libsphinxbase/fe/ (fe_interface.c, fe_sigproc.c, ...). The main flow is:
1. Initialize the feature extraction object with fe_init_auto_r
2. Extract features from raw data with fe_process_frames
3. Free resources with fe_free
You can find a sample in
sphinxbase/test/unit/test_fe/test_fe.c
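If a concrete starting point helps, below is a rough sketch of that flow in
plain C. It is illustrative rather than authoritative: "input.raw" is a
placeholder file name, the option values are examples, and exact signatures
can differ between sphinxbase versions, so check fe.h in your copy. It uses
fe_process_utt(), a convenience wrapper that allocates the output block for
you; fe_process_frames() is the lower-level streaming variant shown in
test_fe.c.

    /* Sketch of sphinxbase MFCC extraction (untested; see fe.h). */
    #include <stdio.h>
    #include <sphinxbase/fe.h>
    #include <sphinxbase/cmd_ln.h>
    #include <sphinxbase/ckd_alloc.h>

    int main(void)
    {
        cmd_ln_t *config;
        fe_t *fe;
        FILE *fh;
        long nbytes;
        size_t nsamps;
        int16 *samples;
        mfcc_t **cepstra;
        int32 nframes, ncep, i;

        /* Parameters (sampling rate, number of MFCCs, ...) are set
         * here; fe_get_args() lists every option the front end accepts. */
        config = cmd_ln_init(NULL, fe_get_args(), TRUE,
                             "-samprate", "16000",
                             "-ncep", "13",
                             NULL);
        fe = fe_init_auto_r(config);
        ncep = fe_get_output_size(fe);  /* coefficients per frame */

        /* Read a whole headerless 16 kHz, 16-bit, mono file into memory
         * ("input.raw" is a placeholder name). */
        fh = fopen("input.raw", "rb");
        fseek(fh, 0, SEEK_END);
        nbytes = ftell(fh);
        fseek(fh, 0, SEEK_SET);
        nsamps = nbytes / sizeof(int16);
        samples = ckd_malloc(nsamps * sizeof(int16));
        fread(samples, sizeof(int16), nsamps, fh);
        fclose(fh);

        /* Extract MFCCs: fe_process_utt() allocates one row of ncep
         * coefficients per output frame. */
        fe_start_utt(fe);
        fe_process_utt(fe, samples, nsamps, &cepstra, &nframes);
        printf("%d frames of %d coefficients\n", nframes, ncep);
        for (i = 0; i < nframes; i++)
            printf("frame %d: c0 = %f\n", i, (double)cepstra[i][0]);
        /* (In fixed-point builds mfcc_t is an integer type, so the
         * printed values are scaled.) */

        /* Free resources. */
        fe_free_2d(cepstra);
        ckd_free(samples);
        fe_free(fe);
        cmd_ln_free_r(config);
        return 0;
    }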
Should I change to Linux?
Pocketsphinx works perfectly on Windows.
So I tried both "pocketsphinx_continuous -mdef" and "pocketsphinx_continuous
-hmm", and then I get the error: "ERROR: "cmd_ln.c", 779: Cannot open
configuration file -mdef (-hmm) for reading".
To use pocketsphinx_continuous you need to specify three entities: the
acoustic model, the language model, and the dictionary. Note that -mdef and
-hmm each take an argument; when you run pocketsphinx_continuous with a
single bare option like -mdef, it treats that lone argument as the name of a
configuration file, which is exactly the "Cannot open configuration file"
error you saw.
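A typical invocation looks like this (the three paths are placeholders;
point them at wherever your acoustic model directory, language model, and
dictionary actually live):

    pocketsphinx_continuous -hmm /path/to/acoustic_model_dir -lm /path/to/language_model.lm -dict /path/to/dictionary.dict

You can learn more about pocketsphinx by reading the tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorial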
I will post my questions here since they are related to the topic:
I would like to use the pocketsphinx decoder on a processor which has two
heterogeneous cores: an ARM with a floating-point unit and a 32-bit
fixed-point DSP. I was wondering if I could get a simple workflow for the
decoder.
Another question would be: what parts of the decoding consume the most time,
and can any of them be parallelized, in order to use both processors at the
same time?
Also, I would greatly appreciate advice and pointers on which functions to
try to offload to the DSP.
Another question would be: what parts of the decoding consume the most time,
and can any of them be parallelized, in order to use both processors at the
same time?
The major parts of the computation are acoustic scoring and Viterbi path
propagation. Acoustic scoring is a good fit for a DSP; Viterbi path
propagation is very hard to fit onto current hardware architectures.
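To make the distinction concrete, here is a hypothetical sketch (not
pocketsphinx's actual code) of the inner loop of acoustic scoring: evaluating
a diagonal-covariance Gaussian against one frame of features. It is a dense,
branch-free multiply-accumulate loop, exactly what DSP MAC units are built
for. Viterbi propagation, by contrast, chases pointers through an irregular
search graph, which resists this kind of mapping.

    /* Hypothetical sketch of Gaussian (senone) scoring, the DSP-friendly
     * part of decoding.  Computes the unnormalized log-likelihood
     * log N(x; mean, var) = log_norm - 0.5 * sum_i (x_i - mean_i)^2 / var_i.
     * Not the actual pocketsphinx code. */
    #define NDIM 13   /* cepstral coefficients per frame */

    float gaussian_log_score(const float x[NDIM], const float mean[NDIM],
                             const float inv_var[NDIM], float log_norm)
    {
        float acc = log_norm;
        int i;
        /* Dense multiply-accumulate: maps directly onto DSP MAC units. */
        for (i = 0; i < NDIM; i++) {
            float d = x[i] - mean[i];
            acc -= 0.5f * d * d * inv_var[i];
        }
        return acc;
    }

A decoder evaluates many such densities every frame, which is what makes this
part worth offloading to the DSP.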
Is it possible to reimplement the sphinx feature extraction in JavaScript
(of course, the input would be a raw audio buffer)?
Hello,
Any pointer regarding my question would be greatly appreciated.
This presentation may help:
http://cmusphinx.sourceforge.net/doc/speech.ppt
JavaScript is a Turing-complete language, so it is possible to implement any
sort of computation in it. The only problem is that it will be very slow.
Thank you very much for the answers and pointers!