"The speech utterance was segmented into 16ms frames, and
Hamming window was applied for smoothing and reducing spectral leakage. A
frame was overlapped with the next frame by 9ms. Each segmented speech frame
is parameterized using 12 Mel-frequency subband energy values. The observed
speech frame is transformed to the frequency domain using FFT. Then, a set of
12 Mel scaled filter banks which has frequency span between 200Hz to 3.2kHz,
was applied to the FFT power spectrum.(...) The vector of 12 speech power
coefficients for each speech frame, nf , was assigned to a cluster by vector
quantization"
I'd like to do what is described above. I am looking for an example of how to
obtain MFCCs using Sphinx4. I want to obtain the feature vectors and pass them
to a classifier (a computational intelligence algorithm).
P.S. Audio processing is not my area of expertise :)
Thanks, and sorry for my bad English.
See the edu.cmu.sphinx.tools.feature.FeatureFileDumper class.
Next time, please use the sphinx4 forum when you ask about sphinx4.
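For anyone who wants to see what the quoted passage computes, here is a minimal plain-Java sketch of those steps. It does not use Sphinx4 at all. Several things are assumptions of mine, since the quote does not state them: an 8 kHz sample rate, triangular filters, the 2595*log10(1 + f/700) Mel formula, and the class and method names (MelSubbandSketch, extract, and so on), which are made up for illustration. A direct DFT stands in for the FFT; it yields the same spectrum, only more slowly. Note that the output is 12 Mel subband energies per frame, as in the quote. MFCCs proper would additionally take the logarithm of these energies and apply a DCT.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Sphinx4.
final class MelSubbandSketch {
    static final double FRAME_SEC = 0.016;    // 16 ms frames (from the quote)
    static final double OVERLAP_SEC = 0.009;  // 9 ms overlap, so the hop is 7 ms
    static final int NUM_FILTERS = 12;        // 12 Mel-scaled filters (from the quote)
    static final double LOW_HZ = 200.0, HIGH_HZ = 3200.0;

    // Assumed Mel formula; the quoted text does not say which variant it used.
    static double hzToMel(double hz) { return 2595.0 * Math.log10(1.0 + hz / 700.0); }
    static double melToHz(double mel) { return 700.0 * (Math.pow(10.0, mel / 2595.0) - 1.0); }

    static double[] hammingWindow(int n) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) w[i] = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * i / (n - 1));
        return w;
    }

    // Power spectrum for bins 0..n/2, computed with a direct O(n^2) DFT.
    static double[] powerSpectrum(double[] frame) {
        int n = frame.length;
        double[] power = new double[n / 2 + 1];
        for (int k = 0; k < power.length; k++) {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            power[k] = re * re + im * im;
        }
        return power;
    }

    // Triangular filters (an assumption) whose edges are equally spaced on the Mel scale.
    static double[][] melFilterBank(int numFilters, int fftSize, double sampleRate) {
        double lowMel = hzToMel(LOW_HZ), highMel = hzToMel(HIGH_HZ);
        double[] edgeHz = new double[numFilters + 2];
        for (int i = 0; i < edgeHz.length; i++)
            edgeHz[i] = melToHz(lowMel + (highMel - lowMel) * i / (numFilters + 1));
        double[][] bank = new double[numFilters][fftSize / 2 + 1];
        for (int m = 0; m < numFilters; m++) {
            double left = edgeHz[m], centre = edgeHz[m + 1], right = edgeHz[m + 2];
            for (int k = 0; k < bank[m].length; k++) {
                double f = k * sampleRate / fftSize;  // centre frequency of bin k
                if (f > left && f <= centre) bank[m][k] = (f - left) / (centre - left);
                else if (f > centre && f < right) bank[m][k] = (right - f) / (right - centre);
            }
        }
        return bank;
    }

    // Returns one vector of 12 Mel subband energies per frame.
    static double[][] extract(double[] samples, double sampleRate) {
        int frameLen = (int) Math.round(FRAME_SEC * sampleRate);
        int hop = (int) Math.round((FRAME_SEC - OVERLAP_SEC) * sampleRate);
        int numFrames = samples.length < frameLen ? 0 : 1 + (samples.length - frameLen) / hop;
        double[] window = hammingWindow(frameLen);
        double[][] bank = melFilterBank(NUM_FILTERS, frameLen, sampleRate);
        double[][] features = new double[numFrames][NUM_FILTERS];
        double[] frame = new double[frameLen];
        for (int f = 0; f < numFrames; f++) {
            for (int i = 0; i < frameLen; i++) frame[i] = samples[f * hop + i] * window[i];
            double[] power = powerSpectrum(frame);
            for (int m = 0; m < NUM_FILTERS; m++) {
                double energy = 0.0;
                for (int k = 0; k < power.length; k++) energy += bank[m][k] * power[k];
                features[f][m] = energy;
            }
        }
        return features;
    }

    public static void main(String[] args) {
        // One second of a synthetic 1 kHz tone at the assumed 8 kHz sample rate.
        double sampleRate = 8000.0;
        double[] tone = new double[8000];
        for (int i = 0; i < tone.length; i++) tone[i] = Math.sin(2.0 * Math.PI * 1000.0 * i / sampleRate);
        double[][] features = extract(tone, sampleRate);
        System.out.println("frames: " + features.length);
        System.out.println("first vector: " + Arrays.toString(features[0]));
    }
}
```

Each row returned by extract is the kind of 12-value vector the quote then hands to vector quantization, so it could be fed to a classifier in the same way. For real recordings the samples would have to be read from an audio file first, which this sketch leaves out.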
Sorry about that, and thanks for the answer. You helped me a lot!! ;)