I'm sure this question has been asked before, but all I could find were questions about Sphinx 4. So here goes:
I want to extract MFCCs from PCM data using PocketSphinx. I don't want to perform any other processing, just MFCC extraction. Is there some existing code that shows how to do that?
Also, what MFCC format does PocketSphinx use? Does it add a dummy coefficient with the signal power?
I also looked at other C/C++ implementations of MFCC extraction. But those that looked good either came with a non-permissive license (GPL) or were part of huge libraries.
Any help on how to extract MFCCs in code with PocketSphinx would be appreciated!
Last edit: Daniel Wolf 2018-09-18
Hi Daniel,
I did an MSc project in 2008 to create an application for Windows CE/Windows Mobile 5.0 that captured speech and converted it to text (speech-to-text) using PocketSphinx 0.5. At the time of the project, PocketSphinx 0.5 had not yet been officially ported to or supported on Windows Mobile.
I too was interested in getting MFCCs in near real time for a personal project. I gave it a go a while back, about 2 years ago. I need to dig through my external hard disks to find out how far I got with it. Off the top of my head, it may have been somewhere around the pitch detection code, but I'm not sure.
Regards
Last edit: Nickolay V. Shmyrev 2018-09-19
The code is available in the sphinxbase sources:
https://github.com/cmusphinx/sphinxbase/blob/master/test/unit/test_fe/test_fe.c
https://github.com/cmusphinx/sphinxbase/blob/master/src/sphinx_fe/sphinx_fe.c
The API documentation is also available:
https://github.com/cmusphinx/sphinxbase/blob/master/include/sphinxbase/fe.h
The format is covered on the wiki page:
https://cmusphinx.github.io/wiki/mfcformat/
The signal power coefficient is not a dummy; it is an important feature value. CMUSphinx uses the 0th cepstral coefficient together with the others.
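For anyone who wants to read the resulting feature files without linking against sphinxbase, here is a minimal sketch in plain C. It assumes the layout as I understand it from the wiki page: a 4-byte integer header holding the number of 32-bit floats in the file, followed by the float data, in either byte order. The names `mfc_read` and `swap32` are hypothetical helpers made up for this example; they are not part of the sphinxbase API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reverse the byte order of a 32-bit word. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Hypothetical helper (not a sphinxbase function): load an MFC file.
 * Assumed layout: int32 count of floats, then that many 32-bit floats.
 * Returns a malloc'ed array and stores its length in *n_floats,
 * or returns NULL if the file cannot be read or the header does not
 * match the file size in either byte order.
 */
float *mfc_read(const char *path, size_t *n_floats)
{
    FILE *fp;
    float *data = NULL;
    uint32_t header;
    long size;
    int swapped = 0;
    size_t i, n;

    assert(sizeof(float) == 4); /* the format stores 32-bit floats */

    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Find the file size, then read the 4-byte header. */
    if (fseek(fp, 0, SEEK_END) != 0 || (size = ftell(fp)) < 4 ||
        fseek(fp, 0, SEEK_SET) != 0 || fread(&header, 4, 1, fp) != 1)
        goto fail;

    /* The header must account for the rest of the file.
       If it does not, retry with the opposite byte order. */
    if ((uint64_t)header * 4 + 4 != (uint64_t)size) {
        header = swap32(header);
        swapped = 1;
        if ((uint64_t)header * 4 + 4 != (uint64_t)size)
            goto fail;
    }

    n = header;
    data = malloc((n ? n : 1) * sizeof(float));
    if (data == NULL || fread(data, 4, n, fp) != n)
        goto fail;

    if (swapped) {
        /* Swap each value through an integer copy to avoid aliasing issues. */
        for (i = 0; i < n; i++) {
            uint32_t w;
            memcpy(&w, &data[i], 4);
            w = swap32(w);
            memcpy(&data[i], &w, 4);
        }
    }

    fclose(fp);
    *n_floats = n;
    return data;

fail:
    free(data);
    fclose(fp);
    return NULL;
}
```

Assuming the default of 13 cepstral coefficients per frame, coefficient `k` of frame `t` would then be `data[t * 13 + k]`, with `k = 0` being the 0th coefficient mentioned above. If the features were extracted with a different number of cepstra, replace 13 accordingly.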
Thank you very much!