First, I would like to direct your attention to my thread "The Human Voice has been widely characterized, correct?" in the comp.dsp Usenet forum.
TIA.
This informal post should provide more information about my question:
http://forum.audacityteam.org/viewtopic.php?f=21&t=83058
I am wondering whether somebody from the Sphinx initiative could join the "Audacity" group in order to develop a strong Noise Reduction application.
Dear Travis
Sorry, it's hard to comment on a discussion without a direct question.
Overall, there are many algorithms and methods for separating speech from non-speech; the most advanced of them can learn a model of human speech and discriminate effectively. One active area of research is non-negative matrix factorization, which allows you to separate speech from overlapping sounds using a vocabulary of speech atoms. It's a wide area of research.
The algorithms for audio enhancement are certainly related to CMUSphinx and could help improve its accuracy, but they are not our primary focus right now, so it is not realistic to expect us to implement noise reduction for Audacity.
If you have questions on speech processing, let us know.
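As a rough illustration of the non-negative matrix factorization idea mentioned above, here is a minimal sketch in Java. It is not CMUSphinx code. It assumes a magnitude spectrogram V (frequency bins x frames) and a dictionary W whose first columns are "speech atoms" and whose remaining columns are "noise atoms", both of which would have to be learned beforehand. The class name NmfSketch, the method names, and the toy numbers are all made up for the example. The dictionary is kept fixed, the activations H are estimated with the standard multiplicative update for the Euclidean cost, and a soft mask keeps the share of each cell explained by the speech atoms.

```java
import java.util.Arrays;

public class NmfSketch {
    static final double EPS = 1e-9; // guards against division by zero

    /** Plain matrix product of an (n x k) and a (k x m) matrix. */
    public static double[][] mul(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++)
            for (int p = 0; p < k; p++)
                for (int j = 0; j < m; j++)
                    c[i][j] += a[i][p] * b[p][j];
        return c;
    }

    public static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    /** Estimates non-negative H with V ~= W H, keeping the dictionary W fixed. */
    public static double[][] activations(double[][] v, double[][] w, int iterations) {
        double[][] wt = transpose(w);
        double[][] wtv = mul(wt, v);
        double[][] wtw = mul(wt, w);
        double[][] h = new double[w[0].length][v[0].length];
        for (double[] row : h) Arrays.fill(row, 1.0); // positive starting point
        for (int it = 0; it < iterations; it++) {
            double[][] den = mul(wtw, h);
            // Multiplicative update: H <- H * (W^T V) / (W^T W H), element-wise.
            for (int a = 0; a < h.length; a++)
                for (int j = 0; j < h[0].length; j++)
                    h[a][j] *= wtv[a][j] / (den[a][j] + EPS);
        }
        return h;
    }

    /** Soft mask: keeps the part of V explained by the first speechAtoms columns of W. */
    public static double[][] speechEstimate(double[][] v, double[][] w,
                                            double[][] h, int speechAtoms) {
        double[][] out = new double[v.length][v[0].length];
        for (int i = 0; i < v.length; i++)
            for (int j = 0; j < v[0].length; j++) {
                double speech = 0, total = 0;
                for (int a = 0; a < w[0].length; a++) {
                    double part = w[i][a] * h[a][j];
                    total += part;
                    if (a < speechAtoms) speech += part;
                }
                out[i][j] = v[i][j] * speech / (total + EPS);
            }
        return out;
    }

    public static void main(String[] args) {
        // Toy data: 3 frequency bins, 2 frames; atom 0 is "speech", atom 1 is "noise".
        double[][] w = {{1, 0}, {1, 1}, {0, 1}};
        double[][] v = mul(w, new double[][] {{2, 1}, {1, 3}});
        double[][] h = activations(v, w, 500);
        System.out.println(Arrays.deepToString(speechEstimate(v, w, h, 1)));
    }
}
```

In a real system the spectrogram would come from a short-time Fourier transform of the recording, and the masked magnitudes would be combined with the original phase to resynthesize audio; both steps are left out here.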
Thanks for your reply, Nickolay! Trust me, it hasn't been easy to obtain informed answers. Before I commit to learning and using Sphinx further, I would like to ascertain that it will not be a dead end. As an implementer, I am aware that I am taking the risk of being close to the infamous "bleeding edge", but so far, I am very encouraged. The work done by the Sphinx team is nothing short of remarkable.
I was just reading the paper "The CMU Sphinx-4 Speech Recognition System" by Lamere, Kwok, Gouvea, Raj, Singh, Walker and Wolf. This part piqued my interest:
"One of the features of this design is that the output of any
of the blocks can be tapped. Similarly, the actual input to the
system need not be at the first block, but can be at any of the
intermediate blocks. [...] In addition, any of the blocks can be
replaced."
Question 1:
If my interest is to start with a WAV file with voice+noise and write Java software that will output a WAV file, with (hopefully!) the voice component only, which parts should I be "tapping"?
Question 2:
I have read about "Hello, world!" material for Sphinx, but I have not been able to find it anywhere. Can you point me to the basic building blocks, given my interest above?
TIA.
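For the WAV-in, WAV-out part of Question 1, the file handling can be done with the standard javax.sound.sampled package, independently of Sphinx. The sketch below only shows that scaffolding and assumes 16-bit signed PCM input. The class name WavRoundTrip and the process method are hypothetical placeholders: process simply copies the samples, and it marks the spot where an actual voice/noise separation step would go.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

public class WavRoundTrip {

    /** Hypothetical placeholder for the separation step; here it just copies. */
    public static short[] process(short[] samples) {
        return samples.clone();
    }

    /** Converts raw 16-bit PCM bytes into samples. */
    public static short[] decode(byte[] pcm, boolean bigEndian) {
        short[] out = new short[pcm.length / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = bigEndian ? pcm[2 * i] : pcm[2 * i + 1];
            int lo = bigEndian ? pcm[2 * i + 1] : pcm[2 * i];
            out[i] = (short) ((hi << 8) | (lo & 0xFF));
        }
        return out;
    }

    /** Converts samples back into raw 16-bit PCM bytes. */
    public static byte[] encode(short[] samples, boolean bigEndian) {
        byte[] out = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            byte hi = (byte) (samples[i] >> 8);
            byte lo = (byte) samples[i];
            out[2 * i] = bigEndian ? hi : lo;
            out[2 * i + 1] = bigEndian ? lo : hi;
        }
        return out;
    }

    /** Reads a WAV file, runs process() on its samples, writes a new WAV file. */
    public static void run(File in, File out)
            throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream ais = AudioSystem.getAudioInputStream(in)) {
            AudioFormat fmt = ais.getFormat();
            if (!AudioFormat.Encoding.PCM_SIGNED.equals(fmt.getEncoding())
                    || fmt.getSampleSizeInBits() != 16) {
                throw new IllegalArgumentException("sketch handles 16-bit signed PCM only");
            }
            // Read whole frames at a time until the stream is exhausted.
            ByteArrayOutputStream raw = new ByteArrayOutputStream();
            byte[] chunk = new byte[fmt.getFrameSize() * 1024];
            int n;
            while ((n = ais.read(chunk)) > 0) {
                raw.write(chunk, 0, n);
            }
            byte[] result = encode(process(decode(raw.toByteArray(), fmt.isBigEndian())),
                                   fmt.isBigEndian());
            AudioInputStream cleaned = new AudioInputStream(
                    new ByteArrayInputStream(result), fmt, result.length / fmt.getFrameSize());
            AudioSystem.write(cleaned, AudioFileFormat.Type.WAVE, out);
        }
    }

    public static void main(String[] args) throws Exception {
        run(new File(args[0]), new File(args[1]));
    }
}
```

It would be run as "java WavRoundTrip noisy.wav cleaned.wav", where both file names are examples. Multi-channel files pass through with their samples interleaved, so a real processing step would need to split the channels first.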
That paper is pretty outdated.
For a "Hello, world!" starting point, see the tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4
That was my guess, when I read "Sun Microsystems". RIP :-(
The proverbial picture (worth 1K words) is attached:
That's fun; hopefully you'll get to the right side.
Nickolay:
You just told me that you Sphinx developers are not interested in noise removal from WAV audio clips (at this time, anyway), which happens to be my excuse (my only justification, really) to get involved with Sphinx.
IOW: The "right" mountain is moving away... :-(
I should mention that my interest is neither commercial nor academic. It simply piqued my curiosity.
BTW: I have been away from Java (my first OO language) for years. Due to the realities of the business world I moved to C++ and currently C#. I am -as we speak (another pun!)- getting the latest version of NetBeans, etc. Still have not decided whether to do Sphinx development under Windows or Linux, though.
Last edit: Travis Banger 2015-01-16