CMU Sphinx / Forums / Speech Recognition Theory: Can Voice Recognition be used for audioclip Noise Removal? Has it been?

Travis Banger - 2015-01-14

First, I would like to direct your attention to my thread "The Human Voice has been widely characterized, correct?" in the comp.dsp Usenet forum.

TIA.

- Travis Banger - 2015-01-14
  
  This informal post should provide more information about my question:
  
  http://forum.audacityteam.org/viewtopic.php?f=21&t=83058
  
  I am wondering whether somebody from the Sphinx initiative could join the "Audacity" group in order to develop a strong Noise Reduction application.
  
  - Nickolay V. Shmyrev - 2015-01-14
    
    Dear Travis
    
    Sorry, it's hard to comment on a discussion without a direct question.
    
    Overall, there are many algorithms and methods to separate speech from non-speech; the most advanced of them can learn a model of human speech and discriminate effectively. One active area of research is non-negative matrix factorization, which allows you to separate speech from overlapping sounds using a vocabulary of speech atoms. It's a wide area of research.
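    
    A minimal sketch of the factorization step behind that idea, assuming the classic multiplicative-update rule for the squared-error objective. The class and method names (`NmfSketch`, `update`, `error`) are hypothetical and not part of CMUSphinx. In a speech setting, V would be a magnitude spectrogram and the columns of W would play the role of the "speech atoms"; here V is just a small synthetic non-negative matrix.
    
    ```java
    import java.util.Random;

    /** Sketch only: factor a non-negative matrix V (F x T) as W (F x K) times H (K x T). */
    class NmfSketch {
        static final double EPS = 1e-9; // keeps the divisions below away from zero

        static double[][] multiply(double[][] a, double[][] b) {
            int n = a.length, inner = b.length, m = b[0].length;
            double[][] c = new double[n][m];
            for (int i = 0; i < n; i++)
                for (int p = 0; p < inner; p++)
                    for (int j = 0; j < m; j++)
                        c[i][j] += a[i][p] * b[p][j];
            return c;
        }

        static double[][] transpose(double[][] a) {
            double[][] t = new double[a[0].length][a.length];
            for (int i = 0; i < a.length; i++)
                for (int j = 0; j < a[0].length; j++)
                    t[j][i] = a[i][j];
            return t;
        }

        /** Matrix with strictly positive random entries. */
        static double[][] randomMatrix(int rows, int cols, Random rng) {
            double[][] m = new double[rows][cols];
            for (double[] row : m)
                for (int j = 0; j < cols; j++)
                    row[j] = 0.1 + rng.nextDouble();
            return m;
        }

        /** Squared reconstruction error between V and W*H. */
        static double error(double[][] v, double[][] w, double[][] h) {
            double[][] wh = multiply(w, h);
            double sum = 0.0;
            for (int i = 0; i < v.length; i++)
                for (int j = 0; j < v[0].length; j++) {
                    double d = v[i][j] - wh[i][j];
                    sum += d * d;
                }
            return sum;
        }

        /** One round of multiplicative updates; W and H are modified in place and stay non-negative. */
        static void update(double[][] v, double[][] w, double[][] h) {
            double[][] wt = transpose(w);
            double[][] num = multiply(wt, v);               // K x T
            double[][] den = multiply(multiply(wt, w), h);  // K x T
            for (int k = 0; k < h.length; k++)
                for (int t = 0; t < h[0].length; t++)
                    h[k][t] *= num[k][t] / (den[k][t] + EPS);

            double[][] ht = transpose(h);
            num = multiply(v, ht);                          // F x K
            den = multiply(w, multiply(h, ht));             // F x K
            for (int f = 0; f < w.length; f++)
                for (int k = 0; k < w[0].length; k++)
                    w[f][k] *= num[f][k] / (den[f][k] + EPS);
        }

        public static void main(String[] args) {
            Random rng = new Random(42);
            // Synthetic stand-in for a spectrogram: an exactly rank-2 non-negative matrix.
            double[][] v = multiply(randomMatrix(6, 2, rng), randomMatrix(2, 8, rng));
            double[][] w = randomMatrix(6, 2, rng);
            double[][] h = randomMatrix(2, 8, rng);
            System.out.println("error before: " + error(v, w, h));
            for (int i = 0; i < 200; i++) update(v, w, h);
            System.out.println("error after:  " + error(v, w, h));
        }
    }
    ```
    
    For actual separation, one common arrangement is to learn one set of atoms from clean speech and another from noise, then keep only the part of the mixture explained by the speech atoms. That step is not shown here.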
    
    The algorithms for audio improvement are certainly related to CMUSphinx and could help improve its accuracy, but they are not our primary focus right now, so you should not expect us to implement noise reduction for Audacity.
    
    If you have questions on speech processing let us know.
    
    - Travis Banger - 2015-01-15
      
      Thanks for your reply, Nickolay! Trust me, it hasn't been easy to obtain informed answers. Before I commit to learning and using Sphinx further, I would like to make sure that it will not be a dead end. As an implementer, I am aware that I am taking the risk of being close to the infamous "bleeding edge", but so far I am very encouraged. The work done by the Sphinx team is nothing short of remarkable.
      
      I was just reading the paper "The CMU Sphinx-4 Speech Recognition System" by Lamere, Kwok, Gouvea, Raj, Singh, Walker and Wolf. This part piqued my interest:
      
      "One of the features of this design is that the output of any
      of the blocks can be tapped. Similarly, the actual input to the
      system need not be at the first block, but can be at any of the
      intermediate blocks. [...] In addition, any of the blocks can be
      replaced."
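      
      The design described in that quote can be illustrated with a small generic sketch. This is not the Sphinx-4 front-end API: `TappablePipeline`, `run`, `replace` and `map` are hypothetical names chosen for illustration, and each "block" is simply a function from one array of samples to another.
      
      ```java
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.List;
      import java.util.function.DoubleUnaryOperator;
      import java.util.function.UnaryOperator;

      /** Sketch only: a chain of processing blocks whose input and output points can be chosen freely. */
      class TappablePipeline {
          private final List<UnaryOperator<double[]>> stages = new ArrayList<>();

          TappablePipeline add(UnaryOperator<double[]> stage) {
              stages.add(stage);
              return this;
          }

          /** Swap the block at the given position for another one. */
          void replace(int index, UnaryOperator<double[]> stage) {
              stages.set(index, stage);
          }

          /**
           * Feed the input into block 'from' and tap the output just before block 'to'.
           * run(x, 0, size) is the full chain; other ranges inject or tap at intermediate blocks.
           */
          double[] run(double[] input, int from, int to) {
              double[] data = input;
              for (int i = from; i < to; i++) {
                  data = stages.get(i).apply(data);
              }
              return data;
          }

          /** Helper: build a block that applies a function to every sample. */
          static UnaryOperator<double[]> map(DoubleUnaryOperator f) {
              return samples -> Arrays.stream(samples).map(f).toArray();
          }

          public static void main(String[] args) {
              TappablePipeline p = new TappablePipeline()
                      .add(map(x -> x + 1.0))   // block 0
                      .add(map(x -> x * 2.0))   // block 1
                      .add(map(x -> -x));       // block 2
              double[] input = {1.0, 2.0};
              System.out.println(Arrays.toString(p.run(input, 0, 2))); // tap after block 1
              System.out.println(Arrays.toString(p.run(input, 1, 3))); // inject at block 1
          }
      }
      ```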
      
      Question 1:
      
      If my interest is to start with a WAV file with voice+noise and write Java software that will output a WAV file, with (hopefully!) the voice component only, which parts should I be "tapping"?
      
      Question 2:
      
      I have read about a "Hello, world!" material for Sphinx, but I have not found it anywhere. Can you point me to the basic building blocks, given my interest above?
      
      TIA.
      
      - Nickolay V. Shmyrev - 2015-01-15
        
        > I was just reading the paper "The CMU Sphinx-4 Speech Recognition System" by Lamere, Kwok, Gouvea, Raj, Singh, Walker and Wolf. This part piqued my interest:
        
        This is pretty outdated.
        
        > I have read about a "Hello, world!" material for Sphinx, but I have not found it anywhere. Can you point me to the basic building blocks, given my interest above?
        
        http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4
        
        
        - Travis Banger - 2015-01-15
          
          > This is pretty outdated
          
          That was my guess, when I read "Sun Microsystems". RIP :-(
        
    - Travis Banger - 2015-01-15
      
      The proverbial picture (worth 1K words) is attached:
      
      Widening-or-Narrowing-Gap.jpg
      
      - Nickolay V. Shmyrev - 2015-01-15
        
        That's fun, hopefully you'll get to the right side.
        

Travis Banger - 2015-01-15

> That's fun, hopefully you'll get to the right side.

Nickolay:

You just told me that you Sphinx developers are not interested in Noise Removal from WAV audioclips (at this time, anyway), which happens to be my excuse (my only justification, really) for getting involved with Sphinx.

IOW: The "right" mountain is moving away... :-(

I should mention that my interest is neither commercial nor academic. It simply piqued my curiosity.

BTW: I have been away from Java (my first OO language) for years. Due to the realities of the business world I moved to C++ and currently C#. I am -as we speak (another pun!)- getting the latest version of NetBeans, etc. Still have not decided whether to do Sphinx development under Windows or Linux, though.

Last edit: Travis Banger 2015-01-16
