Can Voice Recognition be used for audioclip Noise Removal? Has it been?

2015-01-14
2015-01-15
  • Travis Banger

    Travis Banger - 2015-01-14

    First, I would like to direct your attention to my thread "The Human Voice has been widely characterized, correct?" in the comp.dsp Usenet forum.

    TIA.

     
    • Travis Banger

      Travis Banger - 2015-01-14

      This informal post should provide more information about my question:

      http://forum.audacityteam.org/viewtopic.php?f=21&t=83058

      I am wondering whether somebody from the Sphinx initiative could join the "Audacity" group in order to develop a strong Noise Reduction application.

       
      • Nickolay V. Shmyrev

        Dear Travis

        Sorry, it's hard to comment on the discussion without a direct question.

        Overall, there are many algorithms and methods for separating speech from non-speech; the most advanced of them can learn a model of human speech and discriminate effectively. One active area of research is non-negative matrix factorization, which allows you to separate speech from overlapping sounds using a vocabulary of speech atoms. It's a wide area of research.
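
To make the factorization idea concrete, here is a minimal sketch of non-negative matrix factorization using the standard multiplicative updates for squared error. It is not CMUSphinx code and not a noise remover. The class name `NmfSketch` and all of its methods are hypothetical. The assumption is that, in a speech setting, `v` would be a magnitude spectrogram, the columns of `W` would play the role of the "atoms", and `H` would hold their activations over time.

```java
import java.util.Random;

/** Hypothetical illustration: approximates a non-negative V (m x n) as W (m x r) times H (r x n). */
public class NmfSketch {

    /** Returns {W, H}; every entry stays non-negative because updates only multiply by ratios >= 0. */
    public static double[][][] factorize(double[][] v, int rank, int iterations, long seed) {
        int m = v.length, n = v[0].length;
        Random rng = new Random(seed);
        double[][] w = randomPositive(m, rank, rng);
        double[][] h = randomPositive(rank, n, rng);
        final double eps = 1e-9; // guards against division by zero
        for (int it = 0; it < iterations; it++) {
            // H <- H .* (W^T V) ./ (W^T W H)
            double[][] wt = transpose(w);
            scale(h, multiply(wt, v), multiply(multiply(wt, w), h), eps);
            // W <- W .* (V H^T) ./ (W H H^T)
            double[][] ht = transpose(h);
            scale(w, multiply(v, ht), multiply(w, multiply(h, ht)), eps);
        }
        return new double[][][] {w, h};
    }

    /** Frobenius norm of V - W*H, used to judge the quality of the approximation. */
    public static double frobeniusError(double[][] v, double[][] w, double[][] h) {
        double[][] wh = multiply(w, h);
        double sum = 0.0;
        for (int i = 0; i < v.length; i++)
            for (int j = 0; j < v[0].length; j++) {
                double d = v[i][j] - wh[i][j];
                sum += d * d;
            }
        return Math.sqrt(sum);
    }

    // Element-wise x[i][j] *= num[i][j] / (den[i][j] + eps).
    private static void scale(double[][] x, double[][] num, double[][] den, double eps) {
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < x[0].length; j++)
                x[i][j] *= num[i][j] / (den[i][j] + eps);
    }

    private static double[][] multiply(double[][] a, double[][] b) {
        double[][] c = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int k = 0; k < b.length; k++)
                for (int j = 0; j < b[0].length; j++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    private static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    // Strictly positive random start so no entry is stuck at zero.
    private static double[][] randomPositive(int rows, int cols, Random rng) {
        double[][] x = new double[rows][cols];
        for (double[] row : x)
            for (int j = 0; j < cols; j++)
                row[j] = 0.1 + rng.nextDouble();
        return x;
    }
}
```

A separation system would presumably learn one set of atoms from clean speech and another from noise, then rebuild the signal from the speech atoms only; that step is outside this sketch.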

        The algorithms for audio improvement are certainly related to CMUSphinx and could help improve its accuracy, but they are not our primary focus right now, so you should not expect us to implement noise reduction for Audacity.

        If you have questions on speech processing, let us know.

         
        • Travis Banger

          Travis Banger - 2015-01-15

          Thanks for your reply, Nickolay! Trust me, it hasn't been easy to obtain informed answers. Before I make a commitment to further learning and using Sphinx, I would like to ascertain that it will not be a dead end. As an implementer, I am aware that I am taking the risk of being close to the infamous "bleeding edge", but so far, I am very encouraged. The work done by the Sphinx team is nothing short of remarkable.

          I was just reading the paper "The CMU Sphinx-4 Speech Recognition System" by Lamere, Kwok, Gouvea, Raj, Singh, Walker and Wolf. This part piqued my interest:

          "One of the features of this design is that the output of any
          of the blocks can be tapped. Similarly, the actual input to the
          system need not be at the first block, but can be at any of the
          intermediate blocks. [...] In addition, any of the blocks can be
          replaced."

          Question 1:

          If my interest is to start with a WAV file with voice+noise and write Java software that will output a WAV file, with (hopefully!) the voice component only, which parts should I be "tapping"?

          Question 2:

          I have read about a "Hello, world!" material for Sphinx, but I have not found it anywhere. Can you point me to the basic building blocks, given my interest above?

          TIA.
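
For the WAV-in, WAV-out part of Question 1, the file plumbing can be sketched with the standard `javax.sound.sampled` package alone. This does not use Sphinx and does not answer which Sphinx-4 blocks to tap. The class `WavPassThrough` and its `process` method are hypothetical, `process` is a placeholder that returns the samples unchanged, and the sketch assumes 16-bit signed little-endian PCM input.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical skeleton: read a WAV stream, hand the samples to a processing hook, write a WAV stream. */
public class WavPassThrough {

    /** Placeholder for a real noise-reduction step; here it only copies the samples. */
    static short[] process(short[] samples) {
        return samples.clone();
    }

    /** Reads WAV data from 'in', runs process(), and writes the result as WAV to 'out'. */
    public static void copy(InputStream in, OutputStream out)
            throws IOException, UnsupportedAudioFileException {
        // BufferedInputStream supplies the mark/reset support AudioSystem needs.
        try (AudioInputStream ais = AudioSystem.getAudioInputStream(new BufferedInputStream(in))) {
            AudioFormat fmt = requirePcm16(ais.getFormat());
            writeSamples(process(decode(readAll(ais))), fmt, out);
        }
    }

    /** Decodes a 16-bit PCM WAV stream into (interleaved) samples. */
    public static short[] readSamples(InputStream in)
            throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream ais = AudioSystem.getAudioInputStream(new BufferedInputStream(in))) {
            requirePcm16(ais.getFormat());
            return decode(readAll(ais));
        }
    }

    /** Encodes samples as a WAV file with the given format. */
    public static void writeSamples(short[] samples, AudioFormat fmt, OutputStream out)
            throws IOException {
        byte[] raw = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            raw[2 * i] = (byte) samples[i];            // low byte first (little-endian)
            raw[2 * i + 1] = (byte) (samples[i] >> 8); // high byte
        }
        long frames = raw.length / fmt.getFrameSize();
        AudioInputStream ais = new AudioInputStream(new ByteArrayInputStream(raw), fmt, frames);
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, out);
    }

    private static AudioFormat requirePcm16(AudioFormat fmt) throws UnsupportedAudioFileException {
        if (!AudioFormat.Encoding.PCM_SIGNED.equals(fmt.getEncoding())
                || fmt.getSampleSizeInBits() != 16 || fmt.isBigEndian()) {
            throw new UnsupportedAudioFileException("sketch handles 16-bit signed little-endian PCM only");
        }
        return fmt;
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) > 0) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    private static short[] decode(byte[] raw) {
        short[] samples = new short[raw.length / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = (short) ((raw[2 * i + 1] << 8) | (raw[2 * i] & 0xff));
        }
        return samples;
    }
}
```

With files, `copy` could be called with a `FileInputStream` and a `FileOutputStream`; any actual speech/noise separation would replace the body of `process`.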

           
          • Nickolay V. Shmyrev

            I was just reading the paper "The CMU Sphinx-4 Speech Recognition System" by Lamere, Kwok, Gouvea, Raj, Singh, Walker and Wolf. This part piqued my interest:

            This is pretty outdated.

            I have read about a "Hello, world!" material for Sphinx, but I have not found it anywhere. Can you point me to the basic building blocks, given my interest above?

            http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4

             
            • Travis Banger

              Travis Banger - 2015-01-15

              "This is pretty outdated"

              That was my guess, when I read "Sun Microsystems". RIP :-(

               
        • Travis Banger

          Travis Banger - 2015-01-15

          The proverbial picture (worth 1K words) is attached:

           
          • Nickolay V. Shmyrev

            That's fun; hopefully you'll get to the right side.

             
  • Travis Banger

    Travis Banger - 2015-01-15

    "That's fun; hopefully you'll get to the right side."

    Nickolay:

    You just told me that you Sphinx developers are not interested in Noise Removal from WAV audioclips (at the time, anyway), which happens to be my excuse (only justification, really) to get involved with Sphinx.

    IOW: The "right" mountain is moving away... :-(

    I should mention that my interest is neither commercial nor academic. It simply piqued my curiosity.

    BTW: I have been away from Java (my first OO language) for years. Due to the realities of the business world I moved to C++ and currently C#. I am -as we speak (another pun!)- getting the latest version of NetBeans, etc. Still have not decided whether to do Sphinx development under Windows or Linux, though.

     

    Last edit: Travis Banger 2015-01-16
