CMU Sphinx / Forums / Help: Voice Activity Detection (VAD)

Alonso - 2016-08-18

I'm looking for a program that can detect voice in my recordings. I have hundreds of recordings in different formats, and in most of them the only valuable part is the voice part. However, many of them are hours long and only contain a few minutes, or even seconds, of voice.

Until now I have been using an audio editor to detect the voice manually. I open the file in a spectrogram view and look for voice patterns by eye. This method works, but it's very time-consuming. I'm looking for software that does it automatically and marks the voice parts or something similar.

The closest thing I've found so far is the Sphinx family of systems, but I'm not sure if they include a program that does what I'm looking for. Can you give me some feedback on this?

- Nickolay V. Shmyrev - 2016-08-18
  
  You can use https://github.com/wiseman/py-webrtcvad
  

Alonso - 2016-09-20

Sorry for the late reply. Can you tell me a bit more? I just checked webrtc.org and I didn't find anything about VAD.

- Nickolay V. Shmyrev - 2016-09-20
  
  > Can you tell me a bit more?
  
  Sure, if you ask more specific questions.
  
  > I just checked webrtc.org and I didn't find anything about VAD.
  
  The link above did not point to webrtc.org; it's a separate GitHub project.
  

Alonso - 2016-11-16

Hi again. Once again it's been a long time since my last post. I just don't know how to set up my account to send me a notification when there is a reply. For the time being I'll just create a reminder in my calendar.

My specific question is: How can I use the link you provided to achieve the results I'm looking for? As I said, I'm looking for a program (or any form of software) which allows me to detect voice in my recordings. Ideally, the program would take a recording as input and produce as output an audio file which contains only the parts of the original recording that have voice activity. However, anything that allows me to detect the voice activity automatically will do.

- Nickolay V. Shmyrev - 2016-11-16
  
  > How can I use the link you provided to achieve the results I'm looking for? As I said, I'm looking for a program (or any form of software) which allows me to detect voice in my recordings.
  
  The link points to a program that allows you to detect voice in your recordings.
  
  > Ideally, the program would take as an input a recording and provide as an output an audio file which contains only the parts of the original recording which have voice activity.
  
  https://github.com/wiseman/py-webrtcvad/blob/master/example.py does exactly that. You run
  
  python example.py 1 file.wav
  
  and it creates chunked WAV files containing the voiced parts. You can modify it to suit your further needs.
  
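
For readers who want to see what goes on inside a script like example.py, here is a minimal Python sketch of the general idea. It is not the actual example.py code. It assumes the input is already a 16-bit mono WAV file at 8000, 16000, 32000 or 48000 Hz, which is what webrtcvad accepts. The function names voice_flags and merge_voiced and the max_gap_frames parameter are made up for illustration; only webrtcvad.Vad and its is_speech method come from the library. In the command above, the 1 is the aggressiveness setting (0 to 3).

```python
import wave


def voice_flags(path, aggressiveness=1, frame_ms=30):
    """Return one True/False flag per frame: True where the VAD hears speech."""
    import webrtcvad  # third-party package; must be installed separately

    # 0 is the least aggressive about filtering out non-speech, 3 the most.
    vad = webrtcvad.Vad(aggressiveness)
    flags = []
    with wave.open(path, "rb") as wf:
        # Assumption: the file is already 16-bit mono PCM at a supported rate.
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        rate = wf.getframerate()
        if rate not in (8000, 16000, 32000, 48000):
            raise ValueError("unsupported sample rate")
        samples = rate * frame_ms // 1000  # samples per frame (10, 20 or 30 ms)
        while True:
            pcm = wf.readframes(samples)
            if len(pcm) < samples * 2:  # drop the incomplete last frame
                break
            flags.append(vad.is_speech(pcm, rate))
    return flags


def merge_voiced(flags, frame_ms=30, max_gap_frames=10):
    """Turn per-frame flags into (start_ms, end_ms) voiced segments.

    Runs of up to max_gap_frames unvoiced frames inside speech are bridged,
    so short pauses do not split one utterance into many pieces.
    """
    segments = []
    start = None
    last_voiced = None
    for i, voiced in enumerate(flags):
        if voiced:
            if start is None:
                start = i
            last_voiced = i
        elif start is not None and i - last_voiced > max_gap_frames:
            segments.append((start * frame_ms, (last_voiced + 1) * frame_ms))
            start = None
    if start is not None:
        segments.append((start * frame_ms, (last_voiced + 1) * frame_ms))
    return segments
```

Calling merge_voiced(voice_flags("file.wav")) would give a list of (start, end) times in milliseconds, which could then be located in an audio editor or used to cut the file. Recordings in other formats or sample rates would need converting first.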

Alonso - 2016-12-05

Obviously that doesn't work. I wonder if you're assuming I'm a Python programmer. I am not. When I run that command on my system (Windows) it just says it doesn't know what python is. Anyway, I installed Python, copied that code, saved it to a file, ran the command you gave and, unsurprisingly, it didn't work either (ImportError: No module named 'webrtcvad'). So, my specific question is: How do I make that program work?

Note: Needless to say, the instructions in https://github.com/wiseman/py-webrtcvad don't work on my system.

- Nickolay V. Shmyrev - 2016-12-05
  
  OK, maybe Audacity will work better for you then:
  
  http://manual.audacityteam.org/man/truncate_silence.html
  
  You can download it here:
  
  http://www.audacityteam.org/download/
  

Alonso - 2016-12-07

Audacity doesn't do VAD. I already discussed it with an Audacity developer: http://forum.audacityteam.org/viewtopic.php?f=21&t=10485&start=10. Nor does Adobe Audition. Truncate Silence doesn't work for me because the noise-to-signal ratio in my recordings is too high.

What I need are instructions on how to run the VAD program that I can understand and follow. I am a former programmer, so I can figure out some stuff for myself (like I figured out how to run Python code). However, the instructions in https://github.com/wiseman/py-webrtcvad are cryptic to me.

When I say those instructions don't work on my system, what I mean is, for instance, that when I run pip install webrtcvad in my command prompt I just receive an error message because Windows doesn't recognize the pip command. A bit of googling suggests that those instructions are not system commands but Python code (meant to be written in a Python program I create, I guess). Is that correct? If so, how do I go about creating such a program and making it work (hopefully in Windows), keeping in mind that I've never written a line of Python?

- Nickolay V. Shmyrev - 2016-12-07
  
  http://stackoverflow.com/questions/4750806/how-do-i-install-pip-on-windows
  
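
On the question of whether pip install webrtcvad is Python code: it is not. It is a command typed at the command prompt. When Windows does not recognize pip on its own, the same tool can usually be run through the interpreter as python -m pip install webrtcvad. What does belong in a Python file is a check like the sketch below, which reports whether Python can find the module afterwards. The helper name is_installed is made up for illustration; importlib.util.find_spec is part of the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once the package has been installed for this Python interpreter.
print("webrtcvad installed:", is_installed("webrtcvad"))
```

Saving this to a file and running it with python shows whether the install succeeded for the same interpreter that will run example.py. If it prints False, running example.py would fail with the ImportError quoted earlier in the thread.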

OK, I ran python -m pip install webrtcvad in the Windows shell and I got this error:

Installing collected packages: webrtcvad
  Running setup.py install for webrtcvad ... error
    Complete output from command C:\Users\Alonso\AppData\Local\Programs\Python\Python35\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\Alonso\\AppData\\Local\\Temp\\pip-build-s0ki5qwn\\webrtcvad\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\Alonso\AppData\Local\Temp\pip-cp8a659d-record\install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build\lib.win-amd64-3.5
    copying webrtcvad.py -> build\lib.win-amd64-3.5
    running build_ext
    building '_webrtcvad' extension
    error: Unable to find vcvarsall.bat

    ----------------------------------------
Command "C:\Users\Alonso\AppData\Local\Programs\Python\Python35\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\Alonso\\AppData\\Local\\Temp\\pip-build-s0ki5qwn\\webrtcvad\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\Alonso\AppData\Local\Temp\pip-cp8a659d-record\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\Alonso\AppData\Local\Temp\pip-build-s0ki5qwn\webrtcvad\

Any ideas?

Nickolay V. Shmyrev - 2016-12-07

This problem is covered in the link above; you just need to read it to the end:

http://stackoverflow.com/a/12476379/432021


Actually, it is not covered in that link. As you can see, I'm using Python 3.5. That link covers the problem for versions up to 3.3, and the solutions it offers don't work with 3.5. You know, I've worked as a programmer for several years, so I know that most programmers have very weak communication skills. Still, I would ask you to make an effort to communicate if you want to help me with this.

Anyway, I found a page that seems to address the issue for all versions: https://blogs.msdn.microsoft.com/pythonengineering/2016/04/11/unable-to-find-vcvarsall-bat/. Following its advice, I installed Visual C++ Build Tools 2015, uninstalled the version of setuptools that came with my Python installation (v20), and installed the latest version of setuptools (v30). Now I get this error message:

Traceback (most recent call last):
  File "C:\Users\Alonso\AppData\Local\Programs\Python\Python35\lib\runpy.py", line 174, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "C:\Users\Alonso\AppData\Local\Programs\Python\Python35\lib\runpy.py", line 133, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "C:\Users\Alonso\AppData\Local\Programs\Python\Python35\lib\runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "C:\Users\Alonso\AppData\Local\Programs\Python\Python35\lib\site-packages\pip\__init__.py", line 28, in <module>
    from pip.vcs import git, mercurial, subversion, bazaar  # noqa
  File "C:\Users\Alonso\AppData\Local\Programs\Python\Python35\lib\site-packages\pip\vcs\mercurial.py", line 9, in <module>
    from pip.download import path_to_url
  File "C:\Users\Alonso\AppData\Local\Programs\Python\Python35\lib\site-packages\pip\download.py", line 35, in <module>
    from pip.utils.setuptools_build import SETUPTOOLS_SHIM
ImportError: No module named 'pip.utils.setuptools_build'

John Wiseman - 2016-12-16

Alonso, I think if you try it without installing the latest version of setuptools it should work.


Alonso - 2017-02-25

I left this project for a long time and picked it up again today. It works now. Thank you, John!

