VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM, and ACAM models
This repository is a voice activity detection (VAD) toolkit that implements several models (DNN, bDNN, LSTM, ACAM) for distinguishing speech from non-speech in audio. It also provides a dataset recorded in varied real-world settings (e.g., bus stop, construction site, park, room) with ground-truth labels. The toolkit has both MATLAB and Python/TensorFlow components for feature extraction, classification, and postprocessing. Acoustic feature extraction (multi-resolution cochleagram, MRCG...
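
To make the classification and postprocessing stages more concrete, here is a minimal sketch of how per-frame speech probabilities might be turned into speech/non-speech labels and scored against ground truth. It is not the toolkit's own code: the function names `smooth_vad` and `frame_accuracy`, the 0.5 threshold, and the hangover scheme are all assumptions chosen for illustration.

```python
from typing import List


def smooth_vad(probs: List[float], threshold: float = 0.5, hangover: int = 2) -> List[int]:
    """Threshold per-frame speech probabilities, then keep the label at
    'speech' for `hangover` extra frames after each speech frame.

    Hypothetical helper for illustration; not part of the toolkit.
    """
    labels = []
    remaining = 0  # hangover frames still available after the last speech frame
    for p in probs:
        if p >= threshold:
            labels.append(1)
            remaining = hangover  # reset the hangover counter on every speech frame
        elif remaining > 0:
            labels.append(1)  # bridge a short dip below the threshold
            remaining -= 1
        else:
            labels.append(0)
    return labels


def frame_accuracy(pred: List[int], truth: List[int]) -> float:
    """Fraction of frames whose predicted label matches the ground-truth label."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and of equal length")
    return sum(int(p == t) for p, t in zip(pred, truth)) / len(truth)


if __name__ == "__main__":
    # Made-up probabilities standing in for the output of any of the models.
    probs = [0.1, 0.9, 0.8, 0.2, 0.7, 0.1, 0.1, 0.1, 0.1]
    print(smooth_vad(probs, threshold=0.5, hangover=1))
```

A hangover of this kind trades a few false positives at the end of each utterance for fewer mid-word dropouts. The right threshold and hangover length depend on the frame rate and the noise conditions, so treat the defaults above as placeholders.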