vad free download - SourceForge

Showing 13 open source projects for "vad"

1

Bailing

Bailing is a voice dialogue robot similar to GPT-4o

Bailing is an open-source voice-dialogue assistant designed to deliver natural voice-based conversations by combining automatic speech recognition (ASR), voice activity detection (VAD), a large language model (LLM), and text-to-speech (TTS) in a single pipeline. Its goal is to offer a “voice-first” chat experience similar to what one might expect from a system like GPT-4o, but fully open and deployable by users. The project is modular: each core function — ASR, VAD, LLM, TTS — exists as a separately replaceable component, which allows flexibility in picking your preferred models depending on resources or languages. ...
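The modular design described above can be illustrated with a minimal sketch. Every name here (`VoicePipeline`, `energy_vad`, `fake_asr`, `echo_llm`, `fake_tts`) is hypothetical and chosen for illustration; none of it is taken from Bailing's code. It only shows how four replaceable stages might be wired together.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Audio = List[float]  # hypothetical stand-in for a chunk of PCM samples


@dataclass
class VoicePipeline:
    """Hypothetical container: each stage is a plain callable that can be swapped."""
    vad: Callable[[Audio], bool]   # True if the chunk contains speech
    asr: Callable[[Audio], str]    # audio -> transcript
    llm: Callable[[str], str]      # transcript -> reply text
    tts: Callable[[str], Audio]    # reply text -> synthesized audio

    def handle(self, chunk: Audio) -> Optional[Audio]:
        """Run one chunk through VAD -> ASR -> LLM -> TTS; skip non-speech."""
        if not self.vad(chunk):
            return None
        transcript = self.asr(chunk)
        reply = self.llm(transcript)
        return self.tts(reply)


# Stub stages (assumptions for the demo, not real models).
def energy_vad(chunk: Audio) -> bool:
    return any(abs(s) > 0.1 for s in chunk)

def fake_asr(chunk: Audio) -> str:
    return "hello"

def echo_llm(text: str) -> str:
    return f"you said: {text}"

def fake_tts(text: str) -> Audio:
    return [0.0] * len(text)  # one silent sample per character


pipeline = VoicePipeline(vad=energy_vad, asr=fake_asr, llm=echo_llm, tts=fake_tts)
```

Because each stage is just a callable, replacing one component (for example a different ASR model) would not require touching the others, which is the flexibility the description refers to.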

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
2

RealtimeSTT

A robust, efficient, low-latency speech-to-text library

RealtimeSTT is a Python-based realtime speech-to-text engine emphasizing low latency, wake-word detection, voice activity detection, and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, suitable for building voice assistants, meeting transcribers, or live caption systems.

Downloads: 1 This Week

Last Update: 2026-05-31
See Project
3

WhisperJAV

Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces...

Downloads: 25 This Week

Last Update: 2026-05-11
See Project
4

Handy STT

A free, open source, and extensible speech-to-text application

...Its backend leverages OpenAI’s Whisper models for GPU-accelerated speech recognition and Parakeet V3 for efficient CPU-only transcription with automatic language detection. To further refine accuracy and responsiveness, Handy integrates Silero’s Voice Activity Detection (VAD) for silence filtering, ensuring only speech segments are processed.
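As a rough illustration of silence filtering, the sketch below keeps only the sample ranges whose energy exceeds a threshold. It uses a simple RMS energy check as a stand-in, not Silero's neural VAD, and the function names, frame length, and threshold are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one non-empty frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_segments(
    samples: Sequence[float], frame_len: int = 4, threshold: float = 0.1
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges whose frames meet the RMS threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i in range(0, len(samples), frame_len):
        is_speech = frame_rms(samples[i:i + frame_len]) >= threshold
        if is_speech and start is None:
            start = i                      # a speech run begins
        elif not is_speech and start is not None:
            segments.append((start, i))    # the speech run ends
            start = None
    if start is not None:                  # audio ended while still in speech
        segments.append((start, len(samples)))
    return segments


# Toy signal: silence, a louder burst, silence.
signal = [0.0] * 4 + [0.5] * 8 + [0.0] * 4
segments = speech_segments(signal)
```

Only the returned ranges would then be passed on to the recognizer, so silent stretches are never transcribed.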

Downloads: 17 This Week

Last Update: 3 days ago
See Project
5

WhisperJAV

A subtitle generator for Japanese Adult Videos.

Transformer-based ASR architectures like Whisper suffer significant performance degradation when applied to the spontaneous and noisy domain of JAV. This degradation is driven by specific acoustic and temporal characteristics that defy the statistical distributions of standard training data.

1 Review

Downloads: 105 This Week

Last Update: 5 days ago
See Project
6

Amica

Amica is an open source interface for interactive communication

...Under the hood, Amica leverages modern web and desktop technologies: three.js and three-vrm for 3D rendering, Transformers.js for running models in the browser, Whisper and Silero VAD for speech recognition and voice-activity detection, and a variety of LLM backends such as llama.cpp servers, ChatGPT-compatible APIs, Ollama, KoboldCpp, and others. It also integrates multiple text-to-speech providers, including ElevenLabs, OpenAI, Coqui, RVC, and AllTalkTTS.

Downloads: 16 This Week

Last Update: 2025-11-30
See Project
7

OpenAppTarifas

Calculation of typical tariffs for the electricity distribution sector

OpenAppTarifas is an application that, in this first beta version, calculates 2 (two) typical tariffs for the electricity distribution sector: a T1G tariff with the VAD (Valor Agregado de Distribución, the distribution value added) "energized" (energizado) and the corresponding linearization segments calculated, yielding a fixed charge and a variable charge; and a T2MD tariff with the VAD assigned to a fixed (commercial) charge and a power charge, leaving only the pass-through purchase of energy from the Mercado Eléctrico Mayorista (MEM, the wholesale electricity market) in a variable charge. ...

Downloads: 0 This Week

Last Update: 2023-09-14
See Project
8

VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM

This repository is a voice activity detection (VAD) toolkit that implements multiple models (DNN, bDNN, LSTM, ACAM) for detecting speech versus non-speech in audio. It also provides a dataset recorded in varied real-world settings (e.g. bus stop, construction site, park, room) with ground-truth labels, acoustic feature extraction (multi-resolution cochleagram, MRCG), and post-processing modules (e.g. smoothing, thresholds).
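To make the post-processing step concrete, here is a minimal sketch of thresholding followed by smoothing on per-frame speech probabilities. It is not the toolkit's own code: the function names, the 0.5 threshold, and the majority-vote window are assumptions chosen for illustration.

```python
from typing import List, Sequence


def apply_threshold(probs: Sequence[float], thr: float = 0.5) -> List[int]:
    """Turn per-frame speech probabilities into hard 0/1 labels."""
    return [1 if p >= thr else 0 for p in probs]


def majority_smooth(labels: Sequence[int], window: int = 3) -> List[int]:
    """Relabel each frame by a strict majority vote over its neighbourhood.

    Windows are truncated at the edges; ties resolve to 0 (non-speech).
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(1 if 2 * sum(neighbourhood) > len(neighbourhood) else 0)
    return smoothed


# Hypothetical model output for seven frames.
probs = [0.1, 0.9, 0.2, 0.8, 0.9, 0.7, 0.1]
raw = apply_threshold(probs)
clean = majority_smooth(raw)
```

In this toy input the isolated speech frame at index 1 is suppressed and the one-frame gap at index 2 is filled, which is the kind of flicker that smoothing is meant to remove.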

Downloads: 2 This Week

Last Update: 2025-10-02
See Project
9

Live Transcribe Speech Engine

Live Transcribe is an Android application

Live Transcribe Speech Engine provides on-device speech recognition components that power real-time transcription for accessibility and everyday voice-first experiences. Its design prioritizes latency and robustness in noisy, far-field environments, enabling continuous transcription with low delay on mobile hardware. The engine manages audio front-end processing—such as noise suppression and voice activity detection—before feeding audio into compact, accurate acoustic and language models....

Downloads: 0 This Week

Last Update: 2025-10-10
See Project
10

VAD

Downloads: 0 This Week

Last Update: 2016-12-17
See Project
11

MCU Media Server

SIP Video Multiconference Media Server with WebRTC support.

REPOSITORY MOVED TO GITHUB!! https://github.com/medooze/media-server Video Multiconference Media Server with WebRTC support. Provides multiconference and video broadcasting services to any SIP service. Supports VP8, H264, MP4V-ES, H263 and H263P, continuous presence, RTMP flash broadcasting, ad hoc conferences, load balancing and an administrative web interface. JSR309 driver implementation under development.

16 Reviews

Downloads: 2 This Week

Last Update: 2017-05-25
See Project
12

VAD Tools

The VAD tools are a set of scripts for working with Virtual Address Descriptor structures in dumps of Windows physical memory to provide detailed information about a process's memory allocations to a forensic investigator.

Downloads: 0 This Week

Last Update: 2013-04-24
See Project
13

Storm Tracker OS X

A program for displaying the more esoteric NEXRAD products on a Macintosh running OS X 10.4 (Tiger) or later. TVS, VAD Winds, and Hail probability are some of the products this program will display. Others will be available in later releases.

Downloads: 0 This Week

Last Update: 2013-03-20
See Project