speaker detection free download

Showing 12 open source projects for "speaker detection"

1

sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.

Downloads: 235 This Week

Last Update: 14 hours ago
2

WhisperX

Automatic Speech Recognition with Word-level Timestamps

WhisperX is an advanced speech recognition system built on top of OpenAI’s Whisper model, designed to improve transcription accuracy and timing precision for long-form audio. It addresses key limitations of standard Whisper implementations by introducing voice activity detection and forced alignment techniques to produce word-level timestamps. The system enables batched inference, significantly increasing transcription speed while maintaining high accuracy. It is particularly effective for long recordings, where traditional approaches may suffer from drift, repetition, or misalignment. WhisperX also supports speaker diarization, allowing identification of different speakers within a conversation. ...

Downloads: 23 This Week

Last Update: 2026-04-06
3

Note67

A private, local meeting notes assistant

...Built with a cross-platform architecture using Rust (via Tauri) for backend logic and a TypeScript/React frontend, it prioritizes privacy by performing audio transcription locally with Whisper models and generating summaries with locally-hosted AI, eliminating the need to send sensitive meeting content to external servers. Users can record meetings directly from their microphone, view live transcriptions, filter by speaker, and export structured summaries, making it useful for professionals who need searchable, organized records of discussions. It also features thoughtful signal processing such as voice activity detection and echo deduplication to improve transcription accuracy, and provides standard note-taking features.

Downloads: 7 This Week

Last Update: 2026-04-18
4

Whisper-WebUI

A web UI for easy subtitle generation using the Whisper model

...It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.

Downloads: 3 This Week

Last Update: 2026-03-18
5

Glint Translator

...It supports 240+ languages using DeepL, Google, OpenAI, Azure, and Google Gemini models. The interface is available in 18 languages.

Features
• 3 Translation Modes: Fluent (parallel), Area (overlay), Full Screen (smart detection)
• Speaker detection with color-coding
• Glint AI custom terminology control
• Game-based profile system
• Advanced settings with 50+ parameters for fine-tuned control
• Share and import custom profiles (.glint) between users
• Low CPU/RAM usage, optimized for Windows 10/11

Live Subtitle (Real-Time Voice Translation)
Real-time speech-to-text translation for games, movies, and voice chats. ...

1 Review

Downloads: 20 This Week

Last Update: 1 day ago
6

footswitch2

Audio transcription software for Linux (VLC) with a foot pedal

Footswitch 2 is a media player for transcribers on Linux. Written in Python using the Python bindings for VLC, it allows a transcriber to control the audio or video with a USB foot pedal, and includes a set of macros that integrate into LibreOffice. This allows the transcriber to control the media player from within LibreOffice as well, making it useful for those who do not yet own a foot pedal/foot switch. Control of the media player from LibreOffice can be via hotkeys or an integrated...

Downloads: 8 This Week

Last Update: 2026-04-09
7

wukong-robot

Chinese voice dialogue robot/smart speaker project

wukong-robot is a Chinese voice assistant / smart speaker project built to let makers and hackers design highly customizable voice-controlled devices. It combines wake-word detection, automatic speech recognition, natural language understanding, and text-to-speech into a single framework aimed at the Chinese-speaking ecosystem. The project is positioned as a simple, flexible, and elegant platform that can run on devices like Raspberry Pi and other Linux-based boards, making it suitable for DIY smart speakers and home-automation hubs. ...

Downloads: 1 This Week

Last Update: 2025-11-28
8

footswitch3

Audio transcription software for Linux (GStreamer) with a foot pedal

Footswitch 3 is a media player for transcribers on Linux. Written in Python using the Python bindings for GStreamer, it allows a transcriber to control the audio or video with a foot pedal, and includes a set of macros that integrate into LibreOffice. This allows the transcriber to control the media player from within LibreOffice as well, making it useful for those who do not yet own a foot pedal/foot switch. Control of the media player from LibreOffice can be via hotkeys or an integrated...

1 Review

Downloads: 2 This Week

Last Update: 2023-04-02
9

rims-arduino-library

Recirculation infusion mash system library for Arduino

This library implements RIMS controls for home brewers. For a definition of a RIMS, see https://tinyurl.com/j3lyuyc. For me, an Arduino microcontroller plus an LCD Keypad shield was cheaper and a lot more customizable than a commercial PID controller. So, with this library, a commercial PID controller is unnecessary. An automatic PID tuning toolkit is also included. Temperature can be read with a thermistor, a resistance temperature detector (RTD), or any custom temperature probe. Heater is...

Downloads: 0 This Week

Last Update: 2019-07-17
10

Automatic Volume Mixer

A tool for automating the Windows Volume Mixer.

Automatic Volume Mixer is a tool that automates the Windows Volume Mixer based on user-defined rules. You can open the Volume Mixer by right-clicking on the speaker icon in the system tray and selecting Open Volume Mixer. This application is an automatic version of that applet. A common usage example: pausing your audio player (e.g. foobar2000) whenever any other application makes a noise, and resuming playback once the noise is gone. This enables you to keep your audio player...

3 Reviews

Downloads: 0 This Week

Last Update: 2018-04-01
11

Bero iOS Open Source Control App

iOS app development for Bero

This open source project is about controlling the Bero (Be The Robot) device using an iOS device. Given that Bero is a five-motor humanoid robot that also comes with an SD card, a speaker, and infrared detection, it has a lot of potential for developers to explore. We are now making the app open source so that developers can use and customize their own Bero app to make it more impressive!

Downloads: 0 This Week

Last Update: 2013-08-21
12

DefendLineII

A powerful, multifunctional, reliable, expandable, and extremely flexible hardware platform based on the Atmel ATmega1280, for home and industrial process automation, robotic toys, security systems, education, and enjoyment.

Downloads: 0 This Week

Last Update: 2014-04-03