Showing 12 open source projects for "speaker detection"

  • 1
    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 204 This Week
  • 2
    WhisperX

    Automatic Speech Recognition with Word-level Timestamps

    WhisperX is an advanced speech recognition system built on top of OpenAI’s Whisper model, designed to improve transcription accuracy and timing precision for long-form audio. It addresses key limitations of standard Whisper implementations by introducing voice activity detection and forced alignment techniques to produce word-level timestamps. The system enables batched inference, significantly increasing transcription speed while maintaining high accuracy. It is particularly effective for long recordings, where traditional approaches may suffer from drift, repetition, or misalignment. WhisperX also supports speaker diarization, allowing identification of different speakers within a conversation. ...
    Downloads: 19 This Week
  • 3
    Note67

    A private, local meeting notes assistant

    ...Built with a cross-platform architecture using Rust (via Tauri) for backend logic and a TypeScript/React frontend, it prioritizes privacy by performing audio transcription locally with Whisper models and generating summaries with locally-hosted AI, eliminating the need to send sensitive meeting content to external servers. Users can record meetings directly from their microphone, view live transcriptions, filter by speaker, and export structured summaries, making it useful for professionals who need searchable, organized records of discussions. It also features thoughtful signal processing such as voice activity detection and echo deduplication to improve transcription accuracy, and provides standard note-taking features.
    Downloads: 7 This Week
  • 4
    Whisper-WebUI

    A web UI for easy subtitle generation using the Whisper model

    ...It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.
    Downloads: 3 This Week
  • 5
    Glint Translator
    ...It supports 240+ languages using DeepL, Google, OpenAI, Azure, and Google Gemini models. The interface is available in 18 languages. Features • 3 Translation Modes: Fluent (parallel), Area (overlay), Full Screen (smart detection) • Speaker detection with color-coding • Glint AI custom terminology control • Game-based profile system • Advanced settings with 50+ parameters for fine-tuned control • Share and import custom profiles (.glint) between users • Low CPU/RAM usage, optimized for Windows 10/11 Live Subtitle (Real-Time Voice Translation) Real-time speech-to-text translation for games, movies, and voice chats. ...
    Downloads: 21 This Week
  • 6
    footswitch2

    Audio transcription software for Linux (VLC) with a foot pedal

    Footswitch 2 is a media player for transcribers on Linux. Written in Python using the Python bindings for VLC, it lets a transcriber control audio or video with a USB foot pedal, and includes a set of macros that integrate into LibreOffice. This lets the transcriber control the media player from within LibreOffice as well, making it useful for those who do not yet own a foot pedal. Control of the media player from LibreOffice can be via hotkeys or an integrated...
    Downloads: 18 This Week
  • 7
    wukong-robot

    Chinese voice dialogue robot/smart speaker project

    wukong-robot is a Chinese voice assistant / smart speaker project built to let makers and hackers design highly customizable voice-controlled devices. It combines wake-word detection, automatic speech recognition, natural language understanding, and text-to-speech into a single framework aimed at the Chinese-speaking ecosystem. The project is positioned as a simple, flexible, and elegant platform that can run on devices like Raspberry Pi and other Linux-based boards, making it suitable for DIY smart speakers and home-automation hubs. ...
    Downloads: 1 This Week
  • 8
    footswitch3

    Audio transcription software for Linux (GStreamer) with a foot pedal

    Footswitch 3 is a media player for transcribers on Linux. Written in Python using the Python bindings for GStreamer, it lets a transcriber control audio or video with a foot pedal, and includes a set of macros that integrate into LibreOffice. This lets the transcriber control the media player from within LibreOffice as well, making it useful for those who do not yet own a foot pedal. Control of the media player from LibreOffice can be via hotkeys or an integrated...
    Downloads: 2 This Week
  • 9
    rims-arduino-library

    Recirculation infusion mash system library for Arduino

    This library implements RIMS controls for home brewers (for a definition of RIMS, see https://tinyurl.com/j3lyuyc ). For me, an Arduino microcontroller plus an LCD Keypad shield was cheaper and far more customizable than a commercial PID controller, so with this library a commercial PID controller is unnecessary. An automatic PID tuning toolkit is also included. Temperature can be read with a thermistor, a resistance temperature detector (RTD), or any custom temperature probe. Heater is...
    Downloads: 0 This Week
  • 10
    Automatic Volume Mixer

    A tool for automating the Windows Volume Mixer.

    Automatic Volume Mixer is a tool that automates the Windows Volume Mixer based on user-defined rules. You can open the Volume Mixer by right-clicking the speaker icon in the system tray and selecting Open Volume Mixer. This application is an automated version of that applet. Common usage examples: pausing your audio player (e.g. foobar2000) whenever any other application makes a noise, and resuming playback once the noise is gone. This enables you to keep your audio player...
    Downloads: 0 This Week
  • 11

    Bero iOS Open Source Control App

    iOS app development for Bero

    This open source project is about controlling the Bero (Be The Robot) device from an iOS device. Bero is a five-motor humanoid robot that also comes with an SD card, a speaker, and infrared detection, so it has a lot of potential for developers to explore. We are making the app open source so that developers can use and customize their own Bero app to make it more impressive!
    Downloads: 0 This Week
  • 12
    DefendLineII
    A powerful, multifunctional, reliable, expandable, and extremely flexible hardware platform based on the Atmel ATmega1280, for home and industrial process automation, robotic toys, security systems, education, and enjoyment.
    Downloads: 0 This Week