Multi-modal large language model designed for audio understanding
Speech-to-text, text-to-speech, and speaker recognition
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Audio Share can share a Windows/Linux computer's audio to an Android phone
Interface for OuteTTS models
Open-source multi-speaker long-form text-to-speech model
Clone a voice in 5 seconds to generate arbitrary speech in real time
macOS System-wide audio equalizer & volume mixer
Automatic Speech Recognition with Word-level Timestamps
A native macOS menu bar app for managing audio device priorities
Self-hosted AI audio transcription
A private, local meeting notes assistant
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
A Web UI for easy subtitle generation using the Whisper model
One-click deployment (including offline integration package)
Official PyTorch Implementation
The HTML Presentation Framework
Control SONOS speakers from your terminal
Open source software for live streaming and recording
An Open Source implementation of Notebook LM with more flexibility
The ioquake3 community effort to continue supporting/developing id's
Synchronized Translation for Videos
Instantly generate AI-powered subtitles on your device
Web presentation editor replicating many PowerPoint features online
Audio Plugin for Audio to MIDI transcription using deep learning