Open-source framework for intelligent speech interaction
Repo of Qwen2-Audio chat & pretrained large audio language model
Comprehensive Gradio WebUI for audio processing
A sound cloning tool with a web interface, using your voice
Large Audio Language Model built for natural interactions
Chat & pretrained large audio language model proposed by Alibaba Cloud
A text-to-speech, speech-to-text and speech-to-speech library
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Multi-modal large language model designed for audio understanding
Instant voice cloning by MIT and MyShell. Audio foundation model
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Code for openai.fm, a demo for the OpenAI Speech API
The missing YouTube Music macOS app
A native macOS menu bar app for managing audio device priorities
LLM-based reinforcement learning audio editing model
Generate audiobooks from e-books, voice cloning & 1107+ languages
A set of AI-enabled effects, generators, and analyzers for Audacity
PersonaPlex code
Industrial-level controllable zero-shot text-to-speech system
The official Python SDK for the ElevenLabs API
Open-source text-to-speech tool that supports extra-long text
A high-quality rapid TTS voice cloning model
Free, high-quality text-to-speech API endpoint to replace OpenAI
A lightweight text-to-speech model with zero-shot voice cloning
Official PyTorch Implementation