A high-quality rapid TTS voice cloning model
TTS with Kokoro and ONNX Runtime
Capable of understanding text, audio, vision, and video
Synchronized Translation for Videos
Towards Human-Sounding Speech
A TTS model capable of generating ultra-realistic dialogue
High-Quality Voice Cloning TTS for 600+ Languages
Instant voice cloning by MIT and MyShell. Audio foundation model
A simple, high-quality voice conversion tool focused on ease of use
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
State-of-the-art TTS model under 25MB
Open-source model for program synthesis
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Interface for OuteTTS models
A voice cloning tool with a web interface that uses your own voice
Inference code for CodeLlama models
Large language model for a live-stream sales host
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Large Audio Language Model built for natural interactions
A simple native web interface that uses ChatTTS to synthesize text
Repo of Qwen2-Audio chat & pretrained large audio language model
High-quality multilingual text-to-speech library by MyShell.ai
From Images to High-Fidelity 3D Assets
Multi-modal large language model designed for audio understanding
Framework for building neural networks