A simple, high-quality voice conversion tool focused on ease of use
A nearly live implementation of OpenAI's Whisper
A simple native web interface that uses ChatTTS to synthesize text
Offline inference engine for art, real-time voice conversations
Generate audiobooks from EPUBs, PDFs and text with captions
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Bailing is a voice dialogue robot similar to GPT-4o
Speech-AI-Forge is a project developed around TTS generation models
Automatically translates the text of a video based on a subtitle file
Controllable and fast Text-to-Speech for over 7000 languages
MARS5 speech model (TTS) from CAMB.AI
A text-to-speech, speech-to-text and speech-to-speech library
Towards Human-Sounding Speech
Free, high-quality text-to-speech API endpoint to replace OpenAI
Virtual AI anchor that combines state-of-the-art technology
VITS2 backbone with multilingual-bert
Multi-Voice and Prompt-Controlled TTS Engine
Open source implementation of Microsoft's VALL-E X zero-shot TTS model
Singing Voice Synthesis via Shallow Diffusion Mechanism
Clone a voice in 5 seconds to generate arbitrary speech in real-time
General Speech Restoration
Implementation of a Transformer-based neural network
Pre-trained and Reproduced Deep Learning Models
Toolkit for efficient experimentation with Speech Recognition
TensorFlow Implementation of DC-TTS: yet another text-to-speech model