Cross-platform software for text translation and recognition
TTS with Kokoro and ONNX Runtime
Bailing is a voice dialogue robot similar to GPT-4o
Generate audiobooks from EPUBs, PDFs and text with captions
End-to-end speech processing toolkit
Automatically translates the text of a video based on a subtitle file
StreamSpeech is a seamless model for offline speech recognition
Oobabooga - The definitive Web UI for local AI, with powerful features
Code for openai.fm, a demo for the OpenAI Speech API
MARS5 speech model (TTS) from CAMB.AI
C++ inference library for multiple SVC/TTS
Official PyTorch Implementation
High-quality multi-lingual text-to-speech library by MyShell.ai
Easy-to-use Speech Toolkit including Self-Supervised Learning model
Multi-lingual large voice generation model, providing inference
A high-quality rapid TTS voice cloning model
The official Python SDK for the ElevenLabs API
Open source text-to-speech tool, supports extra-long text
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Video translation and dubbing tool powered by LLMs
Controllable and fast Text-to-Speech for over 7000 languages
Towards Human-Level Text-to-Speech through Style Diffusion
A TTS model capable of generating ultra-realistic dialogue
Multi-Voice and Prompt-Controlled TTS Engine
Workflow and speech recognition app