Curated list of classic, high-quality computer science books
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Long-form streaming TTS system for multi-speaker dialogue generation
Interface for OuteTTS models
Open-source multi-speaker long-form text-to-speech model
Clone a voice in 5 seconds to generate arbitrary speech in real time
Generate audiobooks from e-books, with voice cloning and support for 1107+ languages
Generate audiobooks from EPUBs, PDFs and text with captions
Official PyTorch Implementation
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
A Web UI for easy subtitle generation using the Whisper model
An open-source implementation of NotebookLM with more flexibility
A PyTorch-based Speech Toolkit
One-click deployment (including offline integration package)
A generative speech model for daily dialogue
Synchronized Translation for Videos
Multi-modal large language model designed for audio understanding
Towards Human-Level Text-to-Speech through Style Diffusion
Instant voice cloning by MIT and MyShell. Audio foundation model
Spark-TTS Inference Code
Foundational model for human-like, expressive TTS
End-to-end speech processing toolkit
MARS5 speech model (TTS) from CAMB.AI
A deep learning toolkit for Text-to-Speech, battle-tested in research
Best practice TTS based on BERT and VITS