Industrial-level controllable zero-shot text-to-speech system
Framework for building neural networks
Automatically translates the text of a video based on a subtitle file
TTS with kokoro and onnx runtime
Unofficial Parallel WaveGAN
Generate audiobooks from e-books
End-to-end speech processing toolkit
A TTS model capable of generating ultra-realistic dialogue
A voice cloning tool with a web interface, using your voice
VITS2 backbone with multilingual BERT
A fast TTS architecture with conditional flow matching
SOTA discrete acoustic codec models with 40/75 tokens per second
A nearly-live implementation of OpenAI's Whisper
Official MiniMax Model Context Protocol (MCP) server
Build Vision Agents quickly with any model or video provider
A simple native web interface that uses ChatTTS to synthesize text into speech
A web UI for various audio-related neural networks
WaveRNN Vocoder + TTS
General Speech Restoration
Real-time state-of-the-art speech synthesis for TensorFlow 2
Implementation of a Transformer-based neural network
Conditional Variational Autoencoder with Adversarial Learning
Generative Adversarial Networks for Efficient and High Fidelity Speech
The open-source virtual assistant for Ubuntu-based Linux distributions
DeepMind's Tacotron-2 TensorFlow implementation