Bailing is a voice dialogue robot similar to GPT-4o
Build Vision Agents quickly with any model or video provider
Toolkit for audio, music, and speech generation
One-click deployment (including offline integration package)
Framework for building neural networks
A single Gradio + React WebUI with extensions for ACE-Step
SOTA discrete acoustic codec models with 40/75 tokens per second
Virtual AI anchor that combines state-of-the-art technology
Unofficial Parallel WaveGAN
A Conversational Speech Generation Model
Chinese text-to-speech engine
A WebUI for various audio-related neural networks
Singing Voice Synthesis via Shallow Diffusion Mechanism
WaveRNN Vocoder + TTS
Real-Time State-of-the-art Speech Synthesis for TensorFlow 2
Conditional Variational Autoencoder with Adversarial Learning
Implementation of a Transformer-based neural network
Generative Adversarial Networks for Efficient and High Fidelity Speech
DeepMind's Tacotron-2 TensorFlow implementation
Toolkit for efficient experimentation with Speech Recognition
TensorFlow Implementation of DC-TTS: yet another text-to-speech model