Build Vision Agents quickly with any model or video provider
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Scalable generative AI framework built for researchers and developers
Bailing is a voice dialogue robot similar to GPT-4o
One-click deployment (including offline integration package)
SOTA discrete acoustic codec models with 40/75 tokens per second
Automatically translates the text of a video based on a subtitle file
Framework for building neural networks
Virtual AI anchor that combines state-of-the-art technology
Mice speech to text with MX Cinnamon OS ISO
Towards Human-Level Text-to-Speech through Style Diffusion
VITS2 backbone with multilingual-bert
Multi-Voice and Prompt-Controlled TTS Engine
Best practice TTS based on BERT and VITS
A web UI for different audio-related neural networks
Chinese voice dialogue robot/smart speaker project
Singing Voice Synthesis via Shallow Diffusion Mechanism
Clone a voice in 5 seconds to generate arbitrary speech in real-time
General Speech Restoration
Real-Time State-of-the-art Speech Synthesis for TensorFlow 2
Conditional Variational Autoencoder with Adversarial Learning
Implementation of a Transformer-based neural network
Generative Adversarial Networks for Efficient and High Fidelity Speech
The open-source virtual assistant for Ubuntu-based Linux distributions