SDK for building interactive UI components over MCP for AI tools
Build cross-modal and multimodal applications on the cloud
SPPAS - the automatic annotation and analysis of speech
VITS2 backbone with multilingual-bert
Multi-Voice and Prompt-Controlled TTS Engine
AIlice is a fully autonomous, general-purpose AI agent
A deep learning toolkit for Text-to-Speech, battle-tested in research
Open source implementation of Microsoft's VALL-E X zero-shot TTS model
Best practice TTS based on BERT and VITS
Transformers4Rec is a flexible and efficient library
Unofficial Parallel WaveGAN
Framework that is dedicated to making neural data processing
SoftVC VITS Singing Voice Conversion
Chinese voice dialogue robot/smart speaker project
A web UI for different audio-related neural networks
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
Automatic video transcription and translated subtitle generator
A simple client for doccano API
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Contextually-keyed word vectors
NLP, before and after spaCy
A walk along memory lane