Open-source framework for intelligent speech interaction
A text-to-speech, speech-to-text and speech-to-speech library
Oobabooga - The definitive Web UI for local AI, with powerful features
Multi-modal large language model designed for audio understanding
Official Python inference and LoRA trainer package
Large Audio Language Model built for natural interactions
Stable Diffusion for real-time music generation (web app)
Transforming Multimodal Content into Captivating Multilingual Audio
Tokenizer-Free TTS for Multilingual Speech Generation
The open-source voice synthesis studio powered by Qwen3-TTS
Audiocraft is a library for audio processing and generation
A Family of Open-Source Music Foundation Models
Streaming Real-time Audio-Driven Avatar Generation
AI video generator optimized for low VRAM and older GPUs
Implementation of the AudioLM audio generation model in PyTorch
Taming Stable Diffusion for Lip Sync
Create music with JavaScript
Multimodal Diffusion with Representation Alignment
R Package for Music Score and Audio Generation
HunyuanVideo: A Systematic Framework For Large Video Generation Model
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Generate music based on natural language prompts using LLMs
Open source AI model for generating full songs from lyrics prompts
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
The official Go library for the OpenAI API