Large Audio Language Model built for natural interactions
Comprehensive Gradio WebUI for audio processing
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Streaming Real-time Audio-Driven Avatar Generation
A Systematic Framework for Interactive World Modeling
State-of-the-art diffusion models for image and audio generation
Desktop piano playable with a PC keyboard, mouse, or MIDI device.
Large language model for livestream sales hosts
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Foundational model for human-like, expressive TTS
Two integrated text-to-speech engines using MMS and Silero
Software that uses AI to perform real-time voice conversion
A web UI for various audio-related neural networks
A DLNA-compliant UPnP Media Server