Open-source multi-speaker long-form text-to-speech model
A Python library for audio
AI video generator optimized for low-VRAM and older GPUs
Unified web UI for training and running open models locally
48 kHz stereo neural audio codec for general audio
Edit videos with Claude Code
An open-source implementation of NotebookLM with more flexibility
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Oobabooga - The definitive Web UI for local AI, with powerful features
Data Infrastructure providing an approach to multimodal AI workloads
Fast multimodal LLM for real-time voice interaction and AI apps
Python inference and LoRA trainer package for the LTX-2 audio–video
Open-source AI model for generating full songs from lyric prompts
High-resolution models for human tasks
Private AI platform for agents, enterprise search and RAG pipelines
Qwen3-TTS is an open-source series of TTS models
Instill Core is a full-stack AI infrastructure tool for data
PersonaPlex code
Official repository for LTX-Video
Qwen3-ASR is an open-source series of ASR models
Framework for building real-time multimodal voice AI agent apps
Capable of understanding text, audio, vision, and video
Generate blog articles from video or audio
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Get your documents ready for gen AI