Open Source Speech Language Model
Unified web UI for training and running open models locally
48 kHz stereo neural audio codec for general audio
AI video generator optimized for low-VRAM and older GPUs
An Open Source implementation of Notebook LM with more flexibility
Edit videos with Claude Code
Data Infrastructure providing an approach to multimodal AI workloads
Oobabooga - The definitive Web UI for local AI, with powerful features
Open source AI model for generating full songs from lyrics prompts
High-resolution models for human tasks
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Fast multimodal LLM for real-time voice interaction and AI apps
State-of-the-art diffusion models for image and audio generation
Python inference and LoRA trainer package for the LTX-2 audio–video
Official repository for LTX-Video
Instill Core is a full-stack AI infrastructure tool for data
Private AI platform for agents, enterprise search and RAG pipelines
Generate high-definition short story videos with one click using AI
Qwen3-TTS is an open-source series of TTS models
Capable of understanding text, audio, vision, and video
Generate blog articles from video or audio
Qwen3-ASR is an open-source series of ASR models
WhatsApp MCP server enabling AI access to chats and messaging
Framework for building realtime multimodal voice AI agent apps
Large Multimodal Models for Video Understanding and Editing