Powerful Mixture-of-Experts (MoE) AI language model optimized for efficiency and performance
Open-source, high-performance AI model with advanced reasoning
Python inference and LoRA trainer package for the LTX-2 audio–video
State-of-the-art TTS model under 25MB
Official Python inference and LoRA trainer package
Awesome multilingual OCR toolkits based on PaddlePaddle
Open Source Speech Language Model
A GPT-4o-Level MLLM for Vision, Speech and Multimodal Live Streaming
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Qwen3-TTS is an open-source series of TTS models
Z80-μLM is a 2-bit quantized language model
Advancing Open-source World Models
Open-source multi-speaker long-form text-to-speech model
High-Resolution Image Synthesis with Latent Diffusion Models
Models for object and human mesh reconstruction
Accurate × Fast × Comprehensive
The ChatGPT Retrieval Plugin lets you easily find personal documents
A Powerful Native Multimodal Model for Image Generation
An Efficient Agentic Model for Computer Use
Open-source framework for intelligent speech interaction
Open-source deep-learning framework
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
A Systematic Framework for Interactive World Modeling
GLM-4-Voice: End-to-End Chinese-English Conversational Model
Long-form streaming TTS system for multi-speaker dialogue generation