Powerful AI language model (MoE) optimized for efficiency/performance
Qwen2.5-VL is the multimodal large language model series
Advanced language and coding AI model
The official repo of the Qwen chat and pretrained large language models
Phi-3.5 for Mac: Locally-run Vision and Language Models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
New family of code large language models (LLMs)
CodeGeeX2: A More Powerful Multilingual Code Generation Model
OCR expert VLM powered by Hunyuan's native multimodal architecture
Official inference repo for FLUX.2 models
Qwen-Image is a powerful image generation foundation model
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Long-form streaming TTS system for multi-speaker dialogue generation
Diffusion Transformer with Fine-Grained Chinese Understanding
Open-source multi-speaker long-form text-to-speech model
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Qwen3-Omni is a natively end-to-end, omni-modal LLM
General-purpose image editing model that delivers high-fidelity
GPT4V-level open-source multi-modal model based on Llama3-8B
OpenTinker is an RL-as-a-Service infrastructure for foundation models
A state-of-the-art open visual language model
Controllable & emotion-expressive zero-shot TTS
FAIR Sequence Modeling Toolkit 2
Multi-modal large language model designed for audio understanding