Qwen2.5-VL is the multimodal large language model series
Open-Source Financial Large Language Models
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Qwen-Image is a powerful image generation foundation model
Towards Real-World Vision-Language Understanding
Fast-stable-diffusion + DreamBooth
Video understanding codebase from FAIR for reproducing video models
Qwen3-ASR is an open-source series of ASR models
Renderer for the harmony response format to be used with gpt-oss
gpt-oss-120b and gpt-oss-20b are two open-weight language models
The official repo of Qwen chat & pretrained large language model
Memory-efficient and performant finetuning of Mistral's models
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Tool for exploring and debugging transformer model behaviors
Official Python inference and LoRA trainer package
Industrial-level controllable zero-shot text-to-speech system
General-purpose image editing model that delivers high-fidelity
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Official implementation of DreamCraft3D
Powerful AI language model (MoE) optimized for efficiency/performance
LTX-Video Support for ComfyUI
Agentic, Reasoning, and Coding (ARC) foundation models
Awesome multilingual OCR toolkits based on PaddlePaddle
Chat & pretrained large vision language model
Generating Immersive, Explorable, and Interactive 3D Worlds