Official inference repo for FLUX.2 models
Code for running inference with the SAM 3D Body Model (3DB)
Official inference repo for FLUX.1 models
Open image model at the forefront of design
A Multi-Modal World Model for Reconstructing, Generating, and Simulating
Wan2.1: Open and Advanced Large-Scale Video Generative Model
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Code for running inference and finetuning with SAM 3 model
Code for Mesh R-CNN, ICCV 2019
CLIP: Predict the most relevant text snippet given an image
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
Models for object and human mesh reconstruction
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Visual Causal Flow
Qwen-Image is a powerful image generation foundation model
Qwen2.5-VL is the multimodal large language model series
An easy 1-click way to create beautiful artwork on your PC using AI
Native and Compact Structured Latents for 3D Generation
Unified Multimodal Understanding and Generation Models
Moonshot's most powerful AI model
Multimodal embedding and reranking models built on Qwen3-VL
Collection of Gemma 3 variants that are trained for performance
Official implementation of Watermark Anything with Localized Messages