Image generation model with single-stream diffusion transformer
Official inference repo for FLUX.1 models
CLIP: predicts the most relevant text snippet for a given image
Official inference repo for FLUX.2 models
Models for object and human mesh reconstruction
A Powerful Native Multimodal Model for Image Generation
Open-source image generative foundation model
Code for running inference with the SAM 3D Body model (3DB)
A Customizable Image-to-Video Model based on HunyuanVideo
Collection of Gemma 3 variants that are trained for performance
Recovering the Visual Space from Any Views
Diffusion Transformer with Fine-Grained Chinese Understanding
Code for running inference and fine-tuning with the SAM 3 model
A 0.1B Omni model trained from scratch
Implementation of "MobileCLIP" CVPR 2024
PyTorch code and models for DINOv2 self-supervised learning
Official implementation of DreamCraft3D
Accurate × Fast × Comprehensive
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Powerful open source image generation model
Official code for Style Aligned Image Generation via Shared Attention
Official PyTorch Implementation of "Scalable Diffusion Models"
Code release for ConvNeXt V2 model
Code for "Image Generation from Scene Graphs", Johnson et al, CVPR 201
Coding-focused Kimi model for long-horizon agent workflows