Qwen-Image is a powerful image generation foundation model
Fast-stable-diffusion + DreamBooth
A Customizable Image-to-Video Model based on HunyuanVideo
Official inference repo for FLUX.2 models
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Reference PyTorch implementation and models for DINOv3
Text- and image-to-video generation: CogVideoX and CogVideo
High-Resolution Image Synthesis with Latent Diffusion Models
Sharp Monocular Metric Depth in Less Than a Second
Diffusion Transformer with Fine-Grained Chinese Understanding
Tencent Hunyuan multimodal diffusion transformer (MM-DiT) model
Chinese and English multimodal conversational language model
Capable of understanding text, audio, vision, and video
Advancing Open-source World Models
Easy Docker setup for Stable Diffusion with user-friendly UI
RGBD video generation model conditioned on camera input
A state-of-the-art open visual language model
AI Suite for upscaling, interpolating & restoring images/videos
A latent text-to-image diffusion model