The most powerful and modular diffusion model GUI, API, and backend
Models for object and human mesh reconstruction
Reverse engineering Gemini's SynthID detection
A Unified Framework for Text-to-3D and Image-to-3D Generation
All-in-one WebUI for AI generative image and video creation
Autoregressive Model Beats Diffusion
Flexible Photo Recrafting While Preserving Your Identity
CLIP: Predict the most relevant text snippet given an image
AI video generator optimized for low VRAM and older GPUs
Synthesizing and manipulating 2048x1024 images with conditional GANs
Easily turn large sets of image URLs into an image dataset
Text and image to video generation: CogVideoX and CogVideo
A SOTA open-source image editing model
Open Source Differentiable Computer Vision Library
A Customizable Image-to-Video Model based on HunyuanVideo
Collection of Gemma 3 variants that are trained for performance
Director, Screenwriter, Producer, and Video Generator All-in-One
High-Resolution Image Synthesis with Latent Diffusion Models
A neural network that transforms a design mock-up into a static website
Unsupervised Learning for Image Registration
CogView4, CogView3-Plus, and CogView3 (ECCV 2024)
State-of-the-art diffusion models for image and audio generation
Official MiniMax Model Context Protocol (MCP) server
This repo contains the code for a 1D tokenizer and generator
Native and Compact Structured Latents for 3D Generation