MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Official implementation of DreamCraft3D
Inference code for CodeLlama models
Unified Multimodal Understanding and Generation Models
Sharp Monocular Metric Depth in Less Than a Second
Implementation of Vision Transformer, a simple way to achieve SOTA
Implementation of Make-A-Video, a new SOTA text-to-video generator
Generate 3D objects conditioned on text or images
A fast, powerful, and simple hierarchical vision transformer
Refer and Ground Anything Anywhere at Any Granularity
PyTorch code and models for V-JEPA self-supervised learning from video
PyTorch code and models for the DINOv2 self-supervised learning method
Language modeling in a sentence representation space
The repository provides code for running inference with SAM 2
High-Resolution Image Synthesis with Latent Diffusion Models
Plug-n-play module turning text-to-image models into animation
Scientific Visualisation Made Easy
Powerful open source image generation model
A Customizable Image-to-Video Model based on HunyuanVideo
Let us control diffusion models
Overcoming Data Limitations for High-Quality Video Diffusion Models
Code release for "Detecting Twenty-thousand Classes
Official repo for consistency models
Official PyTorch Implementation of "Scalable Diffusion Models"
High-Resolution 3D Human Digitization from A Single Image