Reference PyTorch implementation and models for DINOv3
PyTorch code and models for the DINOv2 self-supervised learning method
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
This repository contains the official implementation of FastVLM
Multimodal-Driven Architecture for Customized Video Generation
Official code for Style Aligned Image Generation via Shared Attention
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
A method to increase the speed and lower the memory footprint
Code release for ConvNeXt V2 model
Code release for "Masked-attention Mask Transformer
Reproduces results of "Fixing the train-test resolution discrepancy"
A mix of GAN implementations including progressive growing