[CVPR 2025 Best Paper Award] VGGT
The repository provides code for running inference with SAM 2
Datasets, transforms and models specific to Computer Vision
A neural network that transforms a design mock-up into static websites
Phi-3.5 for Mac: Locally-run Vision and Language Models
Implementation of Vision Transformer, a simple way to achieve SOTA
ICLR 2024 Spotlight: curation/training code, metadata, distribution
A fast, powerful, and simple hierarchical vision transformer
Open Source Computer Vision Library
CoTracker is a model for tracking any point (pixel) on a video
FAIR's research platform for object detection research
High-Resolution 3D Human Digitization from A Single Image
Code release for ConvNeXt model
Codebase for Image Classification Research, written in PyTorch
A real-time approach for mapping all human pixels of 2D RGB images
A modular framework for vision & language multimodal research
Fast, modular reference implementation of Instance Segmentation
A Python computer vision library
Efficient Approximate Nearest Neighbors for General Metric Spaces