Implementation of Vision Transformer, a simple way to achieve SOTA
A fast, powerful, and simple hierarchical vision transformer
Phi-3.5 for Mac: Locally-run Vision and Language Models
A neural network that transforms a design mock-up into static websites
Provides code for running inference with the Segment Anything Model
ICLR 2024 Spotlight: curation/training code, metadata, distribution
[CVPR 2025 Best Paper Award] VGGT
The repository provides code for running inference with SAM 2
CoTracker is a model for tracking any point (pixel) on a video
FAIR's research platform for object detection research
Fast C++ library for linear algebra & scientific computing
A computer vision framework to create and deploy apps in minutes
High-Resolution 3D Human Digitization from A Single Image
Code release for ConvNeXt model
Codebase for Image Classification Research, written in PyTorch
A real-time approach for mapping all human pixels of 2D RGB images
Fast, modular reference implementation of Instance Segmentation
R-FCN: Object Detection via Region-based Fully Convolutional Networks
Efficient Approximate Nearest Neighbors for General Metric Spaces