Structure-from-Motion and Multi-View Stereo
The repository provides code for running inference with SAM 2
A fast, powerful, and simple hierarchical vision transformer
Phi-3.5 for Mac: Locally-run Vision and Language Models
A neural network that transforms a design mock-up into a static website
Implementation of Vision Transformer, a simple way to achieve SOTA
[CVPR 2025 Best Paper Award] VGGT
Provides code for running inference with the Segment Anything Model
Fast C++ library for linear algebra & scientific computing
A computer vision framework to create and deploy apps in minutes
Blazeface is a lightweight model that detects faces in images
FAIR's research platform for object detection research
Machine learning algorithms for advanced analytics
High-Resolution 3D Human Digitization from A Single Image
Resources to learn computer science in your spare time
Joint Face Detection and Alignment
Code release for ConvNeXt model
Class Activation Mapping
Codebase for Image Classification Research, written in PyTorch
A real-time approach for mapping all human pixels of 2D RGB images
Fast, modular reference implementation of Instance Segmentation
C++ library for image acquisition and visualization
Chrome Extension that displays automated image tags from Facebook
R-FCN: Object Detection via Region-based Fully Convolutional Networks