Phi-3.5 for Mac: Locally-run Vision and Language Models
A lightweight vision library for performing large object detection
The repository provides code for running inference with SAM 2
[CVPR 2025 Best Paper Award] VGGT
Implementation of Vision Transformer, a simple way to achieve SOTA
A neural network that transforms a design mock-up into static websites
A fast, powerful, and simple hierarchical vision transformer
Visual Instruction Tuning: Large Language-and-Vision Assistant
OpenFieldAI is an AI-based Open Field Test Rodent Tracker
A computer vision framework to create and deploy apps in minutes
CoTracker is a model for tracking any point (pixel) on a video
High-Resolution 3D Human Digitization from A Single Image
A real-time approach for mapping all human pixels of 2D RGB images
End-to-end object detection with transformers
Python Computer Vision & Video Analytics Framework With Batteries Incl