Open source framework for deep learning on satellite and aerial imagery
Implementation of Vision Transformer, a simple way to achieve SOTA
Enable AI to control your desktop, mobile and HMI devices
Build Vision Agents quickly with any model or video provider
Interactive video and image annotation tool for computer vision
Open Source Computer Vision Library
Phi-3.5 for Mac: Locally-run Vision and Language Models
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Witness the aha moment of VLM with less than $3
Structure-from-Motion and Multi-View Stereo
Visual Instruction Tuning: Large Language-and-Vision Assistant
The repository provides code for running inference with SAM 2
OpenVINO™ Toolkit repository
Fast image augmentation library and an easy-to-use wrapper
A lightweight vision library for performing large object detection
Go package for computer vision using OpenCV 4 and beyond
Medical imaging toolkit for deep learning
Google Testing and Mocking Framework
Set of comprehensive computer vision & machine intelligence libraries
ICLR2024 Spotlight: curation/training code, metadata, distribution
"Big Model" trains a visual multimodal VLM with 26M parameters
Java interface to OpenCV, FFmpeg, and more
A fast, powerful, and simple hierarchical vision transformer
Datasets, transforms and models specific to Computer Vision
Training data (data labeling, annotation, workflow) for all data types