Open source framework for deep learning on satellite and aerial imagery
Implementation of Vision Transformer, a simple way to achieve SOTA
Enable AI to control your desktop, mobile and HMI devices
Build Vision Agents quickly with any model or video provider
Interactive video and image annotation tool for computer vision
Open Source Computer Vision Library
Phi-3.5 for Mac: Locally-run Vision and Language Models
Open Source Differentiable Computer Vision Library
3D reconstruction software
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Witness the aha moment of VLM with less than $3
Visual Instruction Tuning: Large Language-and-Vision Assistant
Structure-from-Motion and Multi-View Stereo
The repository provides code for running inference with SAM 2
OpenVINO™ Toolkit repository
A lightweight vision library for performing large object detection
Fast image augmentation library and an easy-to-use wrapper
Go package for computer vision using OpenCV 4 and beyond
Google Testing and Mocking Framework
Set of comprehensive computer vision & machine intelligence libraries
ICLR2024 Spotlight: curation/training code, metadata, distribution
A fast, powerful, and simple hierarchical vision transformer
Java interface to OpenCV, FFmpeg, and more
A framework to enable multimodal models to operate a computer
"Big Model" trains a visual multimodal VLM with 26M parameters