Witness the aha moment of VLM with less than $3
Visual Instruction Tuning: Large Language-and-Vision Assistant
A neural network that transforms a design mock-up into static websites
Phi-3.5 for Mac: Locally-run Vision and Language Models
[CVPR 2025 Best Paper Award] VGGT
Go package for computer vision using OpenCV 4 and beyond
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Code release for ConvNeXt model
A real-time approach for mapping all human pixels of 2D RGB images
PyTorch implementation of SimCLR: A Simple Framework
Estimates the psychovisual difference between two images
Chrome Extension that displays automated image tags from Facebook
Eye-movement control that is portable across different robotic stereo heads
MATLAB implementation of the ECO tracker
A curated list of resources dedicated to RNN
A simple open-source 3D network game