Open source framework for deep learning on satellite and aerial imagery
Implementation of Vision Transformer, a simple way to achieve SOTA
Training data (data labeling, annotation, workflow) for all data types
Automatically find issues in image datasets
Data Science Guide With Videos And Materials
Official DeiT repository
Vision utilities for web interaction agents
Medical imaging toolkit for deep learning
Open Source Differentiable Computer Vision Library
Fast image augmentation library and an easy-to-use wrapper
The open-source tool for building high-quality datasets
The toolkit to test, validate, and evaluate your models and surface
ICLR2024 Spotlight: curation/training code, metadata, distribution
Hub of ready-to-use datasets for ML models
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Deep learning library
Automate browser-based workflows with LLMs and Computer Vision
Data integration platform for ELT pipelines from APIs, databases
Reference PyTorch implementation and models for DINOv3
The largest collection of PyTorch image encoders / backbones
[CVPR 2025 Best Paper Award] VGGT
PyTorch code and models for the DINOv2 self-supervised learning
Dataset Management Framework, a Python library and a CLI tool to build
Qwen2.5-VL is the multimodal large language model series