CoreNet: A library for training deep neural networks
Tooling for the Common Objects In 3D dataset
Contexts Optical Compression
A Foundation Model for Generalist Gaming Agents
Benchmarking Multimodal Agents for Open-Ended Tasks
OCR expert VLM powered by Hunyuan's native multimodal architecture
Automate native Android apps with AI using accessibility APIs
An open-source end-to-end VLM-based GUI Agent
High-quality implementations of standard and SOTA methods
Scalable generative AI framework built for researchers and developers
Official codebase for I-JEPA
High-Resolution 3D Human Digitization from A Single Image
Fast Forward Computer Vision (and other ML workloads!)
PyTorch implementation of MAE
Automatically collect resources on your farm
Codebase for Image Classification Research, written in PyTorch
PyTorch implementation of SimCLR: A Simple Framework
Linear algebra library for Python
Efficient 3D human pose estimation in video using 2D keypoint
Fast, modular reference implementation of Instance Segmentation
Convert JSON annotations into YOLO format
S³FD: Single Shot Scale-invariant Face Detector, ICCV, 2017
RDF-based framework for monitoring business system activity
A program that gathers train delay data