DeepMind model for tracking arbitrary points across videos & robotics
Sharp Monocular Metric Depth in Less Than a Second
Expose your FastAPI endpoints as Model Context Protocol (MCP) tools
Code for Mesh R-CNN, ICCV 2019
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Collection of common code shared among different research projects
PyTorch code and models for VJEPA2 self-supervised learning from video
Language modeling in a sentence representation space
An open-source end-to-end VLM-based GUI Agent
Code for the paper "Language models can explain neurons in language models"
Evals is a framework for evaluating LLMs and LLM systems
The ChatGPT Retrieval Plugin lets you easily find personal documents
Educational framework exploring multi-agent orchestration
Designed for text embedding and ranking tasks
Super Tiny Icons are minuscule SVG versions of your favourite website
Inference framework for 1-bit LLMs
Usage-based pricing and billing for developers
A dedicated app for collecting thousands of POIs for OpenStreetMap
A modular high-level library to train embodied AI agents
A global resource download orchestration system
A terminal-based Pokemon-like game
Frame profiler
A blazing-fast multi-language serialization framework
The standard data-centric AI package for data quality and ML