DeepMind model for tracking arbitrary points across videos & robotics
Uncommon Objects in 3D dataset
Official implementation of DreamCraft3D
Code for Mesh R-CNN, ICCV 2019
Fast and Universal 3D reconstruction model for versatile tasks
Pretrained time-series foundation model developed by Google Research
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Qwen2.5-VL is a multimodal large language model series
Production-tested AI infrastructure tools
Tooling for the Common Objects In 3D dataset
Instructions on how to use the Realtime API on Microcontrollers
Foundational Models for State-of-the-Art Speech and Text Translation
RGBD video generation model conditioned on camera input
Python example app from the OpenAI API quickstart tutorial
Large-scale autoregressive pixel model for image generation by OpenAI
Elegant PyTorch implementation of the paper "Model-Agnostic Meta-Learning"
Code for the paper "Improved Techniques for Training GANs"
Large language model developed and released by NVIDIA
Versatile 8B-base multimodal LLM, flexible foundation for custom AI