RGBD video generation model conditioned on camera input
Fast and universal 3D reconstruction model for versatile tasks
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Sharp Monocular Metric Depth in Less Than a Second
Generate Any 3D Scene in Seconds
Advancing Open-source World Models
Tooling for the Common Objects in 3D dataset
Metric monocular depth estimation (vision model)
Vision-language-action model for robot control via images and text