Stable Virtual Camera: Generative View Synthesis with Diffusion Models
RGBD video generation model conditioned on camera input
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Fast and universal 3D reconstruction model for versatile tasks
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Official Python inference and LoRA trainer package
Diffusion Transformer with Fine-Grained Chinese Understanding
Generate Any 3D Scene in Seconds
Sharp Monocular Metric Depth in Less Than a Second
Uncommon Objects in 3D dataset
Advancing Open-source World Models
Tooling for the Common Objects in 3D dataset