MapAnything: Universal Feed-Forward Metric 3D Reconstruction
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Sharp Monocular Metric Depth in Less Than a Second
Implementation of Make-A-Video, a new SOTA text-to-video generator
Simple and easily configurable grid world environments
A walk along memory lane
Implementation of BEVFormer, a camera-only framework
3D-aware GANs based on NeRF (arXiv)
A real-time approach for mapping all human pixels of 2D RGB images
Efficient 3D human pose estimation in video using 2D keypoint
Cross Audio-Visual Recognition using 3D Architectures