DensePose is a computer vision system that maps all human pixels in an RGB image to the 3D surface of a human body model. It extends human pose estimation from predicting sparse joint keypoints to establishing dense correspondences between 2D images and a canonical 3D mesh (such as the SMPL model), enabling detailed understanding of human shape, motion, and surface appearance directly from images or videos.

The model architecture builds on Mask R-CNN, adding regression heads that, for each pixel inside a detected person, predict a body-part index and the UV coordinates locating that pixel on the 3D surface. The repository includes the DensePose network architecture, training code, pretrained models, and dataset tools for annotation and visualization. Because it recovers a dense 3D surface mapping from ordinary 2D inputs, DensePose is used in augmented reality, motion capture, virtual try-on, and visual effects applications.
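The head structure described above can be sketched as follows. This is an illustrative PyTorch module, not the repository's actual code: the layer sizes, the hidden width, and the class name `DensePoseStyleHead` are assumptions, but the output layout follows the paper's design of per-pixel part classification (24 SMPL surface charts plus background) alongside per-part U and V regression.

```python
# Illustrative sketch (NOT the repository's implementation) of a DensePose-style
# head: given ROI-aligned features for detected people, predict a per-pixel
# body-part label and (U, V) surface coordinates within each part's chart.
import torch
import torch.nn as nn

N_PARTS = 24  # surface charts used by DensePose; +1 for the background class

class DensePoseStyleHead(nn.Module):
    def __init__(self, in_channels: int = 256, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        # Per-pixel classification over 24 body parts + background.
        self.part_logits = nn.Conv2d(hidden, N_PARTS + 1, 1)
        # Per-pixel U and V regressed separately for every part chart.
        self.u_coords = nn.Conv2d(hidden, N_PARTS, 1)
        self.v_coords = nn.Conv2d(hidden, N_PARTS, 1)

    def forward(self, roi_features: torch.Tensor):
        x = self.trunk(roi_features)
        return self.part_logits(x), self.u_coords(x), self.v_coords(x)

head = DensePoseStyleHead()
feats = torch.randn(2, 256, 28, 28)  # dummy ROI-aligned features, batch of 2
parts, u, v = head(feats)
print(parts.shape, u.shape, v.shape)
```

At inference time the part channel with the highest score picks the chart, and the corresponding U/V channels give that pixel's location on the 3D surface.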
## Features
- Dense pixel-to-surface mapping between 2D images and 3D human meshes
- Built on Mask R-CNN with UV coordinate regression for dense correspondence
- Pretrained models and training scripts for large-scale datasets
- Visualization and annotation tools for human surface mapping
- Applications in AR, virtual try-on, and human motion capture
- Real-time or near real-time inference pipelines for video and single images
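The dense pixel-to-surface mapping listed above is commonly visualized as an "IUV" image: channel I holds the per-pixel part index (0 for background) and U, V hold the surface coordinates within the selected part. The decoding step can be sketched as below; the function name `to_iuv` and the exact output layout are assumptions for illustration, not the repository's API.

```python
# Illustrative decoding sketch (assumed tensor layout, not DensePose's exact
# API): turn raw per-part outputs into a single (3, H, W) "IUV" array.
import numpy as np

def to_iuv(part_logits: np.ndarray, u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """part_logits: (P+1, H, W); u, v: (P, H, W) -> IUV array of shape (3, H, W)."""
    part_index = part_logits.argmax(axis=0)  # (H, W); 0 means background
    fg = part_index > 0
    # Gather the U/V maps of the winning part at each foreground pixel.
    idx = np.clip(part_index - 1, 0, u.shape[0] - 1)
    rows, cols = np.indices(part_index.shape)
    u_sel = np.where(fg, u[idx, rows, cols], 0.0)
    v_sel = np.where(fg, v[idx, rows, cols], 0.0)
    return np.stack([part_index.astype(np.float32), u_sel, v_sel])

rng = np.random.default_rng(0)
iuv = to_iuv(rng.normal(size=(25, 4, 4)),
             rng.uniform(size=(24, 4, 4)),
             rng.uniform(size=(24, 4, 4)))
print(iuv.shape)  # (3, 4, 4)
```

Background pixels carry (0, 0, 0), so the I channel doubles as a person-segmentation mask.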