Motion-controllable Video Generation via Latent Trajectory Guidance
Code to accompany "A Method for Animating Children's Drawings"
End-to-end pipeline converting generative videos
Structure-from-Motion and Multi-View Stereo
Video understanding codebase from FAIR for reproducing video models
NVR with realtime local object detection for IP cameras
Make videos programmatically with React
Tiny vision language model
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
[CVPR 2025 Best Paper Award] VGGT
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
PyTorch code and models for VJEPA2 self-supervised learning from video
Overcoming Data Limitations for High-Quality Video Diffusion Models
CoTracker is a model for tracking any point (pixel) in a video
A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator
The official PyTorch implementation of our paper
A real-time approach for mapping all human pixels of 2D RGB images
Efficient 3D human pose estimation in video using 2D keypoint
Resources about activity recognition