Implementation of Make-A-Video, a new SOTA text-to-video generator
The data structure for multimodal data
PyTorch code and models for V-JEPA self-supervised learning from video
Build cross-modal and multimodal applications on the cloud
Gluon CV Toolkit
We estimate dense, flicker-free, geometrically consistent depth
Deep learning person re-identification in PyTorch
World's simplest facial recognition API for Python & the command line