Ongoing research on training transformer models at scale
PyTorch code and models for VJEPA2 self-supervised learning from video
Deep learning concepts in an approachable style
Learning a multi-scale deep model for correcting over- and under-exposed
Implementation of model parallel autoregressive transformers on GPUs
Example implementation for robust model predictive control using tube