Build VMs, containers, AI, databases, storage—all in one place.
Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
Get Started
Earn up to 16% annual interest with Nexo.
More flexibility. More control.
Generate interest, access liquidity without selling, and execute trades seamlessly. All in one platform.
Geographic restrictions, eligibility, and terms apply.
Implementation of model parallel autoregressive transformers on GPUs
This repository records EleutherAI's library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's Megatron Language Model and has been augmented with techniques from DeepSpeed as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and accelerate research into large-scale training.
For those looking for a TPU-centric codebase, we...