Megatron-LM

Megatron-LM is NVIDIA's GPU-optimized framework for training very large transformer-based language models at scale. The repository provides both a reference training implementation and Megatron Core, a composable library of high-performance building blocks for custom large-model pipelines. It supports tensor, pipeline, data, expert, and context parallelism, which can be combined to train across multi-GPU and multi-node clusters. Mixed-precision options include FP16, BF16, FP8, and FP4, which trade numerical precision for higher throughput and lower memory use on hardware that supports them. Megatron-LM is used in research and industry to pretrain GPT-, BERT-, T5-, and multimodal-style models, and it includes tooling for checkpoint conversion and interoperability with Hugging Face.
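To make the combination of parallelism strategies concrete, here is a minimal sketch of the usual bookkeeping: the GPUs that are not used for tensor, pipeline, or context parallelism hold data-parallel replicas of the model. The `ParallelLayout` class and its field names are hypothetical and are not part of Megatron-LM's API. Expert parallelism is left out for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelLayout:
    """Hypothetical helper (not a Megatron-LM class) describing a GPU layout."""

    world_size: int             # total number of GPUs in the job
    tensor_parallel: int        # GPUs that share the weight shards of each layer
    pipeline_parallel: int      # number of pipeline stages the layers are split into
    context_parallel: int = 1   # GPUs that split each input sequence

    @property
    def gpus_per_replica(self) -> int:
        # One full model replica occupies TP x PP x CP GPUs.
        return self.tensor_parallel * self.pipeline_parallel * self.context_parallel

    @property
    def data_parallel_size(self) -> int:
        # The remaining factor of the world size is the number of replicas.
        if self.world_size % self.gpus_per_replica != 0:
            raise ValueError(
                f"world_size={self.world_size} is not divisible by "
                f"{self.gpus_per_replica} GPUs per replica"
            )
        return self.world_size // self.gpus_per_replica


layout = ParallelLayout(
    world_size=64, tensor_parallel=4, pipeline_parallel=2, context_parallel=2
)
print(layout.gpus_per_replica)    # 4 * 2 * 2 = 16
print(layout.data_parallel_size)  # 64 // 16 = 4
```

In this example, 64 GPUs are arranged as 4 data-parallel replicas of 16 GPUs each. A layout whose product does not divide the world size is rejected with a `ValueError`.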

Features

GPU-optimized transformer training
Advanced parallelism strategies
Mixed precision training support
Composable Megatron Core library
Hugging Face checkpoint conversion
Multi-node scalable training pipelines
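As a rough illustration of why the precision formats listed above matter for memory, the sketch below estimates the storage needed for model weights alone at each bit width. The function name and the parameter count are illustrative assumptions. The estimate ignores gradients, optimizer state, activations, and the scaling factors that low-precision formats typically carry, and mixed-precision training often keeps a higher-precision copy of the weights as well, so real memory use is higher.

```python
# Storage width of one parameter in each format, in bytes.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "bf16": 2.0,
    "fp8": 1.0,
    "fp4": 0.5,
}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Estimate GiB needed to store the weights only (hypothetical helper)."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unknown dtype: {dtype!r}")
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Illustrative model size, not taken from the project description.
num_params = 7_000_000_000
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(num_params, dtype):.2f} GiB")
```

Halving the bit width halves the weight storage, which is one reason the lower-precision formats are attractive on hardware that supports them.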

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Research Software

Registered

6 hours ago

Ongoing research training transformer models at scale
