Megatron-LM is a GPU-optimized deep learning framework from NVIDIA designed to train extremely large transformer-based language models efficiently at scale. The repository provides both a reference training implementation and Megatron Core, a composable library of high-performance building blocks for custom large-model pipelines. It supports advanced parallelism strategies including tensor, pipeline, data, expert, and context parallelism, enabling training across massive multi-GPU and multi-node clusters. The framework includes mixed-precision training options such as FP16, BF16, FP8, and FP4 to maximize performance and memory efficiency on modern hardware. Megatron-LM is widely used in research and industry for pretraining GPT-, BERT-, T5-, and multimodal-style models, with tooling for checkpoint conversion and interoperability with Hugging Face. Overall, it is a production-grade system for organizations pushing the limits of large-scale language model training.

Features

  • GPU-optimized transformer training
  • Advanced parallelism strategies
  • Mixed precision training support
  • Composable Megatron Core library
  • Hugging Face checkpoint conversion
  • Multi-node scalable training pipelines

Project Samples

Project Activity

See All Activity >

Categories

Research

License

MIT License

Follow Megatron-LM

Megatron-LM Web Site

Other Useful Business Software
Build AI Apps with Gemini 3 on Vertex AI Icon
Build AI Apps with Gemini 3 on Vertex AI

Access Google’s most capable multimodal models. Train, test, and deploy AI with 200+ foundation models on one platform.

Vertex AI gives developers access to Gemini 3—Google’s most advanced reasoning and coding model—plus 200+ foundation models including Claude, Llama, and Gemma. Build generative AI apps with Vertex AI Studio, customize with fine-tuning, and deploy to production with enterprise-grade MLOps. New customers get $300 in free credits.
Try Vertex AI Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of Megatron-LM!

Additional Project Details

Programming Language

Python

Related Categories

Python Research Software

Registered

6 hours ago