Metaseq
Repo for external large-scale work
Metaseq is a flexible, high-performance framework for training and serving large-scale sequence models, such as language models, translation systems, and instruction-tuned LLMs. Built on top of PyTorch, it provides distributed training, model sharding, mixed-precision computation, and memory-efficient checkpointing to support models with hundreds of billions of parameters. The framework was used internally at Meta to train models like OPT (Open Pre-trained Transformer) and serves as a reference...