This repository contains EleutherAI's library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's Megatron Language Model and has been augmented with techniques from DeepSpeed as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and to accelerate research into large-scale training.

For those looking for a TPU-centric codebase, we recommend Mesh Transformer JAX.

If you are not looking to train models with billions of parameters from scratch, this is likely the wrong library to use. For generic inference needs, we recommend the Hugging Face transformers library instead, which supports GPT-NeoX models.
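
As a point of reference, here is a minimal inference sketch using transformers rather than this training library. The EleutherAI/gpt-neox-20b checkpoint and the prompt are illustrative choices, not requirements; any GPT-NeoX-architecture checkpoint on the Hub loads the same way:

```python
# Minimal sketch: inference on a GPT-NeoX model via Hugging Face transformers.
# The checkpoint name below is one publicly released example; substitute your own.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")

# Tokenize a prompt and generate a short continuation.
inputs = tokenizer("EleutherAI's GPT-NeoX is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For GPT-NeoX checkpoints, AutoModelForCausalLM resolves to transformers' GPTNeoXForCausalLM class, so no code from this repository is needed at inference time.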

License

Apache License 2.0
