This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” It is intended as a starting point for researchers and engineers who want to experiment with GPT-2: generate text, fine-tune on custom datasets, explore model behavior, or study the model's internals. The repository includes scripts for sampling, training, and downloading pre-trained models, along with utilities for tokenization and model handling.
Features
- Pretrained model weights for multiple GPT-2 sizes (e.g. 117M, 345M, up to 1.5B parameters)
- Sampling / generation scripts (conditional, unconditional, interactive)
- Tokenizer and encoding / decoding utilities
- Training / fine-tuning script support (for smaller models)
- Support for memory-saving gradient techniques / optimizations during training
- Utilities to download / manage model checkpoints via script
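As an illustration of what the sampling scripts listed above do at each generation step, here is a minimal sketch of temperature scaling followed by top-k truncation. It is not code from the repository: the function name `top_k_sample`, its parameters, the convention that `k <= 0` disables truncation, and the toy logits are all assumptions made for this example, and it uses only the Python standard library rather than the model itself.

```python
import math
import random


def top_k_sample(logits, k=40, temperature=1.0, rng=None):
    """Sample one token id from `logits` using temperature and top-k filtering.

    Hypothetical helper for illustration; not the repository's API.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Rank token ids by score and keep the k best.
    # In this sketch, k <= 0 means "keep every token".
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked if k <= 0 else ranked[:k]

    # Temperature-scaled softmax weights over the kept ids.
    # Subtracting the maximum keeps exp() numerically stable.
    scaled = [logits[i] / temperature for i in kept]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # random.choices accepts unnormalized weights.
    return rng.choices(kept, weights=weights, k=1)[0]


# Toy vocabulary of five tokens; these scores are made up.
toy_logits = [2.0, 0.5, -1.0, 3.0, 0.0]
rng = random.Random(0)
samples = [top_k_sample(toy_logits, k=2, temperature=0.7, rng=rng) for _ in range(10)]
print(samples)
```

With `k=2`, only the two highest-scoring ids (3 and 0 in the toy logits) can ever be drawn. Lowering `temperature` concentrates probability on the top id, while raising it flattens the distribution across the kept ids.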
Categories
Artificial Intelligence
License
MIT License