This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” It is intended as a starting point for researchers and engineers who want to experiment with GPT-2: generate text, fine-tune on custom datasets, explore model behavior, or study its internals. The repository includes scripts for sampling, training, and downloading pre-trained models, along with utilities for tokenization and model handling.

Features

  • Pretrained model weights for multiple GPT-2 sizes (e.g. 117M, 345M, up to 1.5B parameters)
  • Sampling / generation scripts (conditional, unconditional, interactive)
  • Tokenizer and encoding / decoding utilities
  • Training / fine-tuning script support (for smaller models)
  • Support for memory-saving gradient techniques / optimizations during training
  • Utilities to download / manage model checkpoints via script
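
To make the sampling / generation item above more concrete, here is a minimal sketch of top-k sampling with temperature, a decoding strategy commonly used with language models of this kind. It is an illustration only and is not taken from the repository's scripts: the function name top_k_sample, its parameters, and the toy logits are all assumptions, and a real model would supply one logit per vocabulary entry.

```python
import math
import random


def top_k_sample(logits, k=3, temperature=1.0, rng=None):
    """Sample one token id from `logits`, restricted to the k highest-scoring ids.

    Hypothetical helper for illustration; not part of the GPT-2 repository.
    """
    if not logits:
        raise ValueError("logits must be non-empty")
    if k <= 0 or temperature <= 0:
        raise ValueError("k and temperature must be positive")
    rng = rng or random.Random()

    # Keep the k ids with the largest logits.
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature-scaled softmax weights over the kept logits,
    # shifted by the maximum for numerical stability.
    scaled = [logits[i] / temperature for i in top_ids]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # random.choices normalizes the weights itself.
    return rng.choices(top_ids, weights=weights, k=1)[0]


# Toy logits standing in for a model's next-token scores (assumed values).
toy_logits = [2.0, 0.5, -1.0, 3.0, 0.0]
sample = top_k_sample(toy_logits, k=2, temperature=0.8, rng=random.Random(0))
print(sample)
```

With k=2, only the two highest-scoring ids (3 and 0 in the toy example) can ever be returned. Lower temperatures concentrate probability on the top id, and k=1 reduces to greedy decoding.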

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2025-09-26