A simple Python package that wraps existing model fine-tuning and generation scripts for OpenAI's GPT-2 text generation model (specifically the "small" 124M-parameter and "medium" 355M-parameter versions). The package also makes text generation easier: it can write generated text to a file for easy curation, and it accepts a prefix that forces the text to start with a given phrase.

For fine-tuning, a GPU is strongly recommended; generation also works on a CPU, although much more slowly. If you are training in the cloud, a Colaboratory notebook or a Google Compute Engine VM with the TensorFlow Deep Learning image is strongly recommended, because the GPT-2 model is hosted on GCP. You can use gpt-2-simple to retrain a model on a GPU for free in the project's Colaboratory notebook, which also demos additional features of the package.

Note: Development on gpt-2-simple has mostly been superseded by aitextgen, which has similar AI text generation capabilities with more efficient training time.
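The workflow described above (download a model, fine-tune it, then generate to a file with a prefix) can be sketched roughly as follows. This is a minimal sketch, not the project's official example: `build_generation_settings` and `finetune_and_generate` are hypothetical helper names, the default values are illustrative, and the check that `nsamples` is a multiple of `batch_size` reflects a constraint the library is assumed to enforce. The `gpt_2_simple` import is deferred so the helpers can be loaded on a machine without TensorFlow installed.

```python
def build_generation_settings(prefix, length=200, nsamples=10, batch_size=5):
    """Collect keyword arguments for gpt2.generate_to_file (hypothetical helper)."""
    # Assumption: samples are produced in batches, so nsamples should be
    # a whole multiple of batch_size.
    if nsamples % batch_size != 0:
        raise ValueError("nsamples should be a multiple of batch_size")
    return {
        "prefix": prefix,          # forces each sample to start with this phrase
        "length": length,          # number of tokens to generate per sample
        "nsamples": nsamples,      # total number of samples written to the file
        "batch_size": batch_size,  # samples generated in parallel
        "temperature": 0.7,        # illustrative value, tune to taste
    }


def finetune_and_generate(dataset_path, output_path, prefix,
                          steps=1000, model_name="124M"):
    """Fine-tune GPT-2 on a text file, then write samples to output_path."""
    # Imported lazily: requires gpt-2-simple and TensorFlow to be installed.
    import gpt_2_simple as gpt2

    # Fetch the pretrained weights (saved under ./models/<model_name>).
    gpt2.download_gpt2(model_name=model_name)

    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, dataset_path, model_name=model_name, steps=steps)

    # Write the generated samples to a file for later curation.
    gpt2.generate_to_file(sess, destination_path=output_path,
                          **build_generation_settings(prefix))
```

For instance, `finetune_and_generate("shakespeare.txt", "samples.txt", prefix="ROMEO:")` would fine-tune on a hypothetical local file `shakespeare.txt` and write ten samples beginning with "ROMEO:" to `samples.txt`. Expect this to be slow without a GPU.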

Features

  • Model management from OpenAI's official GPT-2 repo (MIT License)
  • Model finetuning from Neil Shepperd's fork of GPT-2 (MIT License)
  • Text generation output management from textgenrnn (MIT License / also created by me)
  • gpt-2-simple can be installed via PyPI
  • The original GPT-2 model was trained on a very large variety of sources, allowing the model to incorporate idioms not seen in the input text
  • GPT-2 can only generate a maximum of 1024 tokens per request (about 3-4 paragraphs of English text)
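Because of the 1024-token limit per request noted above, longer output has to be planned as several requests. The following is a small illustrative sketch of that arithmetic only; `plan_requests` is a hypothetical helper, not part of gpt-2-simple, and it assumes the limit applies to each request independently.

```python
MAX_TOKENS_PER_REQUEST = 1024  # the per-request limit noted in the feature list


def plan_requests(total_tokens, max_tokens=MAX_TOKENS_PER_REQUEST):
    """Split a desired token count into per-request lengths of at most max_tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    full_requests, remainder = divmod(total_tokens, max_tokens)
    lengths = [max_tokens] * full_requests
    if remainder:
        lengths.append(remainder)
    return lengths


print(plan_requests(2500))  # [1024, 1024, 452]
```

Note that separate requests do not share context, so consecutive chunks will not continue one another unless some of the earlier text is passed back in, for example as a prefix.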

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python AI Text Generators, Python ChatGPT Apps, Python Generative AI

Registered

2023-03-23