Ongoing research training transformer models at scale
Run 100B+ language models at home, BitTorrent-style
Implementation of model parallel autoregressive transformers on GPUs
A text generation library with pre-trained language models
An implementation of model parallel GPT-2 and GPT-3-style models
Deep Learning in Haskell