GPT-NeoX
Implementation of model parallel autoregressive transformers on GPUs
If you are looking for a TPU-centric codebase, we recommend Mesh Transformer JAX.
If you are not looking to train models with billions of parameters from scratch, this is likely the wrong library to use. For generic inference needs, we recommend the Hugging Face transformers library instead, which supports GPT-NeoX models.