Implementation of model parallel autoregressive transformers on GPUs
A minimal PyTorch re-implementation of the OpenAI GPT
Code release for "Masked-attention Mask Transformer
GLIDE: a diffusion-based text-conditional image synthesis model
Dual LSTM Encoder for Dialog Response Generation
Open language model developed by NVIDIA as part of Nemotron-3 family