A library for accelerating Transformer models on NVIDIA GPUs
The Triton Inference Server provides an optimized cloud and edge inferencing solution
A set of Docker images for training and serving models in TensorFlow
Open platform for training, serving, and evaluating language models
Toolbox of models, callbacks, and datasets for AI/ML researchers
Toolkit for inference and serving with MXNet in SageMaker
CPU/GPU inference server for Hugging Face transformer models