Deep learning optimization library that makes distributed training easy
MII makes low-latency and high-throughput inference possible
Replace OpenAI GPT with another LLM in your app
Serve machine learning models within a Docker container
Lightweight anchor-free object detection model
Implementation of model parallel autoregressive transformers on GPUs
Toolkit for inference and serving with MXNet in SageMaker