A lightweight vision library for performing large object detection
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
A GPU-accelerated library containing highly optimized building blocks
AIMET is a library that provides advanced quantization and compression
Fast inference engine for Transformer models
Replace OpenAI GPT with another LLM in your app
Serve machine learning models within a Docker container
Implementation of model parallel autoregressive transformers on GPUs
Toolkit for inference and serving with MXNet in SageMaker