FlashInfer: Kernel Library for LLM Serving
Optimizing inference proxy for LLMs
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
A unified framework for scalable computing
Open platform for training, serving, and evaluating language models
A toolkit to optimize Keras & TensorFlow ML models for deployment