A toolkit to optimize Keras & TensorFlow ML models for deployment
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
ONNX Runtime: cross-platform, high-performance ML inferencing
Build your chatbot within minutes on your favorite device
Bolt is a high-performance deep learning library
Trainable models and NN optimization tools
Easy-to-use deep learning framework with 3 key features
Framework that is dedicated to making neural data processing
CPU/GPU inference server for Hugging Face transformer models