A set of Docker images for training and serving models in TensorFlow
PArallel Distributed Deep LEarning: Machine Learning Framework
C++ library for high performance inference on NVIDIA GPUs
OpenVINO™ Toolkit repository
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
Open standard for machine learning interoperability
MNN is a blazing fast, lightweight deep learning framework
ONNX Runtime: cross-platform, high performance ML inferencing
A GPU-accelerated library containing highly optimized building blocks
Trainable models and NN optimization tools
Probabilistic reasoning and statistical analysis in TensorFlow
High-performance neural network inference framework for mobile
Library for OCR-related tasks powered by Deep Learning
Bolt is a deep learning library with high performance
Sparsity-aware deep learning inference runtime for CPUs
Deep learning optimization library: makes distributed training easy
A library for accelerating Transformer models on NVIDIA GPUs
The Triton Inference Server provides an optimized cloud
A unified framework for scalable computing
Powering Amazon custom machine learning chips
Libraries for applying sparsification recipes to neural networks
Library for serving Transformers models on Amazon SageMaker
MII makes low-latency and high-throughput inference possible
Uncover insights, surface problems, monitor, and fine-tune your LLM
Fast inference engine for Transformer models