C++ library for high-performance inference on NVIDIA GPUs
High-performance neural network inference framework for mobile
Lightweight inference library for ONNX files, written in C++
Fast inference engine for Transformer models
A general-purpose probabilistic programming system
An innovative library for efficient LLM inference
A GPU-accelerated library containing highly optimized building blocks
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Bolt is a high-performance deep learning library
A library for accelerating Transformer models on NVIDIA GPUs
Lightweight anchor-free object detection model
The deep learning toolkit for speech-to-text
Deep learning inference framework optimized for mobile platforms