Port of OpenAI's Whisper model in C/C++
High-performance neural network inference framework for mobile platforms
Fast inference engine for Transformer models
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
A library for accelerating Transformer models on NVIDIA GPUs
Bolt is a high-performance deep learning library
Easy-to-use deep learning framework with 3 key features
Lightweight inference library for ONNX files, written in C++