Port of Facebook's LLaMA model in C/C++
Fast inference engine for Transformer models
Easy-to-use deep learning framework with 3 key features
Lightweight, standalone C++ inference engine for Google's Gemma models
MNN is a blazing fast, lightweight deep learning framework
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
Set of comprehensive computer vision & machine intelligence libraries
The deep learning toolkit for speech-to-text
Fast and user-friendly runtime for transformer inference