Port of Facebook's LLaMA model in C/C++
User-friendly AI Interface
Official inference library for Mistral models
A set of Docker images for training and serving models in TensorFlow
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
CPU/GPU inference server for Hugging Face transformer models
Deploy an ML inference service on a budget in 10 lines of code