Browse free open source LLM Inference tools and projects for Linux below. Use the toggles on the left to filter open source LLM Inference tools by OS, license, language, programming language, and project status.
Port of OpenAI's Whisper model in C/C++
Port of Facebook's LLaMA model in C/C++
Run local LLMs on any device; open source
OpenVINO™ Toolkit repository
User-friendly AI Interface
Self-hosted, community-driven, local OpenAI compatible API
ONNX Runtime: cross-platform, high-performance ML inferencing
A high-throughput and memory-efficient inference and serving engine
A set of Docker images for training and serving models in TensorFlow
C++ library for high-performance inference on NVIDIA GPUs
Implementation of model parallel autoregressive transformers on GPUs
OpenMMLab Model Deployment Framework
Easy-to-use deep learning framework with 3 key features
Open standard for machine learning interoperability
An RWKV management and startup tool, fully automated, only 8MB
High-performance neural network inference framework for mobile
Connect home devices into a powerful cluster to accelerate LLM inference
The deep learning toolkit for speech-to-text
A high-performance ML model serving framework, offers dynamic batching
Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
Deep learning optimization library: makes distributed training easy
Training and deploying machine learning models on Amazon SageMaker
Large Language Model Text Generation Inference
Multilingual Automatic Speech Recognition with word-level timestamps
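Several of the tools listed above (LocalAI, vLLM, Text Generation Inference) advertise an OpenAI-compatible HTTP API, so a client can target them interchangeably. A minimal sketch of building such a request, assuming a local server at `localhost:8080` and a hypothetical model name (both are illustrative, not defaults of any one project):

```python
import json

# Assumed base URL of a locally hosted OpenAI-compatible server
# (LocalAI, vLLM, and TGI can all expose this style of endpoint).
BASE_URL = "http://localhost:8080/v1"

def build_chat_request(model, user_message, temperature=0.7):
    """Build the URL and JSON body for an OpenAI-compatible
    /chat/completions call."""
    url = f"{BASE_URL}/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }
    return url, json.dumps(body)

# Example: the model name here is hypothetical.
url, payload = build_chat_request("llama-3.2-1b", "Hello!")
print(url)
```

Because the request shape is shared, swapping one backend for another usually only requires changing the base URL and model name.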