Browse free open source Python LLM inference tools and projects below. Use the toggles on the left to filter them by OS, license, programming language, and project status.
Sparsity-aware deep learning inference runtime for CPUs
MII makes low-latency and high-throughput inference possible
Database system for building simpler and faster AI-powered applications
Open platform for training, serving, and evaluating language models
FlashInfer: Kernel Library for LLM Serving
OpenMMLab Video Perception Toolbox
Official inference library for Mistral models
Bring the notion of Model-as-a-Service to life
Lightweight Python library for adding real-time multi-object tracking
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2
Trainable models and neural network optimization tools
Everything you need to build state-of-the-art foundation models
State-of-the-art Parameter-Efficient Fine-Tuning
Easy-to-use Speech Toolkit including Self-Supervised Learning models
High-quality, fast, modular reference implementation of SSD in PyTorch
Library for serving Transformers models on Amazon SageMaker
Training and deploying machine learning models on Amazon SageMaker
Large Language Model Text Generation Inference
The Triton Inference Server provides an optimized cloud and edge inferencing solution
A graphical interface for managing your LLMs with ollama
Training & Implementation of chatbots leveraging GPT-like architecture
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
OpenAI-style API for open large language models (see the sketch after this list)
Powering Amazon's custom machine learning chips
A Unified Library for Parameter-Efficient Learning
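Several of the servers above, such as the OpenAI-style API project, expose an OpenAI-compatible chat-completions endpoint, so they can be queried with the standard openai Python client. The following is a minimal sketch of that pattern; the base URL, API key, and model name are placeholder assumptions, not values documented by any specific project listed here.

# Minimal sketch: query an OpenAI-compatible chat-completions endpoint.
# base_url, api_key, and model are hypothetical placeholders for whatever
# local inference server you are running.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local inference server
    api_key="not-needed-locally",         # many local servers ignore the key
)

response = client.chat.completions.create(
    model="my-local-model",               # placeholder model identifier
    messages=[{"role": "user", "content": "What does an inference server do?"}],
    max_tokens=128,
)

# Print the generated reply text.
print(response.choices[0].message.content)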