Browse free open source Python LLM Inference Tools and projects below.
Run Local LLMs on Any Device. Open-source
Ready-to-use OCR with 80+ supported languages
Lightweight anchor-free object detection model
A high-throughput and memory-efficient inference and serving engine
Everything you need to build state-of-the-art foundation models
Uncover insights, surface problems, monitor, and fine tune your LLM
GPU environment management and cluster orchestration
State-of-the-art diffusion models for image and audio generation
Library for OCR-related tasks powered by Deep Learning
Implementation of model parallel autoregressive transformers on GPUs
The unofficial Python package that returns responses from Google Bard
Bring the notion of Model-as-a-Service to life
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Data manipulation and transformation for audio signal processing
PyTorch domain library for recommendation systems
A library for accelerating Transformer models on NVIDIA GPUs
The Triton Inference Server provides an optimized cloud
The official Python client for the Hugging Face Hub
A graphical manager for Ollama that can manage your LLMs
A set of Docker images for training and serving models in TensorFlow
FlashInfer: Kernel Library for LLM Serving
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
Official inference library for Mistral models
Operating LLMs in production
Trainable models and NN optimization tools