Browse free open source Python LLM Inference Tools and projects below.
PyTorch library of curated Transformer models and their components
Sparsity-aware deep learning inference runtime for CPUs
MII makes low-latency and high-throughput inference possible
DoWhy is a Python library for causal inference
Database system for building simpler and faster AI-powered applications
Open platform for training, serving, and evaluating language models
Gaussian processes in TensorFlow
GPU environment management and cluster orchestration
Low-latency REST API for serving text-embeddings
Build your chatbot within minutes on your favorite device
LLMFlows - Simple, Explicit and Transparent LLM Apps
Toolbox of models, callbacks, and datasets for AI/ML researchers
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
OpenMMLab Video Perception Toolbox
Framework for Accelerating LLM Generation with Multiple Decoding Heads
A high-performance ML model serving framework that offers dynamic batching
Neural Network Compression Framework for enhanced OpenVINO
Lightweight Python library for adding real-time multi-object tracking
OpenFieldAI is an AI-based Open Field Test Rodent Tracker
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Operating LLMs in production
State-of-the-art Parameter-Efficient Fine-Tuning
Create HTML profiling reports from pandas DataFrame objects
Run 100B+ language models at home, BitTorrent-style