Browse free open source Python LLM Inference Tools and projects below.
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
Open platform for training, serving, and evaluating language models
FlashInfer: Kernel Library for LLM Serving
Implementation of model parallel autoregressive transformers on GPUs
Gaussian processes in TensorFlow
GPU environment management and cluster orchestration
CPU/GPU inference server for Hugging Face transformer models
LLMFlows - Simple, Explicit and Transparent LLM Apps
Easiest and laziest way to build multi-agent LLM applications
Toolbox of models, callbacks, and datasets for AI/ML researchers
20+ high-performance LLMs with recipes to pretrain and finetune at scale
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
OpenMMLab Model Deployment Framework
OpenMMLab Video Perception Toolbox
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Official inference library for Mistral models
Neural Network Compression Framework for enhanced OpenVINO
OpenFieldAI is an AI-based Open Field Test Rodent Tracker
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Trainable models and NN optimization tools
State-of-the-art Parameter-Efficient Fine-Tuning
Create HTML profiling reports from pandas DataFrame objects
Run 100B+ language models at home, BitTorrent-style
Phi-3.5 for Mac: Locally-run Vision and Language Models