OpenAI-style API for open large language models (a minimal usage sketch follows at the end of this list)
Run local LLMs on any device; open-source
A high-throughput and memory-efficient inference and serving engine for LLMs
Everything you need to build state-of-the-art foundation models
Ready-to-use OCR with 80+ supported languages
FlashInfer: Kernel Library for LLM Serving
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Library for serving Transformers models on Amazon SageMaker
An MLOps framework to package, deploy, monitor and manage models
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Library for OCR-related tasks powered by Deep Learning
Operating LLMs in production
Single-cell analysis in Python
Bring the notion of Model-as-a-Service to life
Unified Model Serving Framework
Training and deploying machine learning models on Amazon SageMaker
LLM training code for MosaicML foundation models
Visual Instruction Tuning: Large Language-and-Vision Assistant
Optimizing inference proxy for LLMs
Official inference library for Mistral models
Gaussian processes in TensorFlow
The easiest and laziest way to build multi-agent LLM applications
Open-source tool designed to enhance the efficiency of workloads
Data manipulation and transformation for audio signal processing
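Several of the entries above (the OpenAI-style API server and the inference/serving engines) expose an OpenAI-compatible chat-completions endpoint. As a minimal sketch, assuming a locally hosted server at http://localhost:8000/v1 and a placeholder model name (both hypothetical values, not taken from any specific project in this list), a request could be issued with the official openai Python client like so:

    from openai import OpenAI

    # Hypothetical local endpoint and model name; adjust to whatever the
    # chosen serving engine actually exposes.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

    response = client.chat.completions.create(
        model="my-local-model",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Summarize what an OpenAI-compatible API is."},
        ],
    )

    # The response object mirrors the OpenAI schema, so downstream code
    # written against the hosted API works unchanged against a local server.
    print(response.choices[0].message.content)

Because the request and response schemas match the hosted OpenAI API, switching between a local serving engine and a remote provider usually only requires changing base_url and the model name.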