Training and deploying machine learning models on Amazon SageMaker
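A minimal sketch of the usual train-then-deploy flow with the SageMaker Python SDK; the training script, role ARN, and S3 paths are placeholders for your own account:

```python
# Minimal sketch: launch a training job, then deploy it behind an endpoint.
import sagemaker
from sagemaker.pytorch import PyTorch

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # hypothetical role ARN

estimator = PyTorch(
    entry_point="train.py",        # your training script (placeholder)
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    framework_version="2.1",
    py_version="py310",
)
estimator.fit({"training": "s3://my-bucket/train-data"})  # hypothetical S3 path

# Serve the trained model on a real-time endpoint.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```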
Run local LLMs on any device; open-source

A high-throughput and memory-efficient inference and serving engine
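For context, offline batch generation with vLLM takes only a few lines; the model id here is just an example:

```python
# Minimal sketch: offline batch generation with vLLM's Python API.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # any Hugging Face model id works here
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```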
Everything you need to build state-of-the-art foundation models
Standardized Serverless ML Inference Platform on Kubernetes
Optimizing inference proxy for LLMs
Framework dedicated to making neural data processing pipelines simple and fast
Create HTML profiling reports from pandas DataFrame objects
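A minimal sketch; the package now ships as ydata-profiling (older releases used the pandas_profiling import), and the DataFrame is a toy stand-in:

```python
# Minimal sketch: one-call HTML profile of a pandas DataFrame.
import pandas as pd
from ydata_profiling import ProfileReport

df = pd.DataFrame({"age": [25, 32, 47], "income": [40_000, 55_000, 90_000]})
profile = ProfileReport(df, title="Demo Profile")
profile.to_file("report.html")  # writes a self-contained HTML report
```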
A set of Docker images for training and serving models in TensorFlow
Operating LLMs in production
Single-cell analysis in Python
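A sketch of a typical Scanpy clustering pass on its bundled PBMC example dataset:

```python
# Minimal sketch: standard single-cell preprocessing, embedding, clustering.
import scanpy as sc

adata = sc.datasets.pbmc3k()              # example dataset (downloads on first use)
sc.pp.filter_cells(adata, min_genes=200)  # basic quality filters
sc.pp.filter_genes(adata, min_cells=3)
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
sc.pp.pca(adata)
sc.pp.neighbors(adata)                    # kNN graph for UMAP/Leiden
sc.tl.umap(adata)
sc.tl.leiden(adata)                       # graph clustering (needs the leidenalg extra)
sc.pl.umap(adata, color="leiden")
```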
Sparsity-aware deep learning inference runtime for CPUs
MII makes low-latency and high-throughput inference possible
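A minimal sketch using MII's non-persistent pipeline API; the model id is a placeholder:

```python
# Minimal sketch: load a model and generate with a DeepSpeed-MII pipeline.
import mii

pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")  # placeholder model id
responses = pipe(["DeepSpeed is"], max_new_tokens=64)
print(responses[0].generated_text)
```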
OpenMMLab Model Deployment Framework
Deep learning optimization library that makes distributed training easy
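A minimal sketch of wrapping a PyTorch model in a DeepSpeed engine; the model and config values are illustrative, and a real run goes through the `deepspeed` launcher rather than plain `python`:

```python
# Minimal sketch: DeepSpeed engine around a toy model (launch with `deepspeed train.py`).
import torch
import deepspeed

model = torch.nn.Linear(1024, 10)           # toy model for illustration
ds_config = {
    "train_batch_size": 32,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    "fp16": {"enabled": True},
}

model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(32, 1024).to(model_engine.device).half()
loss = model_engine(x).float().mean()
model_engine.backward(loss)                 # engine handles scaling/allreduce
model_engine.step()
```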
DoWhy is a Python library for causal inference
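DoWhy's four-step workflow (model, identify, estimate, refute) in brief, on its built-in synthetic dataset:

```python
# Minimal sketch: DoWhy's model -> identify -> estimate -> refute pipeline.
from dowhy import CausalModel
import dowhy.datasets

data = dowhy.datasets.linear_dataset(
    beta=10, num_common_causes=3, num_samples=1000, treatment_is_binary=True
)
model = CausalModel(
    data=data["df"],
    treatment=data["treatment_name"],
    outcome=data["outcome_name"],
    graph=data["gml_graph"],
)
estimand = model.identify_effect()
estimate = model.estimate_effect(estimand, method_name="backdoor.propensity_score_matching")
print(estimate.value)  # should land near the true effect (10)

# Sanity-check the estimate with a placebo refuter.
refute = model.refute_estimate(estimand, estimate, method_name="placebo_treatment_refuter")
print(refute)
```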
FlashInfer: Kernel Library for LLM Serving
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
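A sketch of per-request adapter routing against such a server, assuming a TGI-style `/generate` endpoint; the URL and adapter id are placeholders:

```python
# Minimal sketch: pick a fine-tuned LoRA adapter per request over HTTP.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/generate",  # placeholder server address
    json={
        "inputs": "[INST] What is the capital of France? [/INST]",
        "parameters": {
            "max_new_tokens": 64,
            "adapter_id": "my-org/my-lora-adapter",  # hypothetical adapter id
        },
    },
)
print(resp.json()["generated_text"])
```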
An easy-to-use LLM quantization package with user-friendly APIs
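A minimal quantization sketch in the AutoGPTQ style; the model id and one-sentence calibration example are placeholders (a real run needs a representative calibration corpus):

```python
# Minimal sketch: 4-bit GPTQ quantization of a small causal LM.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "facebook/opt-125m"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_id)

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# Placeholder calibration data; use a representative sample set in practice.
examples = [tokenizer("auto-gptq is an easy-to-use quantization package.", return_tensors="pt")]
model.quantize(examples)
model.save_quantized("opt-125m-4bit")
```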
A Pythonic framework to simplify AI service building
20+ high-performance LLMs with recipes to pretrain and finetune at scale
Easiest and laziest way to build multi-agent LLM applications
Official inference library for Mistral models
Data manipulation and transformation for audio signal processing
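A short torchaudio sketch (load, resample, extract mel features); the file path is a placeholder:

```python
# Minimal sketch: load a clip, resample to 16 kHz, compute a mel spectrogram.
import torchaudio
import torchaudio.transforms as T

waveform, sample_rate = torchaudio.load("speech.wav")  # placeholder path

resample = T.Resample(orig_freq=sample_rate, new_freq=16_000)
waveform_16k = resample(waveform)

mel = T.MelSpectrogram(sample_rate=16_000, n_mels=80)
features = mel(waveform_16k)  # shape: (channels, n_mels, time)
print(features.shape)
```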
Visual Instruction Tuning: Large Language-and-Vision Assistant