Everything you need to build state-of-the-art foundation models
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
Open-source tool to run local LLMs on any device
Build your chatbot within minutes on your favorite device
Probabilistic reasoning and statistical analysis in TensorFlow
Uncover insights, surface problems, monitor, and fine-tune your LLM
Adversarial Robustness Toolbox (ART) - Python Library for ML security
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods
A high-throughput and memory-efficient inference and serving engine
Trainable models and NN optimization tools
An MLOps framework to package, deploy, monitor, and manage models
A library to communicate with ChatGPT, Claude, Copilot, and Gemini
Powering Amazon custom machine learning chips
Simplifies the local serving of AI models from any source
State-of-the-art diffusion models for image and audio generation
FlashInfer: Kernel Library for LLM Serving
The official Python client for the Hugging Face Hub
Tensor search for humans
A set of Docker images for training and serving models in TensorFlow
GPU environment management and cluster orchestration
Optimizing inference proxy for LLMs
Sparsity-aware deep learning inference runtime for CPUs
Official inference library for Mistral models
Data manipulation and transformation for audio signal processing
A Pythonic framework to simplify AI service building