A high-throughput and memory-efficient inference and serving engine
Official inference library for Mistral models
Everything you need to build state-of-the-art foundation models
Unified Model Serving Framework
Ready-to-use OCR with 80+ supported languages
OpenVINO™ Toolkit repository
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
Easiest and laziest way to build multi-agent LLM applications
Training and deploying machine learning models on Amazon SageMaker
C++ library for high performance inference on NVIDIA GPUs
A Pythonic framework to simplify AI service building
The official Python client for the Hugging Face Hub
Open standard for machine learning interoperability
Bring the notion of Model-as-a-Service to life
PArallel Distributed Deep LEarning: Machine Learning Framework
State-of-the-art diffusion models for image and audio generation
Large Language Model Text Generation Inference
A set of Docker images for training and serving models in TensorFlow
The AI-native (edge and LLM) proxy for agents
Operating LLMs in production
Visual Instruction Tuning: Large Language-and-Vision Assistant
Libraries for applying sparsification recipes to neural networks
Gaussian processes in TensorFlow
A GPU-accelerated library containing highly optimized building blocks
Easy-to-use deep learning framework with 3 key features