ONNX Runtime: cross-platform, high-performance ML inferencing
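A minimal sketch of one inference pass with the ONNX Runtime Python API; the file name `model.onnx` and the input shape are assumptions for illustration:

```python
import numpy as np
import onnxruntime as ort

# Load a model and run it once; "model.onnx" is a placeholder file.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape
outputs = session.run(None, {input_name: x})  # None means "return all outputs"
print(outputs[0].shape)
```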
Protect and discover secrets using Gitleaks
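Gitleaks is a CLI tool; a hedged sketch of driving it from Python with `subprocess`, using v8-style flags:

```python
import subprocess

# Scan the current repo and write a JSON report; by default gitleaks
# exits non-zero when it finds leaks, so the return code is the verdict.
result = subprocess.run(
    ["gitleaks", "detect", "--source", ".",
     "--report-format", "json", "--report-path", "gitleaks-report.json"],
    capture_output=True, text=True,
)
print("leaks found" if result.returncode != 0 else "clean")
```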
C++ library for high-performance inference on NVIDIA GPUs
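TensorRT also ships Python bindings; a sketch of building a serialized engine from an ONNX file, where `model.onnx` is an assumption and the exact builder flags vary across TensorRT versions:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# EXPLICIT_BATCH is required in TensorRT 8.x-style network creation.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))
config = builder.create_builder_config()
engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)  # reload later with a Runtime for inference
```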
Run local LLMs on any device; open-source
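A sketch using the GPT4All Python bindings; the model file name is an assumption (it is downloaded on first use):

```python
from gpt4all import GPT4All

# The model name is a placeholder; any GGUF model GPT4All knows works here.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
with model.chat_session():
    reply = model.generate("Name three uses of local LLMs.", max_tokens=128)
    print(reply)
```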
High-performance neural network inference framework for mobile
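A hedged sketch with ncnn's Python bindings; the `.param`/`.bin` files and the `in0`/`out0` blob names are assumptions about a converted model:

```python
import numpy as np
import ncnn

net = ncnn.Net()
net.load_param("model.ncnn.param")  # placeholder converted model
net.load_model("model.ncnn.bin")

ex = net.create_extractor()
x = np.random.rand(3, 224, 224).astype(np.float32)  # assumed input shape
ex.input("in0", ncnn.Mat(x))
ret, out = ex.extract("out0")
print(np.array(out).shape)
```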
Official inference library for Mistral models
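A sketch following the pattern in the mistral-inference README; the local weights folder and tokenizer file name are assumptions:

```python
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_inference.generate import generate
from mistral_inference.transformer import Transformer

# "mistral-7B-v3/" is a placeholder for a downloaded weights directory.
tokenizer = MistralTokenizer.from_file("mistral-7B-v3/tokenizer.model.v3")
model = Transformer.from_folder("mistral-7B-v3")

request = ChatCompletionRequest(messages=[UserMessage(content="Hello!")])
tokens = tokenizer.encode_chat_completion(request).tokens
out_tokens, _ = generate(
    [tokens], model, max_tokens=64, temperature=0.0,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))
```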
Unified Model Serving Framework
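A minimal BentoML service in the 1.2+ decorator style; the model logic is a placeholder:

```python
import bentoml

@bentoml.service
class Echo:
    @bentoml.api
    def predict(self, text: str) -> str:
        # swap in real model inference here
        return text.upper()

# Serve locally with: bentoml serve service:Echo
```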
MNN is a blazing fast, lightweight deep learning framework
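A hedged sketch of MNN's Python session API, mirroring the pattern in its docs; `model.mnn` and all tensor shapes are assumptions:

```python
import numpy as np
import MNN

interpreter = MNN.Interpreter("model.mnn")  # placeholder converted model
session = interpreter.createSession()
input_tensor = interpreter.getSessionInput(session)

x = np.random.rand(1, 3, 224, 224).astype(np.float32)
tmp_in = MNN.Tensor((1, 3, 224, 224), MNN.Halide_Type_Float,
                    x, MNN.Tensor_DimensionType_Caffe)
input_tensor.copyFrom(tmp_in)
interpreter.runSession(session)

output_tensor = interpreter.getSessionOutput(session)
tmp_out = MNN.Tensor((1, 1000), MNN.Halide_Type_Float,  # assumed output shape
                     np.zeros((1, 1000), dtype=np.float32),
                     MNN.Tensor_DimensionType_Caffe)
output_tensor.copyToHostTensor(tmp_out)
print(tmp_out.getData()[:5])
```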
Easy-to-use deep learning framework with 3 key features
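A tiny MegEngine sketch showing its autodifferentiation, differentiating y = w·x with `GradManager`:

```python
import megengine as mge
import megengine.functional as F
from megengine.autodiff import GradManager

w = mge.Parameter([2.0])
x = mge.Tensor([3.0])

gm = GradManager().attach([w])
with gm:
    y = F.mul(w, x)
    gm.backward(y)
print(w.grad)  # dy/dw == x == 3.0
```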
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
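A hedged sketch of AIMET's PyTorch quantization simulation; the toy model and calibration callback are placeholders, and exact signatures differ between AIMET releases:

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel

model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU()).eval()
dummy_input = torch.randn(1, 8)

sim = QuantizationSimModel(model, dummy_input=dummy_input)

def calibrate(sim_model, _):
    # run representative data through the model to collect ranges
    with torch.no_grad():
        sim_model(torch.randn(32, 8))

sim.compute_encodings(calibrate, forward_pass_callback_args=None)
quantized_out = sim.model(dummy_input)  # simulated-quantization forward pass
```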
A GPU-accelerated library containing highly optimized building blocks for data processing in deep learning training and inference
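A sketch of a DALI image pipeline; the `data/` directory of images is an assumption:

```python
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn

@pipeline_def
def image_pipeline():
    jpegs, labels = fn.readers.file(file_root="data/")  # placeholder dataset
    images = fn.decoders.image(jpegs, device="mixed")   # decode on the GPU
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images, labels

pipe = image_pipeline(batch_size=32, num_threads=4, device_id=0)
pipe.build()
images, labels = pipe.run()  # one batch, ready for a training step
```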
A set of Docker images for training and serving models in TensorFlow
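These images are usually looked up rather than hand-built; a sketch resolving a container URI with the SageMaker Python SDK, where the framework, version, and region values are assumptions:

```python
from sagemaker import image_uris

uri = image_uris.retrieve(
    framework="tensorflow",
    region="us-east-1",
    version="2.13",
    image_scope="inference",
    instance_type="ml.m5.xlarge",
)
print(uri)  # ECR path of the matching Deep Learning Container
```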
Powering Amazon's custom machine learning chips
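A hedged sketch of compiling a toy module for NeuronCores with torch-neuronx, which assumes you are on an AWS Inf2/Trn1 instance with the Neuron SDK installed:

```python
import torch
import torch_neuronx

model = torch.nn.Sequential(torch.nn.Linear(4, 2)).eval()  # placeholder model
example = torch.rand(1, 4)

model_neuron = torch_neuronx.trace(model, example)  # ahead-of-time compile
torch.jit.save(model_neuron, "model_neuron.pt")
print(model_neuron(example))
```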
Library for OCR-related tasks powered by Deep Learning
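A sketch with docTR's high-level API; `sample.pdf` is an assumption:

```python
from doctr.io import DocumentFile
from doctr.models import ocr_predictor

model = ocr_predictor(pretrained=True)      # detection + recognition models
doc = DocumentFile.from_pdf("sample.pdf")   # placeholder document
result = model(doc)
print(result.render())                      # plain-text export of the OCR
```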
Neural Network Compression Framework for enhanced OpenVINO inference
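A hedged sketch of NNCF post-training quantization; the toy model and calibration data are placeholders:

```python
import torch
import nncf

model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU()).eval()
# A small set of representative inputs stands in for real calibration data.
calibration_data = [torch.randn(1, 8) for _ in range(100)]

quantized_model = nncf.quantize(model, nncf.Dataset(calibration_data))
```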
An MLOps framework to package, deploy, monitor and manage models
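A sketch of the class-based contract used by Seldon's Python model wrapper; the identity `predict` here is a placeholder for real scoring logic:

```python
# Model.py
class Model:
    def __init__(self):
        # load weights/artifacts once at startup
        self.ready = True

    def predict(self, X, features_names=None):
        # X arrives as a numpy array built from the Seldon request payload
        return X

# Wrapped and exposed with Seldon's runtime, e.g.:
#   seldon-core-microservice Model --service-type MODEL
```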
Set of comprehensive computer vision & machine intelligence libraries
A general-purpose probabilistic programming system
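Stan models are typically driven from a host language; a sketch using CmdStanPy with the classic Bernoulli example (the `.stan` file name and data are assumptions):

```python
from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file="bernoulli.stan")  # compiles on first use
fit = model.sample(data={"N": 10, "y": [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]})
print(fit.summary())  # posterior summaries for each parameter
```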
Library for serving Transformers models on Amazon SageMaker
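A hedged sketch of deploying a Hub model through the SageMaker SDK's Hugging Face integration; the IAM role, framework versions, and instance type are assumptions for illustration:

```python
from sagemaker.huggingface import HuggingFaceModel

model = HuggingFaceModel(
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    env={"HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",
         "HF_TASK": "text-classification"},
    transformers_version="4.37",
    pytorch_version="2.1",
    py_version="py310",
)
predictor = model.deploy(initial_instance_count=1,
                         instance_type="ml.m5.xlarge")
print(predictor.predict({"inputs": "I love this library!"}))
```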
Superduper: Integrate AI models and machine learning workflows with your database
The Triton Inference Server provides an optimized cloud and edge inferencing solution
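A sketch with Triton's HTTP client; the model name, input/output names, and shape are assumptions about whatever model the server hosts:

```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

x = np.random.rand(1, 3, 224, 224).astype(np.float32)
inputs = [httpclient.InferInput("input", list(x.shape), "FP32")]
inputs[0].set_data_from_numpy(x)

response = client.infer(model_name="resnet", inputs=inputs)
print(response.as_numpy("output").shape)  # output tensor by name
```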
Standardized Serverless ML Inference Platform on Kubernetes
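A hedged sketch of a KServe custom predictor using its Python SDK; the echo logic is a placeholder:

```python
from kserve import Model, ModelServer

class EchoModel(Model):
    def __init__(self, name: str):
        super().__init__(name)
        self.ready = True  # signals the server that the model is loaded

    def predict(self, payload, headers=None):
        instances = payload["instances"]
        return {"predictions": instances}  # swap in real scoring here

if __name__ == "__main__":
    ModelServer().start([EchoModel("echo-model")])
```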
A unified framework for scalable computing
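A minimal Ray sketch: fan simple tasks out across the cluster and gather the results:

```python
import ray

ray.init()  # starts a local Ray runtime if no cluster address is given

@ray.remote
def square(x):
    return x * x

futures = [square.remote(i) for i in range(8)]  # scheduled in parallel
print(ray.get(futures))  # [0, 1, 4, 9, 16, 25, 36, 49]
```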
Build production-ready agentic workflows with natural language
LLM.swift is a simple and readable library for interacting with large language models locally on Apple platforms