High-performance neural network inference framework for mobile
C++ library for high-performance inference on NVIDIA GPUs
Protect and discover secrets using Gitleaks
ONNX Runtime: cross-platform, high-performance ML inferencing
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Official inference library for Mistral models
A general-purpose probabilistic programming system
Library for serving Transformers models on Amazon SageMaker
Standardized Serverless ML Inference Platform on Kubernetes
Powering Amazon custom machine learning chips
Neural Network Compression Framework for enhanced OpenVINO inference
MNN is a blazing-fast, lightweight deep learning framework
Superduper: Integrate AI models and machine learning workflows
A comprehensive set of computer vision & machine intelligence libraries
Easy-to-use deep learning framework with 3 key features
A set of Docker images for training and serving models in TensorFlow
A unified framework for scalable computing
Port of Facebook's LLaMA model in C/C++
A GPU-accelerated library containing highly optimized building blocks
An MLOps framework to package, deploy, monitor and manage models
LLM.swift is a simple and readable library for interacting with large language models locally
Unified Model Serving Framework
Run Local LLMs on Any Device. Open-source and available for commercial use
Build Production-ready Agentic Workflows with Natural Language
The free, Open Source alternative to OpenAI, Claude and others