ONNX Runtime: cross-platform, high performance ML inferencing
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
High-performance neural network inference framework for mobile
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
An Open-Source Programming Framework for Agentic AI
A unified framework for scalable computing
Multilingual Automatic Speech Recognition with word-level timestamps
OpenVINO™ Toolkit repository
PyTorch domain library for recommendation systems
Lightweight Python library for adding real-time multi-object tracking
Build Production-ready Agentic Workflows with Natural Language
A GPU-accelerated library containing highly optimized building blocks
Serve, optimize and scale PyTorch models in production
Large Language Model Text Generation Inference
The easiest and laziest way to build multi-agent LLM applications
Bring the notion of Model-as-a-Service to life
Private OpenAI on Kubernetes
Phi-3.5 for Mac: Locally-run Vision and Language Models
Deep Learning API and Server in C++14, with support for Caffe and PyTorch
MII makes low-latency and high-throughput inference possible
Open platform for training, serving, and evaluating language models
High quality, fast, modular reference implementation of SSD in PyTorch
A computer vision framework to create and deploy apps in minutes
Self-contained Machine Learning and Natural Language Processing library
Serve machine learning models within a Docker container