ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
LiteRT is the new name for TensorFlow Lite (TFLite)
MLX: An array framework for Apple silicon
A retargetable MLIR-based machine learning compiler and runtime toolkit
Port of OpenAI's Whisper model in C/C++
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Port of Facebook's LLaMA model in C/C++
OpenVINO™ Toolkit repository
Clean and efficient FP8 GEMM kernels with fine-grained scaling
On-device AI across mobile, embedded and edge for PyTorch
Emscripten: An LLVM-to-WebAssembly Compiler
Fast inference engine for Transformer models
Open source codebase for Scale Agentex
5ire is a cross-platform desktop AI assistant, MCP client
C++ library for high-performance inference on NVIDIA GPUs
oneAPI Deep Neural Network Library (oneDNN)
An open-source AI framework for developers and entrepreneurs
OneFlow is a deep learning framework designed to be user-friendly
Set of comprehensive computer vision & machine intelligence libraries
Runtime extension of Proximus enabling deployment on AMD Ryzen™ AI
Plan execution language and executive
A High Performance Library for Sequence Processing and Generation
UME is an in-app debug kits platform for Flutter
Deep learning inference framework optimized for mobile platforms