ONNX Runtime: cross-platform, high-performance ML inferencing
Rust async runtime based on io_uring
ByteHook is an Android PLT hook library
TTS with Kokoro and ONNX Runtime
Deep learning at the speed of light
LiteRT is the new name for TensorFlow Lite (TFLite)
A retargetable MLIR-based machine learning compiler runtime toolkit
Tools such as a web browser, computer access, and a code runner for LLMs
Port of OpenAI's Whisper model in C/C++
A self-hostable CDN for databases
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Open source solution that can meet the requirements of workloads
MLX: An array framework for Apple silicon
Port of Facebook's LLaMA model in C/C++
OpenVINO™ Toolkit repository
Open source codebase for Scale Agentex
SGLang is a fast serving framework for large language models
Multi-Agent daTa geneRation Infra and eXperimentation framework
Implementation of "MobileCLIP" CVPR 2024
Clean and efficient FP8 GEMM kernels with fine-grained scaling
Android inline hook library that supports Thumb, arm32, and arm64
NVIDIA Federated Learning Application Runtime Environment
Run Stable Diffusion on Mac natively
5ire is a cross-platform desktop AI assistant and MCP client
Deploy and share agents with open infrastructure