C++ library for high-performance inference on NVIDIA GPUs
Awesome multilingual OCR toolkits based on PaddlePaddle
Port of OpenAI's Whisper model in C/C++
A lightweight, lightning-fast, in-process vector database
A ClickHouse fork that supports high-performance vector search
High-performance library for gradient boosting on decision trees
Low-latency machine code generation
High-performance neural network inference framework for mobile
TensorFlow is an open source library for machine learning
A scalable inference server for models optimized with OpenVINO
Fast, Sharp & Reliable Agentic Intelligence
Connect home devices into a powerful cluster to accelerate LLM
CUDA Templates for Linear Algebra Subroutines
Alibaba's high-performance LLM inference engine for diverse apps
High-speed Large Language Model Serving for Local Deployment
ONNX Runtime: cross-platform, high-performance ML inferencing
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Mooncake is the serving platform for Kimi
On-device AI across mobile, embedded and edge for PyTorch
C++-based high-performance parallel environment execution engine
Bolt is a deep learning library with high performance
LLM inference in C/C++
Clean and efficient FP8 GEMM kernels with fine-grained scaling
LiteRT, successor to TensorFlow Lite
OpenMLDB is an open-source machine learning database