Build your own AI friend
Run Local LLMs on Any Device. Open-source
ONNX Runtime: cross-platform, high performance ML inferencing
Fast LLM speculative inference server for consumer hardware
Port of OpenAI's Whisper model in C/C++
TT-NN operator library, and TT-Metalium low-level kernel programming
Fast Multimodal LLM on Mobile Devices
LiteRT, successor to TensorFlow Lite
An Easy-to-Use and High-Performance AI Deployment Framework
LiteRT-LM is Google's production-ready inference framework
Official inference framework for 1-bit LLMs
Tensor library for machine learning
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
QVAC Fabric: cross-platform LLM inference and fine-tuning
Set of comprehensive computer vision & machine intelligence libraries
LLM inference in C/C++
ArrayFire, a general purpose GPU library
HeavyDB (formerly MapD/OmniSciDB)
Bolt is a deep learning library with high performance
Speech Note Linux app. Note taking, reading and translating
Low-latency AI inference engine optimized for mobile devices
High-speed Large Language Model Serving for Local Deployment
Open standard for machine learning interoperability
OpenVINO™ Toolkit repository
On-device AI across mobile, embedded and edge for PyTorch