Port of OpenAI's Whisper model in C/C++
Run local LLMs on any device; open source
Port of Facebook's LLaMA model in C/C++
User-friendly AI Interface
ONNX Runtime: cross-platform, high-performance ML inferencing
A high-throughput and memory-efficient inference and serving engine
The free, open-source alternative to OpenAI, Claude, and others
High-performance neural network inference framework for mobile
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
Everything you need to build state-of-the-art foundation models
OpenVINO™ Toolkit repository
Protect and discover secrets using Gitleaks
Bayesian inference with probabilistic programming
A library for accelerating Transformer models on NVIDIA GPUs
Optimizing inference proxy for LLMs
Large Language Model Text Generation Inference
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Open-source AI camera that empowers any camera/CCTV
Connect home devices into a powerful cluster to accelerate LLM inference
LLM.swift is a simple and readable library for interacting with LLMs locally
Neural Network Compression Framework for enhanced OpenVINO
LLMs as Copilots for Theorem Proving in Lean
Simplifies the local serving of AI models from any source
AIMET is a library that provides advanced quantization and compression techniques
An MLOps framework to package, deploy, monitor, and manage models