User-friendly AI Interface
OpenAI-style API for open large language models
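The entry above describes an OpenAI-compatible API for open models. As a minimal sketch, such servers accept the standard chat-completions request body; the model name and field values below are placeholders, not taken from any particular project:

```python
import json

# Minimal OpenAI-style chat-completion request body (a sketch; the model
# name "local-model" and the sampling settings are placeholder assumptions).
payload = {
    "model": "local-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
}

# Servers expecting this format receive it as a JSON POST body.
body = json.dumps(payload)
```

A client would POST `body` to the server's `/v1/chat/completions` endpoint, the path used by the OpenAI API that compatible servers typically mirror.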
Port of OpenAI's Whisper model in C/C++
Run local LLMs on any device. Open source
Port of Facebook's LLaMA model in C/C++
OpenVINO™ Toolkit repository
ONNX Runtime: cross-platform, high performance ML inferencing
The free, Open Source alternative to OpenAI, Claude and others
A high-throughput and memory-efficient inference and serving engine for LLMs
Protect and discover secrets using Gitleaks
High-performance neural network inference framework for mobile
Open standard for machine learning interoperability
Open-source AI camera: empower any camera/CCTV with state-of-the-art AI
C++ library for high performance inference on NVIDIA GPUs
LLM.swift is a simple and readable library
FlashInfer: Kernel Library for LLM Serving
Everything you need to build state-of-the-art foundation models
Connect home devices into a powerful cluster to accelerate LLM inference
Bayesian inference with probabilistic programming
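The entry above concerns Bayesian inference via probabilistic programming. As a toy illustration of the kind of posterior update such libraries automate, here is a conjugate Beta-Binomial computation in plain Python (the function name and prior choice are assumptions for the sketch, not any library's API):

```python
# Toy Bayesian update via Beta-Binomial conjugacy: with a Beta(alpha, beta)
# prior on a coin's bias, observing k successes and n-k failures yields a
# Beta(alpha + k, beta + n - k) posterior in closed form.
def posterior_beta(alpha, beta, successes, failures):
    """Return the posterior Beta parameters after binomial observations."""
    return alpha + successes, beta + failures

# Uniform Beta(1, 1) prior; observe 7 heads and 3 tails.
a, b = posterior_beta(1, 1, 7, 3)   # -> Beta(8, 4)
mean = a / (a + b)                  # posterior mean of the bias
```

Probabilistic-programming frameworks generalize this idea to models without closed-form posteriors, typically via MCMC or variational inference.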
C#/.NET bindings for llama.cpp, including LLaMA/GPT model inference
The official Python client for the Hugging Face Hub
Operating LLMs in production
GPU environment management and cluster orchestration
An MLOps framework to package, deploy, monitor and manage models
Pure C++ implementation of several models for real-time chatting