Port of Facebook's LLaMA model in C/C++
Run Local LLMs on Any Device. Open-source
Library for OCR-related tasks powered by Deep Learning
A high-throughput and memory-efficient inference and serving engine
Optimizing inference proxy for LLMs
User-friendly AI Interface
The AI-native (edge and LLM) proxy for agents
The free, Open Source alternative to OpenAI, Claude and others
C#/.NET binding of llama.cpp, including LLaMa/GPT model inference
Easiest and laziest way to build multi-agent LLM applications
LLMs as Copilots for Theorem Proving in Lean
Framework that allows you to transform your Vector Database
LLM.swift is a simple and readable library
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
Large Language Model Text Generation Inference
Lightweight, standalone C++ inference engine for Google's Gemma models
Easy-to-use Speech Toolkit including Self-Supervised Learning model
20+ high-performance LLMs with recipes to pretrain and finetune at scale
Run local LLMs like llama, deepseek, kokoro, etc. inside your browser
An MLOps framework to package, deploy, monitor and manage models
Simplifies the local serving of AI models from any source
Serving system for machine learning models
AICI: Prompts as (Wasm) Programs
A library to communicate with ChatGPT, Claude, Copilot, Gemini