Inference of Llama 2 in one file of pure C
Port of Facebook's LLaMA model in C/C++
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
The media player for language learning, with dual subtitles
Run models like Kimi-K2.5, GLM-5, DeepSeek, gpt-oss, Gemma, Qwen, etc.
The easiest way to use Ollama in .NET
AI-powered bridge connecting LLMs and advanced AI agents
Integrate cutting-edge LLM technology quickly and easily into your app
Run Local LLMs on Any Device. Open-source
TT-NN operator library, and TT-Metalium low level kernel programming
Distribute and run LLMs with a single file
Emscripten: An LLVM-to-WebAssembly Compiler
Research project: a memory solution for users, teams, and applications
Next-gen AI+IoT framework for T2/T3/T5AI/ESP32 and more
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Fast Multimodal LLM on Mobile Devices
Mooncake is the serving platform for Kimi
LLM training in simple, raw C/CUDA
Alibaba's high-performance LLM inference engine for diverse apps
Run a 1-billion parameter LLM on a $10 board with 256MB RAM
Run PyTorch LLMs locally on servers, desktop and mobile
An Easy-to-Use and High-Performance AI Deployment Framework
High-speed Large Language Model Serving for Local Deployment
Vector database plugin for Postgres, written in Rust