A selection of technology stacks and tool repositories:
Port of Facebook's LLaMA model in C/C++
Run Local LLMs on Any Device. Open-source
Distribute and run LLMs with a single file
LLM inference in C/C++
Emscripten: An LLVM-to-WebAssembly Compiler
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Fast Multimodal LLM on Mobile Devices
TT-NN operator library, and TT-Metalium low-level kernel programming
GLM-4 series: Open Multilingual Multimodal Chat LMs
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Open-source large language model family from Tencent Hunyuan
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Beautiful, AI-native markdown editor and LLM Wiki
Ling is a MoE LLM provided and open-sourced by InclusionAI
Flutter-based cross-platform app integrating major AI models
270+ Claude Code plugins with 739 agent skills
A @ClickHouse fork that supports high-performance vector search
Alibaba's high-performance LLM inference engine for diverse apps
UCCL is an efficient communication library for GPUs
An Easy-to-Use and High-Performance AI Deployment Framework
High-speed Large Language Model Serving for Local Deployment
Production-ready toolkit to run AI locally
A series of math-specific large language models of our Qwen2 series
Mooncake is the serving platform for Kimi