Inference of Llama 2 in one file of pure C
Distribute and run LLMs with a single file
Port of Facebook's LLaMA model in C/C++
Qwen2.5-VL is a multimodal large language model series from Alibaba Cloud
Get up and running with Llama 2 and other large language models
Run Local LLMs on Any Device. Open-source
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
Adding guardrails to large language models
Emscripten: An LLVM-to-WebAssembly Compiler
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Self-hosted, community-driven, local OpenAI compatible API
Swirl queries any number of data sources with APIs
C#/.NET binding of llama.cpp, including LLaMa/GPT model inference
Integrate cutting-edge LLM technology quickly and easily into your app
Ongoing research training transformer models at scale
Vector database plugin for Postgres, written in Rust
LLM training in simple, raw C/CUDA
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Open-source large language model family from Tencent Hunyuan
Open-source large language model by Alibaba
Python bindings for the Transformer models implemented in C/C++
llama.go is like llama.cpp in pure Golang
An ecosystem of Rust libraries for working with large language models
Locally run an Instruction-Tuned Chat-Style LLM