Inference of Llama 2 in one file of pure C
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
Port of Facebook's LLaMA model in C/C++
The media player for language learning, with dual subtitles
Get up and running with Llama 2 and other large language models
Integrate cutting-edge LLM technology quickly and easily into your app
AI-powered bridge connecting LLMs and advanced AI agents
Run Local LLMs on Any Device. Open-source
Next-gen AI+IoT framework for T2, T3, T5AI, ESP32, and more
Emscripten: An LLVM-to-WebAssembly Compiler
The easiest way to use Ollama in .NET
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
TT-NN operator library, and TT-Metalium low level kernel programming
Distribute and run LLMs with a single file
Research project. A Memory solution for users, teams, and applications
Mooncake is the serving platform for Kimi
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Alibaba's high-performance LLM inference engine for diverse apps
An Easy-to-Use and High-Performance AI Deployment Framework
Run PyTorch LLMs locally on servers, desktop and mobile
Leveraging BERT and c-TF-IDF to create easily interpretable topics
Vector database plugin for Postgres, written in Rust
High-speed Large Language Model Serving for Local Deployment
Run a 1-billion parameter LLM on a $10 board with 256MB RAM
Fast Multimodal LLM on Mobile Devices