Run local LLMs on any device; open source
AirLLM: 70B inference with a single 4GB GPU
A 950-line, minimal, extensible LLM inference engine built from scratch
Clippy, now with some AI
A high-performance inference engine for AI models
Bringing large language models and chat to web browsers
LLM training in simple, raw C/CUDA
Implement a CPU from scratch and experiment with large-model deployments
A course to get into Large Language Models (LLMs)
A simple, performant, and scalable JAX LLM
Explore large language models in 512MB of RAM