tiny-llm is an educational open-source project that teaches systems engineers how large language model (LLM) inference and serving systems work by building one from scratch. It is structured as a guided course that walks developers through implementing the core components needed to run a modern language model, including attention mechanisms, token generation, and optimization techniques. Rather than relying on high-level machine learning frameworks, the codebase uses mostly low-level array and matrix manipulation APIs, so developers can see exactly how inference works internally. The project shows how to load and run Qwen-style models while progressively adding performance improvements such as KV caching, request batching, and optimized attention. It also introduces the concepts behind modern LLM serving systems, arriving at something that resembles a simplified version of a production inference engine such as vLLM.
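
To make the idea of KV caching concrete, here is a minimal sketch of single-head causal attention and an incremental decode step that reuses cached keys and values. It is written with NumPy as a stand-in for a low-level array API and is not taken from the tiny-llm codebase; the names `causal_attention`, `KVCache`, and `decode_step` are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the maximum along the axis for numerical stability.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def causal_attention(q, k, v):
    """Full-sequence attention. q, k, v: (seq_len, head_dim)."""
    seq_len, head_dim = q.shape
    scores = q @ k.T / np.sqrt(head_dim)
    # Mask out future positions (strictly above the diagonal).
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v

class KVCache:
    """Hypothetical helper: keeps keys and values of all tokens seen so far."""
    def __init__(self, head_dim):
        self.keys = np.empty((0, head_dim))
        self.values = np.empty((0, head_dim))

    def append(self, k, v):
        self.keys = np.vstack([self.keys, k[None, :]])
        self.values = np.vstack([self.values, v[None, :]])

def decode_step(cache, q, k, v):
    """Attend one new token over the cache. q, k, v: (head_dim,)."""
    cache.append(k, v)
    scores = cache.keys @ q / np.sqrt(q.shape[-1])
    return softmax(scores) @ cache.values
```

Feeding tokens one at a time through `decode_step` should give the same result as the corresponding row of `causal_attention` over the whole sequence, while each step computes scores for only the newest query. This sketch grows the cache with `np.vstack` for brevity; a real serving system would more likely preallocate or page the cache memory.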

Features

  • Step-by-step implementation of LLM inference infrastructure
  • Low-level matrix and tensor operations instead of high-level frameworks
  • Hands-on implementation of transformer attention and RoPE mechanisms
  • Support for serving Qwen-style language models
  • Demonstrations of optimization techniques such as KV cache and batching
  • Educational workflow explaining how modern LLM serving systems operate
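
As an illustration of the RoPE item above, the following is a minimal NumPy sketch of rotary position embeddings. It is an assumption-laden example rather than the project's own code: the function name `rope`, the default `base` of 10000, and the half-split pairing of dimensions (element `i` paired with element `i + head_dim // 2`) are choices made here for illustration. Some implementations pair adjacent elements instead, so the convention should be checked against the model being loaded.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply rotary position embeddings.

    x: (seq_len, head_dim) with an even head_dim.
    positions: (seq_len,) integer token positions.
    """
    seq_len, head_dim = x.shape
    half = head_dim // 2
    # One rotation frequency per 2D pair: base^(-2i / head_dim).
    inv_freq = base ** (-np.arange(half) / half)
    angles = positions[:, None] * inv_freq[None, :]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    # Half-split pairing: (x1[i], x2[i]) forms one rotated 2D pair.
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```

Because each pair is rotated by an angle proportional to its position, the dot product between a rotated query and a rotated key depends only on the distance between their positions, which is the property that makes RoPE useful inside attention.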

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Large Language Models (LLM)

Registered

2026-03-05