LLM101n is an educational repository that walks you through building and understanding large language models from first principles. It emphasizes intuition and hands-on implementation, guiding you from tokenization and embeddings to attention, transformer blocks, and sampling. The materials favor compact, readable code and incremental steps, so learners can verify each concept before moving on. You’ll see how data pipelines, batching, masking, and positional encodings fit together to train a small GPT-style model end to end. The repo often complements explanations with runnable notebooks or scripts, encouraging experimentation and modification. By the end, the focus is less on polishing a production system and more on internalizing how LLM components interact to produce coherent text.
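As a rough illustration of the data-pipeline idea described above (not the repo's actual code), the sketch below tokenizes a string at the character level and slices it into shifted input/target batches for next-token prediction; the corpus, names, and hyperparameters are placeholders.

```python
import torch

# Toy corpus and character-level "tokenizer" (illustrative, not the repo's tokenizer).
text = "hello world, hello language models"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> token id
itos = {i: ch for ch, i in stoi.items()}       # token id -> char

def encode(s):
    return torch.tensor([stoi[c] for c in s], dtype=torch.long)

data = encode(text)

def get_batch(data, block_size=8, batch_size=4):
    # Sample random windows; targets are the inputs shifted by one position,
    # which is the next-token prediction setup used to train GPT-style models.
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    return x, y

xb, yb = get_batch(data)
print(xb.shape, yb.shape)  # torch.Size([4, 8]) torch.Size([4, 8])
```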
Features
- Step-by-step build of a GPT-style transformer from scratch
- Clear coverage of tokenization, embeddings, attention, and MLP blocks (a minimal transformer block is sketched after this list)
- Runnable code and exercises for experiential learning
- Demonstrations of batching, masking, and positional encodings
- Training and sampling loops you can inspect and modify (see the loop sketch below)
- Emphasis on readability and conceptual understanding over framework magic
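
To make the embedding, attention, masking, positional-encoding, and MLP pieces concrete, here is a minimal single-head transformer block in PyTorch. It is a hedged sketch under common conventions (pre-norm residuals, learned positional embeddings, GELU MLP), not the repo's exact code; all class names and sizes are placeholders.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head self-attention with a causal (lower-triangular) mask."""

    def __init__(self, n_embd, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, n_embd, bias=False)
        self.query = nn.Linear(n_embd, n_embd, bias=False)
        self.value = nn.Linear(n_embd, n_embd, bias=False)
        # Position t may only attend to positions <= t (the causal mask).
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k, q, v = self.key(x), self.query(x), self.value(x)
        att = q @ k.transpose(-2, -1) / math.sqrt(C)           # (B, T, T) scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)                           # rows sum to 1
        return att @ v                                         # (B, T, C)

class Block(nn.Module):
    """Transformer block: attention then a position-wise MLP, each with a residual."""

    def __init__(self, n_embd, block_size):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, block_size)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))      # pre-norm residual attention
        x = x + self.mlp(self.ln2(x))       # pre-norm residual MLP
        return x

class TinyGPT(nn.Module):
    """Token + positional embeddings, one block, and a language-model head."""

    def __init__(self, vocab_size, block_size=8, n_embd=32):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)
        self.block = Block(n_embd, block_size)
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx):
        B, T = idx.shape
        x = self.tok_emb(idx) + self.pos_emb(torch.arange(T, device=idx.device))
        return self.lm_head(self.block(x))  # (B, T, vocab_size) logits

model = TinyGPT(vocab_size=65)
logits = model(torch.randint(65, (4, 8)))
print(logits.shape)  # torch.Size([4, 8, 65])
```

Stacking several such blocks and adding more heads per layer is what turns this toy into a small GPT; the single-block version keeps the shapes easy to trace.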
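The repo's own training and sampling loops are the ones to inspect; as a hedged illustration of their general shape, the sketch below runs a few optimizer steps on a stand-in bigram model (an embedding table acting as the logits) and then samples tokens autoregressively. The model, data, and hyperparameters here are placeholder assumptions, not the repo's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder model standing in for the transformer above: a bigram language model
# whose logits come straight from an embedding table.
vocab_size = 65
model = nn.Embedding(vocab_size, vocab_size)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

def get_batch(batch_size=16, block_size=8):
    # Stand-in for a real data loader: random token ids with shifted targets.
    data = torch.randint(vocab_size, (1000,))
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    return x, y

# Training loop: cross-entropy on next-token prediction.
for step in range(200):
    xb, yb = get_batch()
    logits = model(xb)                                  # (B, T, vocab_size)
    loss = F.cross_entropy(logits.view(-1, vocab_size), yb.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Sampling loop: feed the last token back in and sample from the softmax.
idx = torch.zeros((1, 1), dtype=torch.long)             # start token
for _ in range(20):
    logits = model(idx[:, -1:])                         # predict from the last token
    probs = F.softmax(logits[:, -1, :], dim=-1)
    idx = torch.cat([idx, torch.multinomial(probs, 1)], dim=1)
print(idx.tolist())
```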