LLM101n is an educational repository that walks you through building and understanding large language models from first principles. It emphasizes intuition and hands-on implementation, guiding you from tokenization and embeddings to attention, transformer blocks, and sampling. The materials favor compact, readable code and incremental steps, so learners can verify each concept before moving on. You’ll see how data pipelines, batching, masking, and positional encodings fit together to train a small GPT-style model end to end. The repo often complements explanations with runnable notebooks or scripts, encouraging experimentation and modification. By the end, the focus is less on polishing a production system and more on internalizing how LLM components interact to produce coherent text.
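To make the "attention plus masking" idea concrete, here is a minimal NumPy sketch of single-head causal self-attention, in the spirit of the compact code the repo favors. This is not the repository's actual implementation; the function and weight names (`causal_attention`, `W_q`, `W_k`, `W_v`) are illustrative assumptions.

```python
import numpy as np

def causal_attention(x, W_q, W_k, W_v):
    """Single-head self-attention with a causal mask: position t may
    only attend to positions <= t (illustrative sketch, not the repo's code)."""
    T, d = x.shape
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(d)                    # (T, T) attention logits
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[future] = -np.inf                         # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # weighted sum of values
```

Because of the mask, the first position can attend only to itself, so its output is exactly its own value vector; verifying small invariants like this is the kind of step-by-step checking the materials encourage.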

Features

  • Step-by-step build of a GPT-style transformer from scratch
  • Clear coverage of tokenization, embeddings, attention, and MLP blocks
  • Runnable code and exercises for experiential learning
  • Demonstrations of batching, masking, and positional encodings
  • Training and sampling loops you can inspect and modify
  • Emphasis on readability and conceptual understanding over framework magic
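As a flavor of the inspectable sampling loops mentioned above, the core step is drawing the next token from the model's output logits. The sketch below is a generic temperature-sampling helper, not code taken from the repo; the name `sample_next_token` and its signature are assumptions for illustration.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample a token id from a vector of logits.
    Lower temperature sharpens the distribution toward greedy decoding."""
    if rng is None:
        rng = np.random.default_rng()
    scaled = logits / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())   # subtract max for numerical stability
    probs /= probs.sum()                    # softmax over the vocabulary
    return int(rng.choice(len(logits), p=probs))
```

In a full generation loop, the sampled token is appended to the context and fed back into the model to produce the next set of logits, repeating until a length limit or end-of-sequence token is reached.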

Categories

Education

Additional Project Details

Registered

2025-10-15