Curated list of datasets and tools for post-training
Strong, Economical, and Efficient Mixture-of-Experts Language Model
LLM training in simple, raw C/CUDA
Open-source large language model family from Tencent Hunyuan
Deep learning concepts in an approachable style
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Kimi K2 is the large language model series developed by Moonshot AI
Interview guide for machine learning, mathematics, and deep learning
Scaling Reinforcement Learning with LLMs
Open-weight, large-scale hybrid-attention reasoning model
Super Comprehensive Deep Learning Notes
A general-purpose probabilistic programming system
Skywork-R1V is an advanced multimodal AI model series
GLM-4 series: Open Multilingual Multimodal Chat LMs
Advancing Formal Mathematical Reasoning via Reinforcement Learning
Tool-integrated Reasoning LLM Agents
Implementation / replication of DALL-E, OpenAI's Text to Image
8.5K high-quality grade-school math problems
A tiny scalar-valued autograd engine and a neural net library
Hermes 4 FP8: hybrid reasoning Llama-3.1-405B model by Nous Research
Efficient 8B multimodal model tuned for advanced reasoning tasks
VaultGemma: 1B DP-trained Gemma variant for private NLP tasks