SmolLM3 is a 3.08B-parameter decoder-only language model from Hugging Face built to deliver strong reasoning, math, and multilingual performance. Trained on 11.2T tokens with a staged curriculum of web, code, and mathematical data, it uses Grouped Query Attention (GQA) and NoPE (rotary position embeddings removed in a subset of layers). The model supports context lengths of up to 128k tokens via YaRN extrapolation, making it well suited for long-context applications, and it matches or outperforms larger models such as Qwen3 and Llama 3 on several reasoning, commonsense, and multilingual benchmarks. SmolLM3 natively supports six languages (English, French, Spanish, German, Italian, and Portuguese) and also has exposure to Arabic, Chinese, and Russian. It is open source under Apache 2.0, with transparent training data, configs, and quantized versions available. Alignment uses Anchored Preference Optimization (APO), yielding strong instruction-following behavior across a broad range of tasks.
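For orientation, here is a minimal sketch of loading the model with the Transformers library and generating a reply; the checkpoint id HuggingFaceTB/SmolLM3-3B and the generation settings are assumptions for illustration, not prescribed defaults.

```python
# Minimal sketch: load SmolLM3 with Transformers and generate a reply.
# The checkpoint id below is an assumption; adjust it to the variant you use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # mirrors the bfloat16 training precision
    device_map="auto",
)

# Build a chat-formatted prompt via the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain YaRN context extrapolation in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```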
Features
- 3B parameter decoder-only architecture with bfloat16 precision
- Trained on 11.2T tokens, including code, math, and web data
- Handles 64k context natively, extendable to 128k via YaRN extrapolation (see the configuration sketch after this list)
- Optimized using Anchored Preference Optimization (APO)
- Strong multilingual performance in six core languages
- Competitive results on reasoning, math, and code tasks (e.g., GSM8K, HumanEval+)
- Fully open: weights, training code, and data mixes are available
- Deployable with ONNX, vLLM, Transformers.js, and llama.cpp
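The context figures in the list above come from applying YaRN rope scaling at load time. The sketch below shows one way to configure it with Transformers; the scaling factor of 2.0 and the 65,536-token base length are illustrative assumptions rather than values copied from the released config.

```python
# Minimal sketch: enable YaRN extrapolation to roughly double the usable context.
# The rope_scaling values are illustrative assumptions; check the model card for
# the recommended settings before relying on them.
import torch
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed checkpoint id
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 2.0,                              # extrapolate ~2x beyond the trained context
    "original_max_position_embeddings": 65536,  # assumed 64k trained context length
}
config.max_position_embeddings = 131072          # target 128k window

model = AutoModelForCausalLM.from_pretrained(
    model_id, config=config, torch_dtype=torch.bfloat16, device_map="auto"
)
```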
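For serving, any of the listed runtimes works; as one example, the snippet below runs the model through vLLM's offline API (the checkpoint id and sampling values are again assumptions for illustration).

```python
# Minimal sketch: run SmolLM3 with vLLM's offline generation API.
from vllm import LLM, SamplingParams

llm = LLM(model="HuggingFaceTB/SmolLM3-3B", dtype="bfloat16")  # assumed checkpoint id
params = SamplingParams(temperature=0.6, max_tokens=128)        # illustrative settings

outputs = llm.generate(["Summarize SmolLM3 in one sentence."], params)
print(outputs[0].outputs[0].text)
```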