SmolLM3 is a 3.08B-parameter decoder-only language model from Hugging Face built to deliver strong reasoning, math, and multilingual performance. Trained on 11.2T tokens with a staged curriculum of web, code, and mathematical data, it uses Grouped Query Attention (GQA) and NoPE (rotary position embeddings removed in a subset of layers). The model supports context lengths of up to 128k tokens via YaRN extrapolation, making it well suited for long-context applications, and it matches or outperforms larger models such as Qwen3 and Llama 3 on several reasoning, commonsense, and multilingual benchmarks. SmolLM3 natively supports six languages (English, French, Spanish, German, Italian, and Portuguese) and also has exposure to Arabic, Chinese, and Russian. It is open source under Apache 2.0, with transparent training data, configs, and quantized versions available. Alignment uses Anchored Preference Optimization (APO), yielding strong instruction-following behavior across a broad range of tasks.
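For orientation, here is a minimal sketch of loading the model with the Transformers library and generating a reply; the checkpoint id HuggingFaceTB/SmolLM3-3B and the generation settings are assumptions for illustration, not prescribed defaults.

```python
# Minimal sketch: load SmolLM3 with Transformers and generate a reply.
# The checkpoint id below is an assumption; adjust it to the variant you use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # mirrors the bfloat16 training precision
    device_map="auto",
)

# Build a chat-formatted prompt via the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain YaRN context extrapolation in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```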
Features
- 3B parameter decoder-only architecture with bfloat16 precision
- Trained on 11.2T tokens, including code, math, and web data
- Handles 64k context natively, extendable to 128k via YaRN extrapolation (see the configuration sketch after this list)
- Optimized using Anchored Preference Optimization (APO)
- Strong multilingual performance in six core languages
- Competitive results on reasoning, math, and code tasks (e.g., GSM8K, HumanEval+)
- Fully open: weights, training code, and data mixes are available
- Deployable with ONNX, vLLM, Transformers.js, and llama.cpp
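The context figures in the list above come from applying YaRN rope scaling at load time. The sketch below shows one way to configure it with Transformers; the scaling factor of 2.0 and the 65,536-token base length are illustrative assumptions rather than values copied from the released config.

```python
# Minimal sketch: enable YaRN extrapolation to roughly double the usable context.
# The rope_scaling values are illustrative assumptions; check the model card for
# the recommended settings before relying on them.
import torch
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed checkpoint id
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 2.0,                              # extrapolate ~2x beyond the trained context
    "original_max_position_embeddings": 65536,  # assumed 64k trained context length
}
config.max_position_embeddings = 131072          # target 128k window

model = AutoModelForCausalLM.from_pretrained(
    model_id, config=config, torch_dtype=torch.bfloat16, device_map="auto"
)
```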
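For serving, any of the listed runtimes works; as one example, the snippet below runs the model through vLLM's offline API (the checkpoint id and sampling values are again assumptions for illustration).

```python
# Minimal sketch: run SmolLM3 with vLLM's offline generation API.
from vllm import LLM, SamplingParams

llm = LLM(model="HuggingFaceTB/SmolLM3-3B", dtype="bfloat16")  # assumed checkpoint id
params = SamplingParams(temperature=0.6, max_tokens=128)        # illustrative settings

outputs = llm.generate(["Summarize SmolLM3 in one sentence."], params)
print(outputs[0].outputs[0].text)
```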