Gemma-7B is a lightweight, open-weights, decoder-only language model from Google, built from the same research and technology behind the Gemini family. With 8.5 billion parameters and an 8192-token context window, it is optimized for English-language text generation tasks such as question answering, summarization, reasoning, and creative writing. It was trained on 6 trillion tokens of web documents, code, and mathematical text, and delivers competitive performance across a wide range of NLP benchmarks.

Training used JAX and Google's ML Pathways on TPUv5e hardware. The model runs on CPUs and GPUs and supports int8 and 4-bit quantization for efficient inference. Benchmark evaluations show it outperforming comparably sized open models on tasks measuring factuality, common-sense reasoning, and code generation, while safety evaluations show low levels of toxicity and bias. Google also publishes responsible AI guidelines for safe usage.
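A minimal inference sketch using the Hugging Face `transformers` library with 4-bit quantization via `BitsAndBytesConfig`. This assumes the `google/gemma-7b` checkpoint id, an accepted model license on the Hub, and an installed `bitsandbytes` backend; check the official model card for the exact setup.

```python
# Sketch: load Gemma-7B with 4-bit quantization and generate text.
# Assumes: transformers, torch, accelerate, and bitsandbytes are installed,
# and the "google/gemma-7b" checkpoint is accessible (license accepted).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                        # 4-bit weight quantization
    bnb_4bit_compute_dtype=torch.bfloat16,    # compute in bf16 for quality
)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b",
    quantization_config=quant_config,
    device_map="auto",                        # place layers on available devices
)

inputs = tokenizer("Why is the sky blue?", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

For CPU-only deployment, drop the quantization config and `device_map` and load in a smaller dtype if memory is tight; generation will be slower but functionally identical.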
Features
- 8.5B parameter decoder-only model trained on 6T high-quality tokens
- 8192-token context length for handling long inputs
- Excels at reasoning, summarization, and question answering
- Trained using JAX and ML Pathways on TPUv5e infrastructure
- Supports CPU/GPU inference and 4-bit/8-bit quantization
- Outperforms comparably sized open models across major NLP benchmarks
- Training data filtered to remove CSAM, personal information (PII), and other sensitive content
- Released with responsible AI toolkit and usage guidelines by Google
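To make the int8 quantization bullet concrete, here is a toy, dependency-free sketch of the core idea: float weights are mapped to 8-bit integers with a per-tensor scale, then dequantized at inference time with a small, bounded round-trip error. The function names and example weights are illustrative, not part of any Gemma API.

```python
# Toy illustration of symmetric per-tensor int8 weight quantization:
# encode floats as integers in [-127, 127], then decode with a scale factor.

def quantize_int8(weights):
    """Map floats to int8 codes using a single scale (illustrative only)."""
    scale = max(abs(w) for w in weights) / 127.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]

weights = [0.42, -1.27, 0.003, 0.88, -0.56]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(codes)     # integer codes, each in [-127, 127]
print(max_err)   # round-trip error, bounded by roughly half a scale step
```

Real int8/4-bit inference stacks quantize per-channel or per-block and fuse dequantization into the matmul kernels, but the memory saving follows from exactly this encoding: one byte (or half a byte) per weight instead of two or four.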