DiffusionGemma
Google
|
Gemma
Google
|
|||||
About
DiffusionGemma is an experimental open model that explores text diffusion, an exceptionally fast approach to text generation. Released under an Apache 2.0 license, this 26B Mixture of Experts (MoE) model moves beyond the sequential token-by-token processing of typical autoregressive Large Language Models (LLMs). Instead, it generates entire blocks of text simultaneously, delivering up to 4x faster text generation on GPUs. Built on the intelligence-per-parameter of the Gemma 4 family and Gemini Diffusion research, DiffusionGemma integrates a novel diffusion head designed to maximize generation speed. It is designed for researchers and developers exploring speed-critical, interactive local workflows such as in-line editing, rapid iteration, and non-linear text structures. By shifting the decode bottleneck from memory bandwidth to compute, it can generate more than 1,000 tokens per second on a single NVIDIA H100 and more than 700 tokens per second on an NVIDIA GeForce RTX 5090.
|
About
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and the name reflects the Latin gemma, meaning “precious stone.” Accompanying our model weights, we’re also releasing tools to support developer innovation, foster collaboration, and guide the responsible use of Gemma models. Gemma models share technical and infrastructure components with Gemini, our largest and most capable AI model widely available today. This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models. Gemma models can also run directly on a developer laptop or desktop computer. Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs.
|
|||||
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
|||||
Audience
AI researchers building low-latency local applications who need faster experimental text generation for interactive workflows
|
Audience
Developers wanting a powerful suite of open AI models
|
|||||
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
|||||
API
Offers API
|
API
Offers API
|
|||||
Pricing
Free
Free Version
Free Trial
|
Pricing
No information available.
Free Version
Free Trial
|
|||||
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
|||||
Company Information
Google
Founded: 1998
United States
blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/
|
Company Information
Google
Founded: 1998
United States
ai.google.dev/gemma
|
|||||
Integrations
Gemini Enterprise Agent Platform
AiAssistWorks
Atomic Chat
Axolotl
C#
C++
CSS
Cake AI
CodeMender
Elixir
|
Integrations
Gemini Enterprise Agent Platform
AiAssistWorks
Atomic Chat
Axolotl
C#
C++
CSS
Cake AI
CodeMender
Elixir
|
|||||