DiffusionGemma 26B A4B IT NVFP4 is a quantized release of Google DeepMind’s DiffusionGemma 26B A4B IT model, produced with NVIDIA’s Model Optimizer. It is an open-weights multimodal model that accepts text, image, and video input and generates text through discrete diffusion. Built on the Gemma 4 26B A4B Mixture-of-Experts architecture, it has 25.2B total parameters, of which 3.8B are active, which keeps per-token compute well below that of a dense model of the same size. Instead of emitting one token at a time, it generates tokens in parallel 256-token blocks, with reported throughput above 1,100 tokens per second on an NVIDIA H100 (Hopper) GPU in FP8. The model supports a 256K-token context window, a configurable thinking mode, native function calling, structured JSON output, and multilingual inference across more than 35 languages. NVFP4 quantization reduces weights and activations from 16-bit to 4-bit precision, lowering disk size and GPU memory requirements for deployment with vLLM.
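
As a rough illustration of the memory saving, the sketch below estimates weight storage at 16-bit and at NVFP4 precision. It assumes the NVFP4 layout of 4-bit values with one 8-bit scale shared by each block of 16 values (about 4.5 bits per weight), and it assumes all 25.2B parameters are quantized, which real checkpoints generally do not do. The function and constant names are illustrative, and KV cache and activation memory are ignored.

```python
# Back-of-the-envelope weight storage estimate (illustrative only).
# Assumption: every parameter is stored at the given precision.

TOTAL_PARAMS = 25.2e9  # total parameter count quoted in the description


def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# 16-bit baseline (for example BF16): 2 bytes per weight.
bf16_gb = weight_footprint_gb(TOTAL_PARAMS, 16)

# NVFP4 assumption: 4-bit value plus an 8-bit scale per 16-value block.
NVFP4_BITS = 4 + 8 / 16  # 4.5 effective bits per weight
nvfp4_gb = weight_footprint_gb(TOTAL_PARAMS, NVFP4_BITS)

print(f"16-bit weights: {bf16_gb:.1f} GB")
print(f"NVFP4 weights:  {nvfp4_gb:.1f} GB")
```

Under these assumptions the weights shrink from roughly 50 GB to roughly 14 GB. Because this is a Mixture-of-Experts model, all experts typically need to be resident in memory even though only 3.8B parameters are active, so the estimate uses the total parameter count.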

Features

  • NVFP4 4-bit quantization for lower memory usage
  • 25.2B total parameters with 3.8B active parameters
  • Multimodal input support for text, images, and video
  • Discrete diffusion generation with parallel token blocks
  • 256K-token context window for long-context workflows
  • Native function calling and structured JSON output
  • Multilingual inference across more than 35 languages
  • Optimized for vLLM on NVIDIA Hopper and Blackwell GPUs
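
To make the block-based generation figures concrete, the following sketch counts how many 256-token blocks a response of a given length would span and how long it would take at the reported rate. It assumes a response is filled block by block and uses the 1,100 tokens per second figure, which was reported for FP8 on an H100 rather than for this NVFP4 build. The helper names are hypothetical, and real throughput depends on hardware, batch size, and prompt length.

```python
import math

BLOCK_SIZE = 256                   # tokens generated in parallel per block
REPORTED_TOKENS_PER_SECOND = 1100  # reported FP8 figure on H100 (assumed here)


def blocks_needed(num_tokens: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of fixed-size blocks required to cover num_tokens."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return math.ceil(num_tokens / block_size)


def estimated_seconds(num_tokens: int,
                      tokens_per_second: float = REPORTED_TOKENS_PER_SECOND) -> float:
    """Generation time implied by a constant throughput."""
    return num_tokens / tokens_per_second


response_length = 1000
print(blocks_needed(response_length))                # blocks spanned by the response
print(round(estimated_seconds(response_length), 2))  # seconds at the assumed rate
```

A 1,000-token response would span four blocks and take a little under one second at the assumed rate.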

Categories

AI Models

License

Apache License V2.0

Additional Project Details

Registered

1 day ago