MedGemma is a collection of open medical AI models released by Google as part of its Health AI Developer Foundations initiative. Built on the Gemma 3 family of transformer models and trained for medical text and image comprehension, the collection is intended to accelerate the development of healthcare-focused AI applications. It includes several variants: a 4 billion-parameter multimodal model that processes both medical images and text, and 27 billion-parameter models (text-only and multimodal) that offer deeper clinical reasoning at higher capacity, suiting complex tasks such as medical question answering, summarization of clinical notes, and report generation from radiology images. The multimodal versions pair the Gemma language backbone with a SigLIP-based image encoder pre-trained on diverse de-identified medical imaging data.
Features
- Multimodal medical AI with text and image comprehension
- Variants at different scales (4B multimodal, 27B text-only and multimodal)
- Pre-trained SigLIP image encoder for medical imagery
- Baseline support for clinical question answering and report generation
- Fine-tuning workflows for domain-specific adaptation
- Open-source with code and notebooks for experimentation
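As a sketch of how the multimodal variants might be invoked, the snippet below assembles a chat-style image-plus-text prompt and runs it through Hugging Face `transformers`. The checkpoint id `google/medgemma-4b-it`, the `image-text-to-text` pipeline task, and the placeholder image URL are assumptions about the published release, not details stated above; check the model card for the exact identifiers.

```python
def build_messages(image_url: str, question: str) -> list:
    """Assemble a single user turn containing an image reference and a text question,
    in the chat format commonly used by multimodal instruction-tuned models."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


if __name__ == "__main__":
    # transformers is imported lazily so the prompt helper above can be used
    # without the heavy dependency; downloading the weights may also require
    # accepting the model's terms of use on Hugging Face (an assumption here).
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model="google/medgemma-4b-it")
    messages = build_messages(
        "https://example.com/chest_xray.png",  # hypothetical placeholder image
        "Describe any abnormal findings on this chest X-ray.",
    )
    out = pipe(text=messages, max_new_tokens=128)
    print(out[0]["generated_text"])
```

For production use, a fine-tuned checkpoint adapted to the target imaging domain (see the fine-tuning workflows above) would replace the base instruction-tuned model id.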