GLM-4.6VZhipu AI
|
MedGemmaGoogle DeepMind
|
|||||
Related Products
|
||||||
About
GLM-4.6V is a state-of-the-art open source multimodal vision-language model from the Z.ai (GLM-V) family designed for reasoning, perception, and action. It ships in two variants: a full-scale version (106B parameters) for cloud or high-performance clusters, and a lightweight “Flash” variant (9B) optimized for local deployment or low-latency use. GLM-4.6V supports a native context window of up to 128K tokens during training, enabling it to process very long documents or multimodal inputs. Crucially, it integrates native Function Calling, meaning the model can take images, screenshots, documents, or other visual media as input directly (without manual text conversion), reason about them, and trigger tool calls, bridging “visual perception” with “executable action.” This enables a wide spectrum of capabilities; interleaved image-and-text content generation (for example, combining document understanding with text summarization or generation of image-annotated responses).
|
About
MedGemma is a collection of Gemma 3 variants that are trained for performance on medical text and image comprehension. Developers can use MedGemma to accelerate building healthcare-based AI applications. MedGemma currently comes in two variants: a 4B multimodal version and a 27B text-only version. MedGemma 4B utilizes a SigLIP image encoder that has been specifically pre-trained on a variety of de-identified medical data, including chest X-rays, dermatology images, ophthalmology images, and histopathology slides. Its LLM component is trained on a diverse set of medical data, including radiology images, histopathology patches, ophthalmology images, and dermatology images. MedGemma 4B is available in both pre-trained (suffix: -pt) and instruction-tuned (suffix -it) versions. The instruction-tuned version is a better starting point for most applications.
|
|||||
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
|||||
Audience
Developers, researchers, and AI engineers wanting a solution to build agents that understand images and text, manipulate documents or UIs, and generate complex image-text outputs
|
Audience
Healthcare AI developers wanting a solution offering fine-tunable models for medical text and image comprehension tasks
|
|||||
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
|||||
API
Offers API
|
API
Offers API
|
|||||
Screenshots and Videos |
Screenshots and Videos |
|||||
Pricing
Free
Free Version
Free Trial
|
Pricing
No information available.
Free Version
Free Trial
|
|||||
Reviews/
|
Reviews/
|
|||||
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
|||||
Company InformationZhipu AI
Founded: 2023
China
chat.z.ai/
|
Company InformationGoogle DeepMind
Founded: 2010
United Kingdom
deepmind.google/models/gemma/medgemma/
|
|||||
Alternatives |
Alternatives |
|||||
|
|
|
|||||
|
|
|
|||||
|
|
|
|||||
|
|
|
|||||
Categories |
Categories |
|||||
Integrations
Claude Code
Cline
Dr7.ai
Gemma 2
Gemma 3
Gemma 4
Hugging Face
Kilo Code
OpenRouter
Roo Code
|
Integrations
Claude Code
Cline
Dr7.ai
Gemma 2
Gemma 3
Gemma 4
Hugging Face
Kilo Code
OpenRouter
Roo Code
|
|||||
|
|
|