GLM-4.5V

Zhipu AI
GLM-OCR

Z.ai

About GLM-4.5V

GLM-4.5V builds on the GLM-4.5-Air foundation and uses a Mixture-of-Experts (MoE) architecture with 106 billion total parameters and 12 billion active parameters. It achieves state-of-the-art performance among open-source VLMs of similar scale across 42 public benchmarks, covering image, video, document, and GUI-based tasks. Its multimodal capabilities include image reasoning (scene understanding, spatial recognition, multi-image analysis), video understanding (segmentation, event recognition), complex chart and long-document parsing, GUI-agent workflows (screen reading, icon recognition, desktop automation), and precise visual grounding (e.g., locating objects and returning bounding boxes). GLM-4.5V also introduces a “Thinking Mode” switch that lets users choose between fast responses and deeper reasoning.
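
To put the two parameter counts in perspective, the short Python sketch below computes what share of the model's parameters is active. The constants come from the description above; the function name `active_fraction` is only an illustrative helper, not part of any GLM tooling.

```python
# Figures quoted in the description above (in billions of parameters).
TOTAL_PARAMS_B = 106
ACTIVE_PARAMS_B = 12


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that are active, as a value in [0, 1]."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected total_b > 0 and 0 <= active_b <= total_b")
    return active_b / total_b


fraction = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
print(f"Active share: {fraction:.1%}")  # prints "Active share: 11.3%"
```

Roughly one ninth of the weights take part in a given computation, which is the usual reason MoE models are cheaper to run than dense models of the same total size.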

About GLM-OCR

GLM-OCR is a multimodal optical character recognition model and open-source repository for accurate, efficient document understanding. It combines text and visual modalities in a unified encoder–decoder architecture derived from the GLM-V family: a visual encoder pre-trained on large-scale image–text data and a lightweight cross-modal connector feed into a GLM-0.5B language decoder. The model supports layout detection, parallel region recognition, and structured output for text, tables, formulas, and complex real-world document formats. It introduces a Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization, and it achieves state-of-the-art results on major document understanding benchmarks.
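
The flow of layout detection, parallel region recognition, and structured output can be pictured with a small Python sketch. Everything here is a stand-in: `Region`, `detect_layout`, `recognize_region`, and `parse_page` are hypothetical names invented for illustration, the "recognizer" only strips whitespace, and none of it reflects the actual GLM-OCR code or API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Region:
    kind: str      # e.g. "text", "table", or "formula"
    bbox: tuple    # (x1, y1, x2, y2) in pixels
    content: str   # stand-in for the cropped image data


def detect_layout(page):
    """Hypothetical layout detector: the toy 'page' is already a list of regions."""
    return list(page)


def recognize_region(region):
    """Hypothetical recognizer: a real system would run the decoder on the crop."""
    return {"kind": region.kind, "bbox": region.bbox, "text": region.content.strip()}


def parse_page(page, max_workers=4):
    """Detect regions, recognize them concurrently, and return structured records."""
    regions = detect_layout(page)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so reading order is kept.
        return list(pool.map(recognize_region, regions))


page = [
    Region("text", (0, 0, 100, 20), " Invoice 42 "),
    Region("table", (0, 30, 100, 80), "a|b"),
]
result = parse_page(page)
```

Because each region is recognized independently, the work parallelizes naturally, and the ordered list of records is one plausible shape for the structured output.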

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Audience

GLM-4.5V: Developers and AI researchers who need to build applications that interpret and reason about images, video, documents, or GUIs

GLM-OCR: Developers, researchers, and engineers who need to accurately parse and understand complex documents, layouts, and visual-text content at scale

Support

Phone Support
24/7 Live Support
Online

API

Offers API

Pricing

Free
Free Version
Free Trial

Training

Documentation
Webinars
Live Online
In Person

Company Information

Zhipu AI
Founded: 2019
China
chat.z.ai/

Company Information

Z.ai
Founded: 2019
China
github.com/zai-org/GLM-OCR

Alternatives

GPT-5.2 (OpenAI)

Alternatives

CodeT5 (Salesforce)
GLM-4.1V (Zhipu AI)
HunyuanOCR (Tencent)
GLM-4.5V-Flash (Zhipu AI)
Mu (Microsoft)
Qwen2 (Alibaba)
Ministral 3 (Mistral AI)
Mistral OCR 3 (Mistral AI)

Integrations

Claude Code
Cline
Kilo Code
OpenRouter
Roo Code
Sup AI
