GLM-OCR

Z.ai

About

Pre-trained language models have achieved state-of-the-art results on a range of Natural Language Processing (NLP) tasks, and GPT-3 showed that scaling them up can further exploit their potential. ERNIE 3.0, a unified framework for pre-training large-scale knowledge-enhanced models, was recently used to train a 10-billion-parameter model that outperformed the state-of-the-art models on various NLP tasks. To explore how ERNIE 3.0 performs at larger scale, we train ERNIE 3.0 Titan, a hundred-billion-parameter-scale model with up to 260 billion parameters, on the PaddlePaddle platform. We also design a self-supervised adversarial loss and a controllable language modeling loss so that ERNIE 3.0 Titan generates credible and controllable text.

About

GLM-OCR is a multimodal optical character recognition model and open source repository for accurate, efficient document understanding. It combines text and visual modalities in a unified encoder–decoder architecture derived from the GLM-V family: a visual encoder pre-trained on large-scale image–text data and a lightweight cross-modal connector feed a GLM-0.5B language decoder. The model supports layout detection, parallel region recognition, and structured output for text, tables, formulas, and complex real-world document formats. Training uses a Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization, and the model reports state-of-the-art results on major document understanding benchmarks.
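
The flow described above (layout detection, then parallel region recognition, then structured output) can be pictured with a minimal Python sketch. This is not the GLM-OCR API: Region, detect_layout, recognize_region, and parse_page are hypothetical names, and both stages are stubbed with fixed data so that only the control flow is shown.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Region:
    kind: str    # "text", "table", or "formula"
    bbox: tuple  # (x0, y0, x1, y1) in page pixels


def detect_layout(page):
    """Hypothetical stand-in for a layout detector; returns regions in reading order."""
    return [
        Region("text", (0, 0, 100, 20)),
        Region("table", (0, 30, 100, 80)),
        Region("formula", (0, 90, 100, 110)),
    ]


def recognize_region(page, region):
    """Hypothetical stand-in for the per-region recognizer (a model call in practice)."""
    return f"<{region.kind} at {region.bbox}>"


def parse_page(page, max_workers=4):
    """Detect regions, recognize them concurrently, and return structured blocks."""
    regions = detect_layout(page)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so reading order is preserved.
        contents = list(pool.map(lambda r: recognize_region(page, r), regions))
    return [
        {"type": r.kind, "bbox": r.bbox, "content": c}
        for r, c in zip(regions, contents)
    ]


blocks = parse_page(page=None)  # a real caller would pass a page image here
```

Each entry in blocks carries the region type, its bounding box, and the recognized content, which is one plausible shape for the structured output the description mentions.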

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Audience

AI developers

Audience

Developers, researchers, and engineers wanting a tool to accurately parse and understand complex documents, layouts, and visual-text content at scale

Support

Phone Support
24/7 Live Support
Online

Support

Phone Support
24/7 Live Support
Online

API

Offers API

API

Offers API

Pricing

No information available.
Free Version
Free Trial

Pricing

Free
Free Version
Free Trial

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

This software hasn't been reviewed yet.

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

This software hasn't been reviewed yet.

Training

Documentation
Webinars
Live Online
In Person

Training

Documentation
Webinars
Live Online
In Person

Company Information

Baidu
Founded: 2000
China
research.baidu.com/Public/uploads/61c4362c79ee8.pdf

Company Information

Z.ai
Founded: 2019
China
github.com/zai-org/GLM-OCR

Alternatives

PanGu-α (Huawei)

Alternatives

HunyuanOCR (Tencent)
CodeT5 (Salesforce)
ERNIE 4.5 (Baidu)
Mu (Microsoft)
ERNIE X1 (Baidu)
ERNIE Bot (Baidu)
Mistral OCR 3 (Mistral AI)

Integrations

ERNIE Bot

Integrations

ERNIE Bot