GLM-OCR (Z.ai)
|
Yandex Vision (Yandex)
|
|||||
About
GLM-OCR is an open source multimodal optical character recognition (OCR) model for document understanding. It combines text and visual modalities in a unified encoder–decoder architecture derived from the GLM-V family: a visual encoder pre-trained on large-scale image–text data, a lightweight cross-modal connector, and a GLM-0.5B language decoder. The model supports layout detection, parallel region recognition, and structured output for text, tables, formulas, and complex real-world document formats. Training uses a Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization, and the project reports state-of-the-art results on major document understanding benchmarks.
|
About
Yandex Vision OCR recognizes text in images and returns it with automatic punctuation. The service supports and automatically identifies more than 50 languages, and it handles combinations of handwritten and printed text in Russian and English. It extracts standard fields from templates and documents such as passports (from 20 countries), driver’s licenses, vehicle registration certificates, and license plates. For tables, it detects the structure and returns text with row and column coordinates. Yandex Vision OCR accepts JPEG, PNG, and PDF files of up to 20 MB and no more than 300 pages per file.
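The file limits stated above (JPEG, PNG, or PDF; at most 20 MB; at most 300 pages) can be checked locally before a file is sent for recognition. The following is a minimal sketch, not part of any Yandex SDK: the function name `check_upload` and its parameters are hypothetical, the caller is assumed to supply the page count, and "20 MB" is assumed to mean 20 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Limits taken from the product description above.
# Assumption: "20 MB" is interpreted as 20 * 1024 * 1024 bytes.
MAX_BYTES = 20 * 1024 * 1024
MAX_PAGES = 300
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png", ".pdf"}


def check_upload(path, size_bytes, page_count=1):
    """Hypothetical helper: return a list of problems found.

    An empty list means the file fits the stated limits.
    The caller supplies size_bytes and page_count, so this
    function does not read the file itself.
    """
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    if page_count > MAX_PAGES:
        problems.append(f"too many pages: {page_count}")
    return problems


if __name__ == "__main__":
    # A 21 MiB, 301-page PDF breaks both the size and page limits.
    for problem in check_upload("report.pdf", 21 * 1024 * 1024, 301):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every violated limit at once.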
|
|||||
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
|||||
Audience
Developers, researchers, and engineers wanting a tool to accurately parse and understand complex documents, layouts, and visual-text content at scale
|
Audience
Professionals looking for a solution providing computer vision services for text recognition and extraction
|
|||||
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
|||||
API
Offers API
|
API
Offers API
|
|||||
Pricing
Free
Free Version
Free Trial
|
Pricing
No information available.
Free Version
Free Trial
|
|||||
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
|||||
Company Information
Z.ai
Founded: 2019
China
github.com/zai-org/GLM-OCR
|
Company Information
Yandex
Founded: 1997
Russia
cloud.yandex.com/en/services/vision
|
|||||
Integrations
Yandex Cloud
|