OpenOCR

OpenOCR is an open-source, general-purpose OCR toolkit developed by the OCR team at Fudan University for research and real-world document processing. It provides a single platform for text detection, text recognition, formula recognition, table recognition, and document parsing. Its models include SVTRv2 and UniRec-0.1B, and the project aims to combine high accuracy with efficient inference. The toolkit handles both Chinese and English content, so it suits bilingual document analysis. It also ships training, evaluation, fine-tuning, and deployment tools, so users can adapt models to specific OCR tasks. Reproducible benchmarks serve the academic side, and commercial-grade OCR solutions serve industrial use.

Features

Supports text detection and recognition for both Chinese and English documents.
Provides document parsing capabilities for extracting text, formulas, tables, and layout structures.
Includes UniRec-0.1B for unified recognition of text, mathematical formulas, and mixed content.
Offers fine-tuning support for custom OCR datasets and specialized use cases.
Enables ONNX model export for wider deployment compatibility across platforms.
Integrates a unified training and evaluation benchmark with reproductions of leading OCR research models.
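
The evaluation benchmark mentioned above is not described in detail here. As a rough illustration only, the sketch below computes two metrics commonly reported for OCR output: exact-match accuracy and mean normalized edit distance. It is not OpenOCR's own code, and the function names `edit_distance` and `score` are hypothetical, chosen for this example.

```python
from typing import Dict, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def score(predictions: Sequence[str], references: Sequence[str]) -> Dict[str, float]:
    """Hypothetical helper: compare predicted strings against ground truth."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one sample is required")
    n = len(references)
    exact = sum(p == r for p, r in zip(predictions, references))
    # Normalize each distance by the longer string so values fall in [0, 1].
    ned = sum(edit_distance(p, r) / max(len(p), len(r), 1)
              for p, r in zip(predictions, references))
    return {"accuracy": exact / n, "mean_ned": ned / n}


if __name__ == "__main__":
    # Made-up sample strings, not real model output.
    print(score(["OpenOCR", "f(x)=x^2"], ["OpenOCR", "f(x)=x^3"]))
```

Exact-match accuracy is strict and suits short text lines. Normalized edit distance gives partial credit, which is more informative for long outputs such as formulas or parsed pages.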

License

Apache License 2.0

Additional Project Details

Programming Language

Python

Related Categories

Python OCR Software

Registered

2026-06-20
