OpenOCR is an open-source, general-purpose OCR toolkit developed by the OCR team at Fudan University for research and real-world document processing. It provides a unified platform for text detection, text recognition, formula recognition, table recognition, and document parsing. Built on the SVTRv2 and UniRec-0.1B models, OpenOCR targets high accuracy with efficient inference. The toolkit supports both Chinese and English content, making it suitable for bilingual document analysis. OpenOCR includes training, evaluation, fine-tuning, and deployment tools, so users can customize models for specific OCR tasks. It serves academic research through reproducible benchmarks and industrial applications through commercial-grade OCR models.

Features

  • Supports text detection and recognition for both Chinese and English documents.
  • Provides document parsing capabilities for extracting text, formulas, tables, and layout structures.
  • Includes UniRec-0.1B for unified recognition of text, mathematical formulas, and mixed content.
  • Offers fine-tuning support for custom OCR datasets and specialized use cases.
  • Enables ONNX model export for wider deployment compatibility across platforms.
  • Integrates a unified training and evaluation benchmark with reproductions of leading OCR research models.

Categories

OCR

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python OCR Software

Registered

1 day ago