Tesseract OCR

Tesseract is an open source optical character recognition (OCR) engine and command-line program. OCR is a technology that recognizes text characters within a digital image. The latest version of Tesseract places greater focus on line recognition, but it still supports the legacy Tesseract OCR engine, which recognizes character patterns.

Tesseract can recognize more than 100 languages out of the box and can be trained to recognize others. It supports various output formats, including plain text, HTML, and PDF, and it has Unicode (UTF-8) support.
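As a rough illustration of the language and output-format options described above, the following Python sketch assembles a command line for the tesseract program. The helper names build_tesseract_command and run_ocr, and the file names in the usage line, are hypothetical. The sketch assumes a reasonably recent Tesseract release in which the -l option selects languages (joined with "+") and the trailing txt, pdf, and hocr config names select output formats; check the help output of the installed version before relying on it.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base,
                            languages=("eng",), formats=("txt",)):
    """Return the argument list for one tesseract invocation.

    languages: language codes, joined with '+' for the -l option.
    formats:   output config names such as 'txt', 'pdf' or 'hocr'
               (assumed to be available in the installed version).
    """
    command = ["tesseract", image_path, output_base, "-l", "+".join(languages)]
    command.extend(formats)  # config names go after the options
    return command


def run_ocr(image_path, output_base, **options):
    """Run tesseract if it is installed; raise a clear error otherwise."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    command = build_tesseract_command(image_path, output_base, **options)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(build_tesseract_command("page.png", "page",
                                  languages=("eng", "deu"),
                                  formats=("txt", "pdf")))
```

Building the argument list separately from running it keeps the command easy to inspect and avoids shell quoting problems, since subprocess.run receives a list rather than a single string.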

Features

OCR engine and command line program
Line recognition and character pattern recognition
Unicode (UTF-8) support
Recognizes more than 100 languages, and can be trained to recognize others
Supports various output formats
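The choice between line recognition and the legacy character-pattern engine listed above can be sketched as a small Python helper. This assumes Tesseract 4 or later, where the --oem option selects the engine mode (0 for legacy only, 1 for the LSTM line recognizer only, 2 for both, 3 for the default based on what is available). The labels "legacy", "lstm", "combined", and "default" and the function name engine_args are hypothetical names chosen for this example, and the legacy mode also requires language data that includes the legacy model.

```python
# Engine mode numbers as accepted by --oem (assumed: Tesseract 4 or later).
OEM_MODES = {
    "legacy": 0,    # character-pattern engine only
    "lstm": 1,      # line-recognition (LSTM) engine only
    "combined": 2,  # both engines
    "default": 3,   # whatever the installed data supports
}


def engine_args(engine="default"):
    """Return the command-line arguments that select an OCR engine mode."""
    if engine not in OEM_MODES:
        raise ValueError(f"unknown engine {engine!r}; "
                         f"expected one of {sorted(OEM_MODES)}")
    return ["--oem", str(OEM_MODES[engine])]


if __name__ == "__main__":
    # Arguments to place after the image and output base names.
    print(engine_args("legacy"))
```

The returned list is meant to be appended to a tesseract argument list after the image and output base names.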

Project Samples

Tesseract 3.02 running on Gnome Terminal 3.8.0 (screenshot by Naga2raja)

License

Apache License V2.0

User Ratings

5.0 out of 5 stars

ease: 4 / 5
features: 4 / 5
design: 4 / 5
support: 4 / 5

User Reviews

newlts Posted 2016-05-23

Enjoy this project for my mission

inimmo Posted 2013-09-03

Brilliant. Worked properly first time. great code.

biello Posted 2012-08-03

very good OCR project!

efaefa Posted 2010-06-07

wow, good OCR. The release files are very oldest than http://code.google.com/p/tesseract-ocr/ I packed tesseract with gImageReader http://sourceforge.net/projects/gimagereader/

babaphone Posted 2009-10-20

how to install in win Xp?

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

C++

Related Categories

C++ Image Recognition Software, C++ OCR Software

Registered

2020-05-04

Similar Business Software

FreeOCR

FreeOCR is free optical character recognition software for Windows. It supports scanning from most TWAIN scanners and can open most scanned PDFs and multi-page TIFF images, as well as popular image file formats. FreeOCR outputs plain text and can export directly to Microsoft Word format....

Tesseract

Tesseract is an OCR engine with support for Unicode and the ability to recognize more than 100 languages out of the box. It can be trained to recognize other languages. Tesseract is used for text detection on mobile devices, in video, and in Gmail image spam detection.

PrecisionOCR

PrecisionOCR is a ready-to-use, secure, HIPAA-compliant, cloud-based platform for extracting medical meaning from unstructured documents using Optical Character Recognition (OCR). PrecisionOCR uses custom Optical Character Recognition and AI algorithms to convert PDFs/JPEGs/PNGs into...

RoboOCR

Easy-to-use optical character recognition (OCR) software that can capture text from the screen, images, PDFs, videos, and other digital documents. It can quickly extract and recognize any non-selectable and non-editable text on your Windows screen.

Nutrient SDK

Nutrient is the comprehensive solution for all your PDF needs, offering tools that effortlessly integrate and operate PDF functionality across any platform. 1. SDK PRODUCTS Integrate robust PDF functionality into iOS, Android, Windows, web (JavaScript), or any cross-platform technology,...

Tabscanner

Tabscanner is an AI-powered receipt OCR (Optical Character Recognition) API that enables fast and accurate data extraction from receipt images. With over eight years of experience and more than a billion receipts processed, Tabscanner offers a simple and easy-to-use API that integrates...

