Search Results for "character recognition source code"

Showing 31 open source projects for "character recognition source code"

1

Tesseract OCR

Open Source OCR Engine

Tesseract is an open source OCR (optical character recognition) engine and command line program. OCR is a technology that recognizes text characters within a digital image. The latest version of Tesseract focuses more on line recognition, but it still supports the legacy Tesseract OCR engine, which recognizes character patterns.

5 Reviews

Downloads: 3,399 This Week

Last Update: 2025-12-26
2

Umi-OCR

OCR software, free and offline

Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines.

Downloads: 35 This Week

Last Update: 2026-01-15
3

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle

PaddleOCR offers exceptional, multilingual, and practical Optical Character Recognition (OCR) tools that help users train better models and put them into practice. Inspired by PaddlePaddle, PaddleOCR is an ultra lightweight OCR system with multilingual recognition, digit recognition, vertical text recognition, and long text recognition. It features a PPOCR series of high-quality pre-trained models, which includes: ultra lightweight ppocr_mobile series models, general...

Downloads: 81 This Week

Last Update: 2 days ago
4

DeepSeek-OCR

Contexts Optical Compression

DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. ...

Downloads: 2 This Week

Last Update: 2026-01-27
5

Tesseract.js

A pure JavaScript multilingual OCR

Tesseract.js is a pure JavaScript port of the popular Tesseract OCR engine. The Tesseract.js library supports more than 100 languages, automatic text orientation and script detection, and a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js can run either in a browser or on a server with Node.js. Tesseract.js is a JavaScript library that gets words in almost any spoken language out of images. The main Tesseract.js functions (e.g. recognize, detect) take an image...

Downloads: 25 This Week

Last Update: 2025-12-15
6

GLM-OCR

Accurate × Fast × Comprehensive

GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...

Downloads: 23 This Week

Last Update: 2026-04-08
7

HunyuanOCR

OCR expert VLM powered by Hunyuan's native multimodal architecture

HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent-Hunyuan. It is designed to unify the entire OCR pipeline (detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation) into a single model inference instead of a cascade of separate tools.

Downloads: 0 This Week

Last Update: 2026-04-08
8

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...

Downloads: 7 This Week

Last Update: 2026-02-03
9

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files

OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.

Downloads: 108 This Week

Last Update: 2026-04-06
10

Scribe.js

JavaScript OCR and text extraction for images and PDFs

Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. In...

Downloads: 9 This Week

Last Update: 2026-03-14
11

dots.ocr

Multilingual Document Layout Parsing in a Single Vision-Language Model

dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks. The model is designed to recognize virtually any human script, making it highly effective...

Downloads: 0 This Week

Last Update: 2026-03-24
12

VietOCR

Provides optical character recognition (OCR) solutions for the Vietnamese language.

24 Reviews

Downloads: 183 This Week

Last Update: 2026-01-17
13

DocWire SDK

Award-winning modern data processing SDK in C++20

DocWire SDK, a standout C++20 AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...

Downloads: 7 This Week

Last Update: 2026-03-27
14

Dual Clip Translator

Translates selected text or clipboard contents, powered by Google. Hotkeys paste or replace text with its automatic translation. The translation result is shown in a balloon or window and is also sent to the clipboard. Screen captures of the desktop or a game can be run through OCR and then translated.

5 Reviews

Downloads: 29 This Week

Last Update: 2023-05-26
15

pdfsandwich

pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (and no editable text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly "behind" the images. pdfsandwich is a command line tool intended for OCR of scanned books or journals. It can recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script which calls the following binaries:...

8 Reviews

Downloads: 348 This Week

Last Update: 2018-08-12
16

Devanagari OCR

Devanagari Optical Character Recognition, Annotation tool

The project has source code and data related to the following tools: 1. Optical Character Recognition. Recognize machine printed Devanagari with or without a dictionary. 2. Document Image Analysis. Automatic page segmentation of document images in multiple Indian languages. Identifies pictures, lines, and words in a document scanned at 300 dpi. 3.

1 Review

Downloads: 8 This Week

Last Update: 2019-07-25
17

OCR Web based

OCR web based for Browser Firefox & PC

Optical Character Recognition in JS for Browser is based on ocrad.js. OCR for Browser is a free extension that you can use to extract text from any image you supply. Just upload your image files. OCR for Browser accepts JPG, GIF, TIFF, BMP, and PNG files.

Get OCR for Android (beta release): https://play.google.com/store/apps/details?id=com.ulm.ocr
Add-on for Opera: http://bit.ly/1F0E0wP

Release 1.0.1: For safety reasons, I disabled...

2 Reviews

Downloads: 0 This Week

Last Update: 2018-09-05
18

FormRead

Free OMR - OCR web software based on JavaScript and PHP

FormRead (https://formread.org) is a completely free OMR (optical mark recognition) web application for scanning and grading user-filled, multiple choice forms. Create your forms with any office or drawing tool, scan them, and parameterize their coordinates in an easy way. Once you have parameterized your form, you can print many copies, give them to your students/respondents, scan and recognize them with FormRead, and finally export the data in your preferred formats...

Downloads: 10 This Week

Last Update: 2022-03-04
19

Java OCR

Java OCR is a suite of pure Java libraries for image processing and character recognition. Its small memory footprint and lack of external dependencies make it suitable for Android development. It provides a modular structure for easier deployment.

21 Reviews

Downloads: 1 This Week

Last Update: 2016-11-29
20

VedVarsha - Rain Of Knowledge

Vedvarsha is an application with two purposes: 1. Handwriting script recognition that extracts recognized letters into documents. 2. OCR (Optical Character Recognition) that works only for non-cursive and isolated characters. It depends upon libsyntactic,

Downloads: 0 This Week

Last Update: 2014-06-09
21

ANPR for National Borders

ANPR for National Borders Systems

The idea is to enhance and develop the national border crossing process by integrating automated vehicle recognition at country borders. I'm going to use automatic number plate recognition (ANPR), a system that recognizes the numbers on vehicle plates using OCR (optical character recognition) technology and infrared cameras. This will be achieved by taking the license plate image from the camera and processing it using the software I’m going to develop together with an open source OCR system. ...

1 Review

Downloads: 0 This Week

Last Update: 2019-09-20
22

Unified Character Recognition

UCR is the name of a project for developing recognition of handwritten characters in the Korean language. The goal is to create a UCR library for handwriting recognition as well as OCR from off-line and on-line data. We also plan to build a UCR library for mobile.

Downloads: 0 This Week

Last Update: 2014-07-02
23

COSI

The Common OCR Service Interface. COSI is an API that allows developers to easily bring OCR (Optical Character Recognition) capabilities to image processing applications. COSI supports existing OCR tools such as Tesseract, GOCR, and GNU Ocrad.

Downloads: 3 This Week

Last Update: 2014-06-14
24

Socr3

Socr3 is a plugin-oriented, open source platform upon which I'm building an OCR suite. The name Socr3 stands for "Open Source Optical Character Recognition, Reading, Rendering, and Exporting", and is subject to change in the future.

Downloads: 0 This Week

Last Update: 2016-11-29
25

JOcrad

JOcrad is a graphical frontend for GNU Ocrad written in Java. GNU Ocrad is an OCR (Optical Character Recognition) program based on a feature extraction method. JOcrad supports the Italian and English languages and JPG, PNG, and GIF images.

Downloads: 0 This Week

Last Update: 2014-05-10
© 2026 Slashdot Media. All Rights Reserved.
