Search Results for "java ocr extraction text"

Showing 27 open source projects for "java ocr extraction text"

1

Umi-OCR

OCR software, free and offline

Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines.

Downloads: 47 This Week

Last Update: 2026-01-15
See Project
2

GLM-OCR

Accurate × Fast × Comprehensive

GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...

Downloads: 1 This Week

Last Update: 2026-04-08
See Project
3

Video-subtitle-extractor

A GUI tool for extracting hard-coded subtitles (hardsubs) from videos

...Uses local OCR, so there is no need to set up or call any API or to access online OCR services such as Baidu and Ali; text recognition runs locally. Supports GPU acceleration, which gives higher accuracy and faster extraction. (CLI version) Users do not need to set the subtitle area manually; the project detects it automatically with a text detection model.

1 Review

Downloads: 39 This Week

Last Update: 2026-04-05
See Project
4

Scribe.js

JavaScript OCR and text extraction for images and PDFs

Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. ...

Downloads: 1 This Week

Last Update: 2026-05-06
See Project
5

PaddleOCR-json

Offline OCR image text recognition command-line program for Windows

PaddleOCR-json is an OCR engine based on the PaddleOCR project that provides a command-line interface and tools for extracting text from images and exporting results in structured JSON format. It wraps the PaddleOCR models, which are capable of detecting and recognizing text in a wide variety of languages and layouts, into a self-contained executable that can be run locally without needing a deep learning environment configured manually. This makes it practical for developers or system...

Downloads: 9 This Week

Last Update: 2026-01-15
See Project
6

HunyuanOCR

OCR expert VLM powered by Hunyuan's native multimodal architecture

HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a...

Downloads: 1 This Week

Last Update: 2026-05-11
See Project
7

VietOCR

Provides optical character recognition (OCR) solutions for the Vietnamese language.

24 Reviews

Downloads: 193 This Week

Last Update: 2026-01-17
See Project
8

chessPDFBrowser

Chess application which allows working with chess PDF books and PGNs.

Chess application which allows working with PDFs and PGNs. You can work with the chess games of the PDF and edit their tree of variants. Graphical environment. Standard PGN tags. PGN comments. OCR-like FEN string detection from chess board position images. Connection to UCI chess engines (like Stockfish). Position analysis, full game analysis. You can now play games against UCI engines. pdf2pgn command-line tool included. Detailed documentation. Multilanguage...

1 Review

Downloads: 32 This Week

Last Update: 2026-04-04
See Project
9

DocWire SDK

Award-winning modern data processing SDK in C++20

DocWire SDK, a standout C++20 AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...

Downloads: 1 This Week

Last Update: 2026-03-27
See Project
10

OpenKM Document Management - DMS

Document Management System and Content Management System

OpenKM Community Edition is a free Document Management System (DMS) that helps businesses control the production, storage, management and distribution of electronic documents, boosting effectiveness and productivity. It integrates document management, collaboration and advanced search into one easy-to-use solution, including administration tools for user roles, access control, security levels, activity logs and automation setup. With OpenKM Community Edition you can: Collect information...

32 Reviews

Downloads: 333 This Week

Last Update: 2026-05-14
See Project
11

MyBox

Easy Tools for PDF, Image, File, Network, Data, and Media

Tags: javafx-desktop-apps, pdf, image, ocr, icc, barcode, color-palette, text, bytes, markdown, html, archive, compress, digest, video, audio, editor, converter, media. Source: https://github.com/Mararsh/MyBox. Self-contained packages need neither a Java environment nor installation. Jar packages need Java 16 or higher.

Downloads: 0 This Week

Last Update: 2026-02-10
See Project
12

OCR Manga Reader for Android

Android Manga reader with Japanese OCR and dictionary capabilities

OCR Manga Reader is a free and open source Android app that allows you to quickly OCR and look up Japanese words in real time. It does not have ads or telemetry/spyware and does not require an Internet connection. Supports both EDICT and EPWING dictionaries. Requires Android 4.0 (Ice Cream Sandwich) or higher. See http://ocrmangareaderforandroid.sourceforge.net/ for details.

3 Reviews

Downloads: 21 This Week

Last Update: 2023-10-07
See Project
13

Dual Clip Translator

Translates selected text or clipboard contents, powered by Google. Hotkeys paste or replace text with its automatic translation. The result is shown in a balloon or window and is also sent to the clipboard. Screen captures of the desktop or a game can be run through OCR and then translated.

5 Reviews

Downloads: 17 This Week

Last Update: 2023-05-26
See Project
14

VideoSubFinder

The main purpose of this program is to provide functionality for extracting hardcoded subtitles (hardsubs) from video. It provides two main features: 1) Autodetection of frames with hardcoded text (hardsub) in a video, saving their timing positions. 2) Generation of text images cleared of background, which, together with OCR programs (like FineReader, Subtitle Edit, Google Drive), makes it possible to generate complete subtitles with original text and timing. For working of this program on...

18 Reviews

Downloads: 547 This Week

Last Update: 2023-05-01
See Project
15

Common Resource Grep - crgrep

Common Resource Grep

CRGREP searches for matching text in databases, various document formats, archives and other difficult-to-access resources. A command-line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you...

3 Reviews

Downloads: 1 This Week

Last Update: 2023-04-23
See Project
16

Manga Rikai OCR

Manga Rikai is the first consumer-ready multi-page manga OCR/translation engine. It is a spiritual successor to Capture2Text, Visual Novel Reader, and Textractor. At the moment, the engine can capture and translate a single text box, or detect all text boxes in a page or in as many pages as you want. Not only that, you can edit the text, save your progress, and even export your work as an HTML file. Got problems? Join our Discord: https://discord.com/invite/BuNuanw

1 Review

Downloads: 6 This Week

Last Update: 2021-02-23
See Project
17

cbrTekStraktor

An application to automatically extract text from comic books.

cbrTekStraktor is an application to automatically extract text from the text bubbles or speech balloons present in comic book reader files (CBR). Its prime goal is to perform analysis on the texts of comic books. cbrTekStraktor can however also be used for scanlation or similar purposes. The application also lets you manually define text areas in CBR files. It includes a simple graphical editor for further processing the extracted text. The text extraction is...

Downloads: 2 This Week

Last Update: 2017-06-14
See Project
18

OCR Web based

Web-based OCR for the Firefox browser & PC

...Finally, I wish to inform you that you can write or draw directly on the canvas to get the subsequent character recognition and text extraction

2 Reviews

Downloads: 0 This Week

Last Update: 2018-09-05
See Project
19

OCR For Visually Challenged Person

Provides a GUI for Tesseract OCR

It converts scanned images into text, braille and audio formats. The image should be scanned at 300 dpi or higher for better accuracy.

Downloads: 5 This Week

Last Update: 2015-05-24
See Project
20

DJVU++

The complete DjVu solution, with OCR technology (Arabic, English).

DjVu++ is a user-friendly program used to manipulate DjVu files such as eBooks, with plenty of editing features. The program offers a free replacement for the proprietary PDF format with similar resolution and smaller file size. DjVu++ also supports OCR to handle text in scanned books and images. The program shows good performance for English as well as Arabic, where it aims to lead free and commercial software. The main features of the DjVu++ program are: o...

4 Reviews

Downloads: 6 This Week

Last Update: 2015-08-24
See Project
21

Vision2u

Free image processing software

Vision2u offers free image processing software for personal use and research. Basic image processing tasks can be carried out through simple operation of the software. Any webcam owner can perform simple measuring, counting, or monitoring tasks without high capital outlay.

Downloads: 0 This Week

Last Update: 2015-05-01
See Project
22

Eye

Eye is an experimental OCR (image-to-text) application.

2 Reviews

Downloads: 1 This Week

Last Update: 2014-09-27
See Project
23

LynxSight Mobile

An OCR assistant for visually impaired people

LynxSight Mobile is an Android application that serves as an OCR assistant. The application scans pictures taken by the camera for text and reads it to the user. LynxSight Mobile is designed for use by visually impaired people. It includes a voice assistant, voice commands and a simple UI to make it easier to use.

Downloads: 0 This Week

Last Update: 2013-09-26
See Project
24

3Aafreet

Java; Linux, Windows. Sometimes I need to copy text from a web page and it turns out to be an image, which can be copied as an image. The idea is to feed the copied image to an OCR engine and then paste the resulting text. We would probably rely on jocr to do th

Downloads: 0 This Week

Last Update: 2013-04-17
See Project
25

TCR Neuroph - Text Character Recognition

TCR Neuroph - Text Character Recognition is a Java tool developed to recognize scanned text using the Java neural network framework Neuroph.

Downloads: 0 This Week

Last Update: 2015-09-01
See Project

© 2026 Slashdot Media. All Rights Reserved.
