Search Results for "java ocr extraction text" - Page 4

Showing 101 open source projects for "java ocr extraction text"

1

AADRTE

Automatic Arabic Domain-Relevant Term Extraction

In this research we propose a model for automatic domain-relevant term extraction from an Arabic text corpus. The proposed model uses a hybrid approach, combining linguistic and statistical methods, to extract terms relevant to specific domains based on a prevalence and tendency term-ranking mechanism. This increases precision and recall as measures of the relevance of extracted terms to a specific domain.

Downloads: 0 This Week

Last Update: 2013-05-30
See Project
2

DBpedia Spotlight

DBpedia Spotlight is a tool for annotating mentions of DBpedia resources in natural language text. The source code is now hosted on GitHub: https://github.com/dbpedia-spotlight

1 Review

Downloads: 0 This Week

Last Update: 2013-06-04
See Project
3

Ticket Cluster

Java application that groups related text documents (text mining)

Ticket Cluster is a Java application that groups related text documents (text extracted from a helpdesk) into clusters, providing an overview of the document set. This is done without preconceptions about keywords: the software analyzes the text and identifies the structure that arises naturally. The extraction phase depends on the helpdesk's data; in the current implementation, a PHP script extracts all text from numbered tickets (Facil HelpDesk) to a folder. ...

Downloads: 0 This Week

Last Update: 2012-07-13
See Project
4

BioEvent

This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read BioNLP shared task format (http://2011.bionlp-st.org/) and i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters for an event in a text.

Downloads: 0 This Week

Last Update: 2013-04-25
See Project
5

text-analysis

This project aims to implement the following text mining techniques in Java: text language detection, keyword and keyphrase extraction, text classification, text clustering, single- or multiple-document summarization, and plagiarism detection.

Downloads: 0 This Week

Last Update: 2014-05-20
See Project
6

TextMarker

TextMarker is now developed and hosted at Apache UIMA (http://uima.apache.org/textmarker.html). TextMarker is a UIMA-based tool for information extraction and more. The full featured editor of the rule language and the build process of UIMA descriptors are complemented with components for visualization, explanation, testing and rule learning.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-29
See Project
7

Lioness (Languages Interop Framework)

Framework for making Windows applications that are one .exe file in AutoHotKey_L, C++, C#, VB.NET, Java, Groovy, Common Lisp, Nemerle, Ruby, Python, PHP, Lua, Tcl, Perl, Jint, S#, WSH VBScript, HTML/JavaScript/CSS, COM, and PowerShell without compiling. For .NET 4.

Downloads: 0 This Week

Last Update: 2014-03-23
See Project
8

SEMANTIXS

SEMANTIXS is a semantic information extraction system that can extract, represent and visualize domain-specific information from free text in the form of complex (and simple) relationships. See http://www.cs.iastate.edu/~semantix/ for more information.

Downloads: 0 This Week

Last Update: 2013-05-02
See Project
9

iracema

An information extraction library implementing modern algorithms for the extraction of named entities from text.

Downloads: 0 This Week

Last Update: 2013-04-19
See Project
10

FX Player : yet another streaming server

FX Player is a Web-based streaming server with a Flash iTunes-like interface. It shares your MP3 library and allows access to your tracks through the Internet. Coded in Java, FX Player runs on most platforms, including Mac OS X, Windows, Linux and Unix.

Downloads: 1 This Week

Last Update: 2013-04-26
See Project
11

textkit4j

Provides a set of tools for processing text, such as text extraction and classification. Planned classification implementations include Bayesian and statistical (N-gram).

Downloads: 0 This Week

Last Update: 2013-04-24
See Project
12

moara

Moara is a biological text mining tool. It consists of a Java library and some auxiliary MySQL databases for gene/protein training, extraction of mentions, and their further normalization and disambiguation.

Downloads: 0 This Week

Last Update: 2013-04-10
See Project
13

TCR Neuroph - Text Character Recognition

TCR Neuroph - Text Character Recognition is a Java tool developed to recognize scanned text, using the Java neural network framework Neuroph.

Downloads: 0 This Week

Last Update: 2015-09-01
See Project
14

OpenDMAP

OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.

Downloads: 0 This Week

Last Update: 2013-04-30
See Project
15

JOcrad

JOcrad is a graphical frontend for GNU Ocrad written in Java. GNU Ocrad is an OCR (Optical Character Recognition) program based on a feature extraction method. JOcrad supports the Italian and English languages and JPG, PNG and GIF images.

Downloads: 0 This Week

Last Update: 2014-05-10
See Project
16

Trainable Relation Extraction framework

T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.

Downloads: 0 This Week

Last Update: 2013-05-02
See Project
17

MutationFinder

MutationFinder is a biomedical natural language processing (NLP) system for extracting mentions of point mutations from free text. MutationFinder achieves high performance (99% precision, 81% recall on blind test data) as an information extraction system.

Downloads: 0 This Week

Last Update: 2013-03-22
See Project
18

FacialDAS

This project aims to distribute a facial animation system with speech, developed for Brazilian Portuguese. The system is composed of several modules: movement extraction, facial animation, and speech through a text-to-speech system.

Downloads: 0 This Week

Last Update: 2015-09-22
See Project
19

sinon

Sinon is a Java tool that extracts textual information from Web sites. In other words, it is a tool that can be used to scrape any kind of text (HTML included) available on the Internet or in a filesystem. The extraction is driven by an XML file.

Downloads: 0 This Week

Last Update: 2013-03-08
See Project
20

OCR Reader

OCR Reader is a lightweight Windows utility designed to extract text from PDF files and images using OCR (Tesseract engine). The tool supports template-based parsing, allowing structured output into CSV or TXT without manual coding. Core components: Tesseract OCR engine, Poppler (PDF rendering), and a template-based extraction system. Homepage: https://martan1484.github.io/OCR_Reader

Downloads: 0 This Week

Last Update: 2026-04-17
See Project
21

Calendar Extraction Tool

Extracts ICS (iCalendar) data from free-form text.

Downloads: 0 This Week

Last Update: 2013-04-12
See Project
22

translategemma-4b-it

Lightweight multimodal translation model for 55 languages

translategemma-4b-it is a lightweight, state-of-the-art open translation model from Google, built on the Gemma 3 family and optimized for high-quality multilingual translation across 55 languages. It supports both text-to-text translation and image-to-text extraction with translation, enabling workflows such as OCR-style translation of signs, documents, and screenshots. With a compact ~5B parameter footprint and BF16 support, the model is designed to run efficiently on laptops, desktops, and private cloud infrastructure, making advanced translation accessible without heavy hardware requirements. ...

Downloads: 0 This Week

Last Update: 2026-01-16
See Project
23

Logfile Viewer

This project provides a tool for analysing log files. You can specify rules for extracting information from any text file, then display and browse the results in a Swing-based GUI.

Downloads: 0 This Week

Last Update: 2014-07-03
See Project
24

GraphSpider/MPL

GraphSpider is a pattern matcher which searches parsed text in phrase-structure tree or dependency graph format for syntactic structures matching a set of patterns in MPL, a regexp-like pattern language. Applications: information extraction, text mining.

Downloads: 0 This Week

Last Update: 2013-04-19
See Project
25

reputron

reputron is a knowledge extraction engine platform that covers all aspects of text mining, relevance, indexing and querying on a corpus of text documents.

Downloads: 0 This Week

Last Update: 2015-04-08
See Project

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise