Search Results for "java ocr extraction text" - Page 3

Showing 83 open source projects for "java ocr extraction text"

1

Eye

Eye is an experimental OCR (image-to-text) application.

2 Reviews

Downloads: 0 This Week

Last Update: 2014-09-27
See Project
2

webtextanalysis

Mining knowledge from text data

This project aims to implement the following text mining techniques in Java: Text Language Detection, Keyword and Keyphrase Extraction, Text Classification, Text Clustering, Single- or Multi-Document Summarization, and Plagiarism Detection.

Downloads: 0 This Week

Last Update: 2016-03-07
See Project
3

FALCON - Text Search Java Project

JSON based text search Java Project

What is it? "Falcon Search" is a Java API and tool for searching inside documents. It was originally started to search the content of PDF files under the project "HAWK Search". Searching with this tool is query-based, not word-based as in most document search tools or document readers. It also handles jumbled word order within a query and spelling mistakes. Commonly used techniques in this project are Natural Language...

Downloads: 0 This Week

Last Update: 2014-04-18
See Project
4

Detexter

Detexter is an app designed to extract text from PDF files.

Detexter lets you extract text from multiple PDF files. Detexter uses the PDFBox library for its text extraction.

Downloads: 0 This Week

Last Update: 2015-09-01
See Project
5

TML - Text Mining Library for LSA & CMM

TML is a Java Library for LSA and extracting Concept Maps from text

TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml

3 Reviews

Downloads: 0 This Week

Last Update: 2013-08-05
See Project
6

TextProcessor

A Java package to preprocess text datasets for posterior text analysis

The TextProcessor Java package is a text processing toolkit, which provides some frequently used text processing functions such as stemming, removing stop-words, generating a term vocabulary, and calculating the term-doc frequency matrix. Basic topic mining models such as LDA and sparse NMF are also supported. The package can also generate feature files from a given text dataset with LDA and LIBSVM format for posterior procedures such as classification or clustering. The toolkit is also...

Downloads: 0 This Week

Last Update: 2015-11-23
See Project
7

Anteater

Annotation Tool to Extract Endangered Animals from Text Resources

The goal of this project is the extraction of the information listed below from texts downloaded from the Federal Register (https://www.federalregister.gov). The texts are mainly applications for permits, notices about given permits, etc. This software tool is developed by the Max Planck Institute for the History of Science (http://www.mpiwg-berlin.mpg.de) in collaboration with Dirk Wintergrün and Etienne Benson.

Downloads: 0 This Week

Last Update: 2013-07-18
See Project
8

RapidMiner Information Extraction Plugin

The Information Extraction Plugin allows the use of information extraction techniques within RapidMiner. It can be seen as an interface between natural language and IE- or datamining-methods, by extracting interesting information out of documents.

Downloads: 0 This Week

Last Update: 2015-08-07
See Project
9

G-Asks

G-Asks is a question generation system developed by the LATTE (Learning and Affect Technologies Engineering) research group at The University of Sydney. It uses Natural Language Processing techniques and Machine Learning algorithms to generate specific trigger questions. If you use this software in a publication, please cite the paper 2. 1. Ming Liu and Rafael A. Calvo (2012) “Using Information Extraction to Generate Trigger Question for Academic Writing Support”, 11th International Conference...

Downloads: 0 This Week

Last Update: 2013-04-29
See Project
10

Large Text File converter

Java-based heavy-duty utility to process large delimited text files

TextZilla is a multithreaded Java utility that can process huge delimited text files to extract, convert, encode, decode, and encrypt/decrypt text data from a source and write it to the desired output file or files. It provides a fully extensible framework on which Java classes can be created; for example, it currently has MD5 conversion capability, and classes for 3DES, AES, or any other algorithm can be created on the same design.

Downloads: 0 This Week

Last Update: 2015-05-31
See Project
11

SeerSuite

SeerSuite is an application toolkit for digital libraries and search engines; i.e., CiteSeerX. CiteSeerX has moved to GitHub; please get the latest code from: https://github.com/SeerLabs/CiteSeerX

2 Reviews

Downloads: 0 This Week

Last Update: 2014-01-24
See Project
12

AADRTE

Automatic Arabic Domain-Relevant Term Extraction

In this research we propose a model for automatic domain-relevant term extraction from an Arabic text corpus. The proposed model uses a hybrid approach composed of linguistic and statistical methods to extract terms relevant to specific domains, depending on a prevalence and tendency term ranking mechanism. This increases precision and recall as measures of the relevance of extracted terms to a specific domain.

Downloads: 0 This Week

Last Update: 2013-05-30
See Project
13

DBpedia Spotlight

DBpedia Spotlight is a tool for annotating mentions of DBpedia resources in natural language text. The source code is now hosted on GitHub: https://github.com/dbpedia-spotlight

1 Review

Downloads: 0 This Week

Last Update: 2013-06-04
See Project
14

BioEvent

This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read BioNLP shared task format (http://2011.bionlp-st.org/) and i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters for an event in a text.

Downloads: 0 This Week

Last Update: 2013-04-25
See Project
15

text-analysis

This project aims to implement the following text mining techniques in Java: Text Language Detection, Keyword and Keyphrase Extraction, Text Classification, Text Clustering, Single- or Multi-Document Summarization, and Plagiarism Detection.

Downloads: 0 This Week

Last Update: 2014-05-20
See Project
16

TextMarker

TextMarker is now developed and hosted at Apache UIMA (http://uima.apache.org/textmarker.html). TextMarker is a UIMA-based tool for information extraction and more. The full-featured editor of the rule language and the build process of UIMA descriptors are complemented with components for visualization, explanation, testing, and rule learning.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-29
See Project
17

SEMANTIXS

SEMANTIXS is a semantic information extraction system that can extract, represent, and visualize domain-specific information from free text in the form of complex (and simple) relationships. See http://www.cs.iastate.edu/~semantix/ for more info.

Downloads: 0 This Week

Last Update: 2013-05-02
See Project
18

iracema

An information extraction library implementing modern algorithms for the extraction of named entities from text.

Downloads: 0 This Week

Last Update: 2013-04-19
See Project
19

FX Player : yet another streaming server

FX Player is a Web-based streaming server with a Flash iTunes-like interface. It shares your MP3 library and allows access to your tracks through the Internet. Coded in Java, FX Player runs on most platforms, including Mac OS X, Windows, Linux, and Unix.

Downloads: 1 This Week

Last Update: 2013-04-26
See Project
20

textkit4j

Provides a set of tools for processing text, such as text extraction and classification. Classification implementations to be implemented include: Bayesian and Statistical (N-gram).

Downloads: 0 This Week

Last Update: 2013-04-24
See Project
21

moara

Moara is a biological text mining tool consisting of a Java library and some auxiliary MySQL databases for gene/protein training, extraction of mentions, and their further normalization and disambiguation.

Downloads: 0 This Week

Last Update: 2013-04-10
See Project
22

TCR Neuroph -Text Character Recognition

TCR Neuroph - Text Character Recognition is a Java tool developed to recognize scanned text, using the Java Neural Network Framework Neuroph.

Downloads: 0 This Week

Last Update: 2015-09-01
See Project
23

OpenDMAP

OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.

Downloads: 0 This Week

Last Update: 2013-04-30
See Project
24

JOcrad

JOcrad is a graphical frontend for GNU Ocrad written in Java. GNU Ocrad is an OCR (Optical Character Recognition) program based on a feature extraction method. JOcrad supports the Italian and English languages, and JPG, PNG, and GIF images.

Downloads: 0 This Week

Last Update: 2014-05-10
See Project
25

Trainable Relation Extraction framework

T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.

Downloads: 0 This Week

Last Update: 2013-05-02
See Project
