Open Source Java Natural Language Processing (NLP) Tools

Browse free open source Java Natural Language Processing (NLP) tools and projects below. Each entry lists the project's description, review count, weekly downloads, and last update date.

1

Modular Audio Recognition Framework

MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.

3 Reviews

Downloads: 24 This Week

Last Update: 2015-10-06
2

Stanford CoreNLP

Stanford CoreNLP, a Java suite of core NLP tools

CoreNLP is a one-stop shop for natural language processing in Java. It lets users derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports six languages: Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text, run a series of NLP annotators on the text, and produce a final set of annotations. Pipelines produce CoreDocuments, data objects that contain all of the annotation information, accessible with a simple API, and serializable to a Google Protocol Buffer.

Downloads: 2 This Week

Last Update: 2025-06-07
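
The pipeline idea in the CoreNLP description (raw text in, a series of annotators run in order, one annotated document out) can be illustrated with a small plain-Java sketch. Every name here (`PipelineSketch`, `Doc`, `splitSentences`, `tokenize`, `DEFAULT_ANNOTATORS`) is hypothetical and chosen for illustration; this is not the CoreNLP API, and the two toy annotators are far cruder than real ones.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of an annotator pipeline; NOT the CoreNLP API.
final class PipelineSketch {

    // Holds the raw text plus whatever each annotator adds to it.
    static final class Doc {
        final String text;
        final Map<String, List<String>> annotations = new LinkedHashMap<>();

        Doc(String text) {
            this.text = text;
        }
    }

    // Toy sentence splitter: breaks after '.', '!' or '?' followed by whitespace.
    static void splitSentences(Doc doc) {
        List<String> sentences = new ArrayList<>();
        for (String s : doc.text.split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        doc.annotations.put("sentences", sentences);
    }

    // Toy tokenizer: runs of word characters, or single punctuation marks.
    static void tokenize(Doc doc) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("\\w+|[^\\w\\s]").matcher(doc.text);
        while (m.find()) {
            tokens.add(m.group());
        }
        doc.annotations.put("tokens", tokens);
    }

    // Assumed default order for this sketch: sentences first, then tokens.
    static final List<Consumer<Doc>> DEFAULT_ANNOTATORS =
            List.of(PipelineSketch::splitSentences, PipelineSketch::tokenize);

    // Runs each annotator in order over one shared document object.
    static Doc annotate(String text, List<Consumer<Doc>> annotators) {
        Doc doc = new Doc(text);
        for (Consumer<Doc> annotator : annotators) {
            annotator.accept(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        Doc doc = annotate("Hello world. How are you?", DEFAULT_ANNOTATORS);
        System.out.println(doc.annotations);
    }
}
```

Keeping all annotations on one document object is what lets later annotators build on earlier ones; a real toolkit would attach typed, offset-aware annotations rather than plain string lists.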
3

OpenNLP

OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP compon

2 Reviews

Downloads: 8 This Week

Last Update: 2014-10-28
4

JWNL (Java WordNet Library)

JWNL is a Java API for accessing the WordNet relational dictionary. WordNet is widely used for developing NLP applications, and a Java API such as JWNL will allow developers to more easily use Java for building NLP applications.

4 Reviews

Downloads: 2 This Week

Last Update: 2013-04-29
5

AminePlatform

Amine is a Multi-Layer Platform for the development of Intelligent Systems

Amine is an open source, multi-layer Java platform for artificial intelligence, dedicated to the development of various kinds of intelligent systems and agents (knowledge-based, ontology-based, Conceptual Graph (CG) based, NLP, reasoning and learning, etc.). Ontologies and knowledge bases (KBs) can be created and manipulated with various processes. CG theory is used as the main knowledge representation language. Amine provides two languages: PROLOG+CG, which extends PROLOG with CG and Amine modules, and SYNERGY, a visual activation/propagation-based language. SYNERGY treats CGs as activable/executable graphs. For more detail, see //amine-platform.sourceforge.net/

3 Reviews

Downloads: 2 This Week

Last Update: 2023-10-12
6

BioC

We describe a simple XML format to share text documents and annotations

A minimalist approach to sharing text documents and data annotations. It allows a large number of different annotations to be represented.

Project files contain:
- simple code to hold/read/write data and perform sample processing
- BioC-formatted corpora
- BioC tools that work with BioC corpora

BioC goals:
- simplicity
- interoperability
- broad use
- reuse

There should be little investment required to learn to use a format or a software module to process that format. We are interested in reuse, and we focus on common NLP tasks that are broadly useful for text mining.

Downloads: 7 This Week

Last Update: 2016-08-08
7

TXM

Unicode XML TEI text analysis platform

TXM is a free and open-source cross-platform Unicode & XML based text analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. Download the latest version of TXM: http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP full-text search engine (http://cwb.sourceforge.net) and a range of statistical functions (factorial analysis, classification, co-occurrence analysis, etc.) based on R packages (http://www.r-project.org). Read the scientific background at the Textométrie project web site http://textometrie.ens-lyon.fr/?lang=en. Read a full description at the TEI Tools wiki http://wiki.tei-c.org/index.php/TXM.

Downloads: 6 This Week

Last Update: 2024-12-09
8

IceNLP

IceNLP is an open source Natural Language Processing (NLP) toolkit for analyzing and processing Icelandic text. The toolkit is implemented in Java.

1 Review

Downloads: 2 This Week

Last Update: 2018-04-13
9

Open Health Natural Language Processing

This ohnlp project has released "pipelines" that were contributed by members of the OHNLP Consortium. The pipelines are based on the Apache UIMA framework. medKAT/P, MedCoref, MedTagger, MedXN, and cTAKES are licensed under Apache License V2.0. MedTime is licensed under GNU General Public License version 3.0 (GPLv3). cTAKES development has moved to apache.org. See http://ctakes.apache.org/

2 Reviews

Downloads: 1 This Week

Last Update: 2016-05-20
10

masmt

A framework for multi-agent system development

MaSMT is a Java-based multi-agent system development framework, designed especially for developing an English to Sinhala machine translation system. Its architecture can also be used to develop any multi-agent based system. References: B. Hettige, A. S. Karunananda, G. Rzevski, "Multi-agent solution for managing complexity in English to Sinhala Machine Translation", International Journal of Design & Nature and Ecodynamics, Volume 11, Issue 2, 2016, 88-96. B. Hettige, A. S. Karunananda, G. Rzevski, "MaSMT: A Multi-agent System Development Framework for English-Sinhala Machine Translation", International Journal of Computational Linguistics and Natural Language Processing (IJCLNLP), Volume 2, Issue 7, July 2013.

1 Review

Downloads: 1 This Week

Last Update: 2021-09-23
11

Bermuda Text-to-Speech

This project includes basic NLP and DSP techniques for Text-to-Speech

See the TTS demo at: http://rslp.racai.ro/index.php?page=tts This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. To read more about our other NLP and TTS tools, see http://nlptools.racai.ro.

Downloads: 1 This Week

Last Update: 2014-03-24
12

MutationFinder

MutationFinder is a biomedical natural language processing (NLP) system for extracting mentions of point mutations from free text. MutationFinder achieves high performance (99% precision, 81% recall on blind test data) as an information extraction system

Downloads: 1 This Week

Last Update: 2013-03-22
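
To make "extracting mentions of point mutations from free text" concrete, here is a toy Java sketch. It assumes mentions are written in the common one-letter shorthand of wild-type residue, position, mutant residue (for example `A42G`). The class name, method name, and pattern are hypothetical and are not MutationFinder's actual rules; a real extractor would also have to handle three-letter and spelled-out forms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy sketch only; NOT MutationFinder's real pattern set.
final class PointMutationSketch {

    // The 20 standard amino acids in one-letter code.
    private static final String RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]";

    // Assumed shorthand: wild-type residue, position, mutant residue (e.g. A42G).
    private static final Pattern MENTION =
            Pattern.compile("\\b(" + RESIDUE + ")(\\d+)(" + RESIDUE + ")\\b");

    // Returns every shorthand mention found, in order of appearance.
    static List<String> findMentions(String text) {
        List<String> mentions = new ArrayList<>();
        Matcher m = MENTION.matcher(text);
        while (m.find()) {
            // A substitution must change the residue, so skip forms like A42A.
            if (!m.group(1).equals(m.group(3))) {
                mentions.add(m.group());
            }
        }
        return mentions;
    }

    public static void main(String[] args) {
        System.out.println(findMentions(
                "The A42G and W128R substitutions reduced activity, unlike H2O or T7."));
    }
}
```

Restricting the letters to valid residue codes is what keeps strings such as `H2O` out of the results, which hints at why precision is easier to reach than recall for this kind of pattern-based extraction.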
13

Next Generation Programming

Compose Software Without Writing Any Programming Code

"Next Generation Programming - Programming Without Coding Software" is a drag-and-drop wizard for creating simple or complex applications without writing any code. The software is written in Java and is aimed at both novice and expert programmers, who can build software with visual tools such as drag-and-drop components and visual editors. It can be used to compose simple or complex applications: database programs, circuit design, and code generation and upload to chips for designed circuits (ESP8266, ESP32). According to the project, the software is much simpler to use than PWCT (https://sourceforge.net/projects/doublesvsoop/) and has more features, such as SCADA. Start by looking at the examples on the website to learn the software's features and how to use it in a short time. More information (documents, videos, examples): negep.epizy.com

2 Reviews

Downloads: 1 This Week

Last Update: 2022-01-14
14

Aikernel

The Aikernel is an intelligence server and cell runtime environment that uses natural language processing and other pattern matching with Activators, Contexts, and Concepts to allow multitasking between installed cells.

2 Reviews

Downloads: 0 This Week

Last Update: 2013-03-08
15

Ansj Chinese word segmentation

Ansj word segmentation

A Java implementation of ict. Word segmentation is faster than in the open source version of ict. This is a Java implementation of Chinese word segmentation based on n-Gram+CRF+HMM. Segmentation speed reaches about 2 million words per second (tested on a Mac Air), and accuracy can exceed 96%. It currently supports Chinese word segmentation, Chinese name recognition, part-of-speech tagging, user-defined dictionaries, keyword extraction, automatic summarization, and keyword tagging. It can be applied to natural language processing and related areas, and suits projects that need high-quality word segmentation.

1 Review

Downloads: 0 This Week

Last Update: 2021-09-22
16

Apache OpenNLP

Apache OpenNLP

Apache OpenNLP is a machine learning-based NLP library that provides tools for text-processing tasks such as tokenization, sentence segmentation, and named entity recognition.

Downloads: 0 This Week

Last Update: 6 days ago
17

AutoSummary Semantic Analysis Engine

AutoSummary uses Natural Language Processing to generate a contextually-relevant synopsis of plain text. It uses statistical and rule-based methods for part-of-speech tagging, word sense disambiguation, sentence deconstruction and semantic analysis.

1 Review

Downloads: 0 This Week

Last Update: 2013-03-25
18

BioEvent

This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read BioNLP shared task format (http://2011.bionlp-st.org/) and i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters for an event in a text. The method is based on SVM but other ML algorithms can be adopted. The method details are explained in the following paper: Ehsan Emadzadeh, Azadeh Nikfarjam, and Graciela Gonzalez. 2011. Double Layered Learning for Biological Event Extraction from Text. In Proceedings of the BioNLP 2011 Workshop Companion Volume for Shared Task, Portland, Oregon, June. Association for Computational Linguistic

Downloads: 0 This Week

Last Update: 2013-04-25
19

BioNLP

BioNLP is an initiative by the University of Colorado Denver Health Sciences Center to create and distribute code, software, and data for applying natural language processing techniques to biomedical texts

Downloads: 0 This Week

Last Update: 2022-10-26
20

BioNLP UIMA Component Repository

The BioNLP UIMA Component Repository provides UIMA wrappers for novel and well-known third-party NLP tools used in biomedical text processing, such as tokenizers, parsers, named entity taggers, and tools for evaluation.

Downloads: 0 This Week

Last Update: 2014-07-09
21

CoPT, Corpus Processing Tools

CoPT, Corpus Processing Tools, is a set of Java classes intended to assist field linguists, NLP researchers, students, and software developers in all corpus-related processing.

Downloads: 0 This Week

Last Update: 2013-03-11
22

Common Resource Grep - crgrep

Common Resource Grep

CRGREP searches for matching text in databases, various document formats, archives, and other difficult-to-access resources. It is a command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies, and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, for example text within a document within a zip archive. Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/). The actual development and issue tracking can be found here: https://bitbucket.org/cryanfuse/crgrep

3 Reviews

Downloads: 0 This Week

Last Update: 2023-04-23
23

Commonwealth English

Free and open source grammar checker. It can currently identify two kinds of error:
1) Incomplete sentence (fragment). Example: "John a very man."
2) Subject-verb plurality agreement. Example: "They walks into a classroom."

This software uses part-of-speech tagging software developed and published by the Natural Language Processing Group at Stanford University. Many thanks! (Full citation in README.)

Downloads: 0 This Week

Last Update: 2014-06-28
24

D.U.C.K

D.U.C.K (Determine segmentation of Unknown words by using Context Knowledge) is an NLP tool that aims to find the correct segmentation for unknown words in written Hebrew. Statistics from different scopes are used to determine the segmentation.

Downloads: 0 This Week

Last Update: 2015-08-05
25

DGiovanni

A multi-agent architecture for building interactive dramas. It uses Jason's BDI engine, and Jason's agent-oriented programming language is used both for drama management and for authoring character behaviors.

Downloads: 0 This Week

Last Update: 2013-04-26