text processing free download

Showing 17 open source projects for "text processing"

1

WebHarvest - web data extraction tool

Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.

14 Reviews

Downloads: 3 This Week

Last Update: 2025-10-27
2

cpDetector

cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techniques. A command-line executable is shipped that can sort documents by codepage.

Downloads: 7 This Week

Last Update: 2018-04-05
3

SEO & SEM - Marketing Text Writer

Open Source SEO & SEM Text Creation Tools for free Article Writer

Open source tool for search engine optimization (SEO & SEM), used for automatic content processing. These SEO content generators and article writers are based on Text Writer: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://googleduplicatecontentsolver.sourceforge.io/ https://inkassos.github.io/inkasso/ https://www.artikelschreiber.com/opensource/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/marketing/review/ https://muckrack.com/markus-muller https://linktr.ee/textgenerator The code contains Perl source code, language databases, and more.

1 Review

Downloads: 0 This Week

Last Update: 2025-03-16
4

Modular Suite of NLP Tools

This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.

Downloads: 0 This Week

Last Update: 2014-06-09
5

NGramJ

Provides a robust and efficient implementation of n-gram based classifiers for Java. N-gram algorithms have been shown to be surprisingly good at tasks like guessing the language or encoding of an arbitrary text file, and they have many more applications.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-17
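
To illustrate the kind of n-gram language guessing described above, here is a minimal Python sketch of one well-known approach: ranking character n-grams by frequency and comparing rank positions (an "out-of-place" distance). It is not NGramJ's API or necessarily its algorithm. The function names, the tiny training samples, and the `top=300` cutoff are all assumptions made for illustration.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top=300):
    """Map the most frequent character n-grams (length 1..n_max) to their rank."""
    # Normalise case and whitespace, and pad so word boundaries form n-grams too.
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(
        padded[i:i + n]
        for n in range(1, n_max + 1)
        for i in range(len(padded) - n + 1)
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ranked[:top])}


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed penalty."""
    max_penalty = len(lang_profile)
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else max_penalty
        for gram, rank in doc_profile.items()
    )


def guess_language(text, profiles):
    """Return the key of the profile closest to the text's own profile."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Hypothetical, deliberately tiny training samples; real profiles need far more text.
TRAINING = {
    "english": "the quick brown fox jumps over the lazy dog "
               "and then the dog runs away from the fox",
    "german": "der schnelle braune fuchs springt über den faulen hund "
              "und dann läuft der hund vor dem fuchs weg",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in TRAINING.items()}
```

Calling `guess_language(some_text, PROFILES)` returns whichever label has the smallest distance. With samples this small the guess is unreliable for unseen text, so in practice each profile would be built from a much larger corpus.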
6

PDFBox

PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.

7 Reviews

Downloads: 5 This Week

Last Update: 2016-09-18
7

Phorminx

(Almost) everything a scholar in the Humanities needs (polytonic Greek fonts, stylistic and metrical analysis tools, search engines on TLG and PHI) on a single Linux Live CD, ready to use anywhere, at home or at university, without installation.

Downloads: 0 This Week

Last Update: 2013-04-05
8

ExtMiner

Prototype for a framework and user interface for combining various structured search and document clustering techniques.

Downloads: 0 This Week

Last Update: 2021-01-08
9

Estraier

Estraier is a personal full-text search system for web sites, local file systems, mailboxes, and so on. Estraier has a flexible interface and can handle multilingual documents and various file formats with external plug-ins.

Downloads: 0 This Week

Last Update: 2013-03-12
10

Infomap NLP Software

The Infomap NLP software performs automatic indexing of words and documents from free-text corpora, using a variant of LSA to enable information retrieval and other applications. It was developed by the Infomap Project at Stanford University's CSLI.

Downloads: 0 This Week

Last Update: 2013-06-03
11

FlySearch

Fast local file search using Lucene, HTMLParser, and Highlighter. Now supports Chinese.

Downloads: 0 This Week

Last Update: 2014-06-26
12

UCECS

The "Universal Content Evaluation and Categorisation Software" is a program for analysing the content of a website or, more generally, of a text. The text is arranged into dozens of categories, permitting more efficient web searches and information processing.

Downloads: 0 This Week

Last Update: 2013-03-07
13

DocConversion

The DocConversion project provides a distributed document conversion solution with a well-defined API, which makes use of existing conversion tools and/or a centralized conversion server. This is part of the PRONIR research at http://www.pronir.nl

Downloads: 0 This Week

Last Update: 2013-04-02
14

UT Educational IR Package

This code supplies miniature pedagogical Java implementations of information retrieval, spidering, and text-processing software. It was initially developed for an introductory course on Intelligent Information Retrieval and Web Search at UT Austin.

Downloads: 0 This Week

Last Update: 2013-03-08
15

XML Highlighter

A highlighter for XML documents, written in Java. It uses regular expressions to search a set of DOM nodes and transparently handles highlighting matches that span multiple elements. Highlight events are passed to a user-defined highlighter for processing.

Downloads: 0 This Week

Last Update: 2013-02-25
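
The idea of matching a regular expression across element boundaries can be sketched in Python as follows. This is not XML Highlighter's Java API. It is a minimal illustration that assumes one plausible technique: concatenate the text nodes, run the regex on the joined string, then map each match back to the nodes it overlaps. The names `text_nodes` and `find_spanning_matches` are hypothetical.

```python
import re
from xml.dom import minidom


def text_nodes(node):
    """Yield the text nodes below `node` in document order."""
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            yield child
        else:
            yield from text_nodes(child)


def find_spanning_matches(xml_source, pattern):
    """For each regex match, list (parent tag, matched fragment) per text node."""
    doc = minidom.parseString(xml_source)
    nodes = list(text_nodes(doc.documentElement))

    # Record the offset at which each text node starts in the joined text.
    starts, pos = [], 0
    for node in nodes:
        starts.append(pos)
        pos += len(node.data)
    full_text = "".join(node.data for node in nodes)

    results = []
    for match in re.finditer(pattern, full_text):
        pieces = []
        for node, start in zip(nodes, starts):
            end = start + len(node.data)
            # Intersect the match span with this node's span.
            lo, hi = max(start, match.start()), min(end, match.end())
            if lo < hi:
                fragment = node.data[lo - start:hi - start]
                pieces.append((node.parentNode.tagName, fragment))
        results.append(pieces)
    return results


print(find_spanning_matches("<p>Hello <b>wor</b>ld!</p>", r"world"))
# prints [[('b', 'wor'), ('p', 'ld')]]
```

Each inner list stands in for the "highlight events" a user-defined handler would receive: one fragment per text node, so the handler can wrap each piece separately without breaking the element structure.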
16

CarpeNomen

A software tool to discover the names of people in electronic documents and HTML markup; note the use of the word 'discover' rather than 'search'. Using this tool, the associations between names in documents can be inferred.

Downloads: 0 This Week

Last Update: 2013-04-12
17

NLP WebCrawler

A WebCrawler for Natural Language Processing. This WebCrawler searches for monolingual (in a specified language) and bilingual, parallel text.

Downloads: 0 This Week

Last Update: 2013-04-24