Open Source Linux Text Processing Software - Page 3

Text Processing Software for Linux

View 9 business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Find Hidden Risks in Windows Task Scheduler Icon
    Find Hidden Risks in Windows Task Scheduler

    Free diagnostic script reveals configuration issues, error patterns, and security risks. Instant HTML report.

    Windows Task Scheduler might be hiding critical failures. Download the free JAMS diagnostic tool to uncover problems before they impact production—get a color-coded risk report with clear remediation steps in minutes.
    Download Free Tool
  • 1
    PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 2
    Camomile is a Unicode library for ocaml. Camomile provides Unicode character type, UTF-8, UTF-16, UTF-32 strings, conversion to/from about 200 encodings, collation and locale-sensitive case mappings, and more.
    Downloads: 38 This Week
    Last Update:
    See Project
  • 3
    Early Access iText, a PDF generation library in Java
    Downloads: 19 This Week
    Last Update:
    See Project
  • 4
    Notepad3

    Notepad3

    Light-weight Scintilla-based text editor with syntax highlighting

    Notepad3 is a fast and light-weight Scintilla-based text editor with syntax highlighting. Notepad3 is an excellent replacement for the default Windows text editor. Notepad3 offers many extra features over Notepad. It has a small memory footprint, but is powerful enough to handle most programming jobs.
    Downloads: 18 This Week
    Last Update:
    See Project
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • 5
    xmltoman and xmlmantohtml are two small scripts to convert xml to man pages in groff format or html. It features the usual man page items such a "description", "options", "see also" etc.
    Leader badge
    Downloads: 31 This Week
    Last Update:
    See Project
  • 6
    Vrapper

    Vrapper

    Vim-like editing in Eclipse

    Vrapper is an eclipse plugin which acts as a wrapper for existing eclipse text editors to provide a Vim-like input scheme for moving around and editing text. Eclipse Update Site: http://vrapper.sourceforge.net/update-site/stable
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    Leader badge
    Downloads: 30 This Week
    Last Update:
    See Project
  • 8
    Create beautiful song books for your church or fellowship using this LaTeX package and related tools.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 9
    Stanford CoreNLP

    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    CoreNLP is your one stop shop for natural language processing in Java! CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports 6 languages, Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text, run a series of NLP annotators on the text, and produce a final set of annotations. Pipelines produce CoreDocuments, data objects that contain all of the annotation information, accessible with a simple API, and serializable to a Google Protocol Buffer. CoreNLP generates a variety of linguistic annotations, including parts of speech, named entities, dependency parses, and coreference.
    Downloads: 1 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 10
    TextBlob

    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic, text processing, Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both. Supports word inflection (pluralization and singularization) and lemmatization, as well as spelling correction. Add new models or languages through extensions. Also, it comes with a WordNet integration. If you only intend to use TextBlob’s default models (no model overrides), you can pass the lite argument. This downloads only those corpora needed for basic functionality. TextBlob is also available as a conda package.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    natural

    natural

    General natural language facilities for node

    "Natural" is a general natural language facility for nodejs. It offers a broad range of functionalities for natural language processing. Tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflections are currently supported. It’s still in the early stages, so we’re very interested in bug reports, contributions and the like. Note that many algorithms from Rob Ellis’s node-nltools are being merged into this project and will be maintained from here onward. While most of the algorithms are English-specific, contributors have implemented support for other languages. Russian stemming has been added and Spanish stemming has been added, as well. Stemming and tokenizing in more languages have been added. If you’re just looking to use natural without your own node application, you can install via NPM.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Anaphraseus
    Anaphraseus is a CAT (Computer Aided Translation) tool, OpenOffice.org 2-3 macro set for OpenOffice/LibreOffice Writer similar to famous Wordfast. It works with the Wordfast Translation Memory format (*.TXT), and supports text segmentation.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 13
    LaTeX-Mk is a collection of makefile fragments for managing small to large LaTeX based documentation projects. The idea is that especially large documents, there may be many many steps required to typeset the document (export modified figures to postscr
    Leader badge
    Downloads: 14 This Week
    Last Update:
    See Project
  • 14
    The XSD editor is a cross-platform XML editor. Although it can be used to edit any type of XML file, the editor is specifically designed to allow easy creation, editing, and validation of XML Schema (XSD) files.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 15
    PDF Clown

    PDF Clown

    General-Purpose PDF Library for Java and .NET

    PDF Clown is a general-purpose Java and .NET library for manipulating PDF files through multiple abstraction layers, rigorously adhering to PDF 1.7 specification (ISO 32000-1). This project aims to provide a universal access to PDF files (creation, reading, editing, rendering...) through an accurate and elegant object-oriented API. * Features: http://pdfclown.org/overview/features/ * Overview: http://pdfclown.org/overview/architecture/ * Website: http://pdfclown.org/ * Blog: http://www.pdfclown.org/blog/ * Twitter: https://twitter.com/PDFClown
    Downloads: 5 This Week
    Last Update:
    See Project
  • 16
    RText is a customizable programmer's text editor written in Java. Some of its features include: syntax highlighting, editing multiple documents at once, printing and print preview, find/replace/find in files dialogs, undo/redo, and online help.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 17
    Text Hex Convert v2.1.1

    Text Hex Convert v2.1.1

    THC - Convert Hex, or String with ease!

    A tool for converting hex values with ease! Download now, and you wont get your head hurt again. This tool converts Text values to Hex values, and vice-versa. Written in VB.NET. Created solely for aiding ROM Hacking at first, but became rather an important tool. Source Code: https://sourceforge.net/u/kakarot1212/profile/
    Downloads: 10 This Week
    Last Update:
    See Project
  • 18
    Jericho HTML Parser is a java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 19
    TeXML is an XML vocabulary for TeX. The processor transforms the TeXML markup into the TeX markup, escaping special and out-of-encoding characters. The intended audience is developers who automatically generate [La]TeX or ConTeXt files.
    Leader badge
    Downloads: 17 This Week
    Last Update:
    See Project
  • 20

    omegat-plugins

    OBSOLETE AS OF OMEGAT 3.0.3. DO NOT USE.

    OBSOLETE AS OF OMEGAT 3.0.3. DO NOT USE. Third-party plugins for OmegaT (https://sourceforge.net/projects/omegat)
    Downloads: 4 This Week
    Last Update:
    See Project
  • 21
    RefDB is a reference database and bibliography tool for SGML, XML, and LaTeX documents, sort of a Reference Manager or BibTeX for markup languages. It is portable and known to run on Linux, Free/NetBSD, OSX, Solaris, and Windows/Cygwin.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 22
    RTF to HTML converter for use both with your applications and as a standalone tool. Small and fast. Processes tables better than any other tool I've seen.
    Leader badge
    Downloads: 15 This Week
    Last Update:
    See Project
  • 23
    Utility to create a text table from delimited text input.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 24
    Track changes in LaTeX documents. The goal is to provide editing facilities as known from word processors like Microsoft Word or OpenOffice Writer for LaTeX. The project comprises a LaTeX package and additional software to accept/reject changes etc.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 25
    Web Book Downloader

    Web Book Downloader

    Download websites as e-book: pdf, txt, epub.

    This application allows user to download chapters from website in 3 ways: - from table of contents; - from range: first chapter address, last chapter address; - by crawling from first chapter to n; In settings you can customize language, input(website encoding) for simplicity output is in the same encoding. If you want your language add new class into strings package, and new fields into Settings class and GUI menu(initialize method).
    Downloads: 8 This Week
    Last Update:
    See Project