Open Source Linux Text Processing Software - Page 3

Text Processing Software for Linux

  • 1
    Colorer Library
    Colorer provides syntax highlighting services for source text. It colorizes source code in editor systems and supports more than 200 syntaxes. It uses the powerful HRC format (XML, regular expressions, context-free grammars), which allows it to support any language. Available as an Eclipse plugin.
    Downloads: 6 This Week
  • 2
    compromise

    Modest natural-language processing

    Language is complicated and there's a gazillion words. Compromise is a JavaScript library that interprets and pre-parses text and makes some reasonable decisions so things are way easier. Compromise tries its best to parse text. It is small, quick, and often good enough. It is not as smart as you'd think. Conjugate and negate verbs in any tense. Play between plural, singular and possessive forms. Interpret plain-text numbers. Handle implicit terms. Use it on the client side or as an ES module. Compromise is 180kb (minified). It's pretty fast. It can run on keypress. It works mainly by conjugating all forms of a basic word list. Decide how words get interpreted or make heavier changes with a compromise-plugin. Parse text without running POS-tagging. Pre-parse any match statements for faster lookups. It is not the most accurate or clever NLP library, but it has found its niche as an easy, small library that can run everywhere.
    Downloads: 1 This Week
  • 3
    fastText

    Library for fast text classification and representation

    FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to fit even on mobile devices. Text classification is a core problem in many applications, like spam detection, sentiment analysis or smart replies. In this tutorial, we describe how to build a text classifier with the fastText tool. The goal of text classification is to assign documents (such as emails, posts, text messages, product reviews, etc.) to one or more categories. Such categories can be review scores, spam vs. non-spam, or the language in which the document was typed. Nowadays, the dominant approach to building such classifiers is machine learning, that is, learning classification rules from examples. To build such classifiers, we need labeled data, which consists of documents and their corresponding categories (or tags, or labels).
    Downloads: 1 This Week
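The labeled data mentioned in the fastText entry can be sketched as follows. This is a minimal illustration, assuming fastText's default `__label__` prefix and its one-example-per-line training format. The helper names (`to_fasttext_line`, `write_training_file`, `train`) and the toy examples are hypothetical and do not come from the listing. The training step needs the optional `fasttext` Python package, so it is imported lazily.

```python
from pathlib import Path

LABEL_PREFIX = "__label__"  # fastText's default label prefix


def to_fasttext_line(text, labels):
    """Format one document and its labels as a single training line."""
    tags = " ".join(f"{LABEL_PREFIX}{label}" for label in labels)
    # One example per line, so collapse newlines and repeated whitespace.
    return f"{tags} {' '.join(text.split())}"


def write_training_file(examples, path):
    """Write (text, labels) pairs to `path`, one example per line."""
    lines = [to_fasttext_line(text, labels) for text, labels in examples]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def train(path):
    """Train a supervised classifier (requires the `fasttext` package)."""
    import fasttext  # lazy import: the helpers above work without it
    return fasttext.train_supervised(input=str(path))


# Hypothetical toy examples, for illustration only.
EXAMPLES = [
    ("Win a free prize now", ["spam"]),
    ("Meeting moved to 3pm", ["ham"]),
    ("Great phone,\nfast delivery", ["positive", "electronics"]),
]

if __name__ == "__main__":
    for text, labels in EXAMPLES:
        print(to_fasttext_line(text, labels))
```

A document may carry several labels, which is why each example takes a list. A real classifier needs far more than three examples to be useful.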
  • 4
    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools or pip and easy to install. To use this library, you need Java 7+ installed on your system, because tika-python starts the Tika REST server in the background. To get this working in a disconnected environment, download a Tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set the TIKA_SERVER_JAR environment variable to TIKA_SERVER_JAR="file:////tika-server.jar". This tells tika-python to "download" the file, move it to /tmp/tika-server.jar, and run it as a background process. This is the only way to run tika-python without internet access. Without this set, the default is to check the Tika version and pull the latest from Apache every time.
    Downloads: 1 This Week
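The offline setup described in the tika-python entry might look like this in practice. This is a sketch under assumptions: the jar location `/opt/tika/tika-server.jar` is a hypothetical path, and the helper names are illustrative only. The variable is set before `tika` is imported because the library appears to read it when the module loads. `parser.from_file` is tika-python's documented entry point, and it needs Java plus the jar (with its `.md5` file beside it) to actually run.

```python
import os
from pathlib import Path


def configure_offline_tika(jar_path):
    """Point tika-python at a local server jar through TIKA_SERVER_JAR."""
    # as_uri() turns an absolute path into a file:// URI.
    uri = Path(jar_path).absolute().as_uri()
    os.environ["TIKA_SERVER_JAR"] = uri
    return uri


def extract_text(document_path):
    """Parse a document with Tika and return its plain-text content."""
    # Imported here so the environment variable is already set,
    # and so this sketch loads even where tika is not installed.
    from tika import parser
    parsed = parser.from_file(str(document_path))
    return parsed.get("content")


if __name__ == "__main__":
    # Hypothetical jar location; adjust to wherever the files were copied.
    print(configure_offline_tika("/opt/tika/tika-server.jar"))
```

Calling `configure_offline_tika` once at program start, before any call to `extract_text`, keeps the library from trying to reach Apache's servers.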
  • 5
    ADAPRO

    Word processor designed for users with learning difficulties.

    ADAPRO is a free-to-use word processor geared towards individuals with a learning difficulty like dyslexia or a developmental disorder such as autism. Its adapted, seamless and configurable interface provides a simplified environment that can be relied upon, fostering the user's sustained attention. It can be downloaded and used completely free of charge for any purpose. Supports English, Spanish and Portuguese. If Java 6 or higher is already present on the computer, it does not even require installation. This project has been part-funded by the European Regional Development Fund under the PCT-MAC 2007-2013 programme.
    Downloads: 9 This Week
  • 6
    Notepad3

    Light-weight Scintilla-based text editor with syntax highlighting

    Notepad3 is a fast and light-weight Scintilla-based text editor with syntax highlighting. Notepad3 is an excellent replacement for the default Windows text editor. Notepad3 offers many extra features over Notepad. It has a small memory footprint, but is powerful enough to handle most programming jobs.
    Downloads: 14 This Week
  • 7
    Text Encoding Initiative

    TEI produces the TEI Guidelines and associated software

    The TEI is an international and interdisciplinary standard used by libraries, museums, publishers, and academics to represent all kinds of literary and linguistic texts, using an encoding scheme that is maximally expressive and minimally obsolescent.
    Downloads: 4 This Week
  • 8
    Diff-ext is an extension for file managers such as Windows Explorer and Nautilus that lets you launch diff/merge tools on selected files.
    Downloads: 7 This Week
  • 9
    PTools is a set of useful tools written in Pascal. It includes a scientific calculator, an archiver, a text editor, remote administration and more. It is designed to be portable across operating systems, especially Java-based mobiles, Windows and Unixes.
    Downloads: 7 This Week
  • 10
    Create beautiful song books for your church or fellowship using this LaTeX package and related tools.
    Downloads: 8 This Week
  • 11

    joy of text

    Editor with scripting language, security features & system interfaces.

    Jot was developed as a general-purpose editor for large CAD files. Its command-driven UI requires no mode switching and hence requires fewer keystrokes to get a typical job done. It is particularly useful for checking and cross-referencing between several source, intermediate and output files, a common requirement for CAD work. But jot's usefulness doesn't stop there. Its sophisticated search features can, for example, be used for interactive data mining or for automating the extraction of numerical and textual data and reports from arrays of large text files. Its adaptable user interface can be programmed to emulate emacs or vi UIs or mouse-driven systems, but who would want to do a thing like that? The display is highly configurable, supporting popups, menus, mouse-event callbacks, etc. The jot language is terse, powerful and well supported, with a useful debugger and many diagnostic features. Importantly, no mode change is required to enter commands in its native language.
    Downloads: 22 This Week
  • 12
    Regular Expression Editor (RegExpEditor)

    regex as a tool, not as a problem

    Regular expressions (aka regex, regexp) made easy. This simple tool manipulates text with regular expressions and highlights the matches. See the real power of regex! Use Scala to manipulate your search results even further.
    Downloads: 5 This Week
  • 13
    GATE
    Note: the source code and issue tracker have now moved to GitHub. Find us at https://github.com/GateNLP/. GATE (General Architecture for Text Engineering) is an architecture, framework and development environment for developing, evaluating and embedding Human Language Technology. See http://gate.ac.uk for full details.
    Downloads: 4 This Week
  • 14
    Early Access iText, a PDF generation library in Java
    Downloads: 11 This Week
  • 15
    Downloads: 17 This Week
  • 16
    NoteCase is a portable (Linux, Win32) hierarchical notes manager (aka outliner) coded in C++ using the GTK+ toolkit. The project is inactive as of 2008/12/09; read more at: http://factoriel.blogspot.com/2008_12_01_archive.html
    Downloads: 15 This Week
  • 17
    Tomoe is a handwriting character recognition engine.
    Downloads: 15 This Week
  • 18
    RText is a customizable programmer's text editor written in Java. Some of its features include: syntax highlighting, editing multiple documents at once, printing and print preview, find/replace/find in files dialogs, undo/redo, and online help.
    Downloads: 4 This Week
  • 19
    WYSIWYG .NET

    WYSIWYG html editor for .NET (C#, VB.NET)

    WYSIWYG .NET editor is an HTML editor that attempts to display the web page as it will appear in the browser. It's a visual editor, so you don't manipulate the code directly.
    Downloads: 5 This Week
  • 20
    Sinhala Unicode Writer

    Sinhala Unicode Writer For Linux

    Sinhala Unicode Writer is a Sinhala Unicode writing tool for Linux. This application helps you type whatever you want in Sinhala, just as if you were typing an SMS. It is very easy and simple to use. All you have to do is type your text with an English keyboard, and the app transliterates it into Sinhala for you, offline.
    Downloads: 14 This Week
  • 21
    Jaxe
    Jaxe is a free Java XML editor with a configurable GUI, using XML schemas for validation and XSL for exports in HTML or XML.
    Downloads: 3 This Week
  • 22
    Web Book Downloader

    Download websites as e-book: pdf, txt, epub.

    This application lets the user download chapters from a website in three ways: from the table of contents; from a range (first chapter address to last chapter address); or by crawling from the first chapter to chapter n. In the settings you can customize the language and the input (website) encoding; for simplicity, the output uses the same encoding. To add your own language, add a new class to the strings package and new fields to the Settings class and the GUI menu (initialize method).
    Downloads: 7 This Week
  • 23
    The XSD editor is a cross-platform XML editor. Although it can be used to edit any type of XML file, the editor is specifically designed to allow easy creation, editing, and validation of XML Schema (XSD) files.
    Downloads: 7 This Week
  • 24
    An alternative to the string library for C and C++ which is more functional and does not have buffer overflow problems.
    Downloads: 12 This Week
  • 25
    Conky GUI
    Conky GUI eases the customization of Conky configuration files.
    Downloads: 5 This Week