88 projects for "text processing" with 2 filters applied:

  • 1
    Unredact

    A simple tool for reading in poorly redacted documents

    Unredact is a specialized tool that attempts to reconstruct redacted or obscured text in images, PDFs, or screenshots using a combination of image processing and generative AI inference to suggest plausible completions of blurred, black-boxed, or jumbled content. Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and linguistic patterns to produce candidate reconstructions. ...
    Downloads: 12 This Week
  • 2
    PDFCraft

    PDFCraft is a free, privacy-focused PDF toolkit

    PDFCraft is an extensible toolkit for creating, editing, and transforming PDF documents with both a graphical interface and a scripting API, making it useful for users ranging from casual editors to automated document processors. At its core, the project provides a clean, modern UI where you can rearrange pages, annotate text, insert images, fill forms, and export to multiple formats, all without needing a heavyweight commercial PDF suite. But beyond manual editing, it also offers a...
    Downloads: 31 This Week
  • 3
    XML Copy Editor
    XML Copy Editor is a fast, free, validating XML editor.
    Downloads: 712 This Week
  • 4
    jsonrepair

    Repair invalid JSON documents

    ...It is especially useful in workflows involving AI-generated content, manually edited configuration files, or unreliable external APIs where malformed JSON frequently occurs. jsonrepair supports both browser and Node.js environments, making it suitable for client-side validation tools and backend processing pipelines alike. The project focuses on automation and fault tolerance, reducing the need for manual cleanup of corrupted JSON data. Its lightweight architecture and practical functionality have made it valuable for modern applications that process unpredictable structured text.
    Downloads: 0 This Week
  • 5
    biber
    Biber is a sophisticated bibliography processing backend for the LaTeX biblatex package. It supports an unsurpassed feature set for automated conformance to complex bibliography style requirements such as labelling, sorting and name handling. It has comprehensive Unicode support.
    Downloads: 332 This Week
  • 6
    Microsoft Works format import library
    libwps is a Microsoft Works file format import filter built on top of librevenge (see https://sourceforge.net/p/libwpd/wiki/librevenge/ ). Currently, libwps can import all word processing Works formats since about 1995 with some success. It may also be able to import some basic database and spreadsheet files.
    Downloads: 338 This Week
  • 7
    biblatex
    Biblatex is a LaTeX package that provides full-featured bibliographic facilities.
    Downloads: 25 This Week
  • 8

    xmlj

    XMLJ is a Java XML Editor and validator project.
    Downloads: 0 This Week
  • 9
    WebHarvest - web data extraction tool
    A web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 10

    Change File Encoding

    Change encoding of text files.

    Change File Encoding is a utility that allows you to change the encoding of text files. For example, files saved in US-ASCII can be converted to UTF-8. Over 170 encodings are supported. Requires Java 1.8 or higher.
    Downloads: 6 This Week
  • 11
    EpiDoc: Epigraphic Documents in TEI XML

    XML text markup for ancient documents

    The EpiDoc Collaborative is developing specifications and tools for standards-based, digital publication and interchange of scholarly and educational editions of documentary and literary texts like inscriptions and papyri. The link below will take you to the EpiDoc home page on this site.
    Downloads: 4 This Week
  • 12
    RTextDoc

    An editor for structured documents

    RTextDoc is an editor for structured text documents such as LaTeX, AsciiDoc, and DocBook. RTextDoc has proofreading capabilities: on-the-fly spelling, instant grammar checking, and built-in free dictionaries. It also has syntax highlighting, bracket matching, folding, a document structure browser for sections and labels, bookmarks, a manager for LaTeX symbols, an editor for mathematical equations, an integrated BibTeX database manager, and several tools to convert LaTeX to HTML and back. AsciiDoc...
    Downloads: 1 This Week
  • 13
    XSLT syntax highlighting

    Java based XSLT Processor extension for syntax highlighting

    Please note that the project has moved to GitHub: https://github.com/xmlark/xslthl . This is an implementation of syntax highlighting as an extension module for XSLT processors (Xalan, Saxon). If you have, for example, an article about programming written in DocBook, its code examples can be syntax highlighted automatically during the XSLT processing phase.
    Downloads: 82 This Week
  • 14
    Fen2eps is a small console program that converts FEN (Forsyth-Edwards Notation) strings to EPS (Encapsulated PostScript) files containing the chess board diagram.
    Downloads: 0 This Week
  • 15
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Downloads: 202 This Week
  • 16
    unfluff

    Automatically extract body content (and other cool stuff) from HTML

    unfluff is a Node.js library designed to automatically extract the main content from an HTML document — stripping away navigation bars, ads, footers and other boilerplate to leave you with the “body content”, metadata (title, author, date) and other useful fields. It’s a tool very much aimed at content-analysis, web scraping, building datasets, or repurposing article text for downstream processing (like machine-learning or summarization). The API is simple: you feed in raw HTML and it returns a structured object with the extracted text and other fields. It supports caching internal representations to speed up repeated extractions. While its language support is best for English, it is still widely used in web-content-processing pipelines. ...
    Downloads: 0 This Week
  • 17
    Create beautiful song books for your church or fellowship using this LaTeX package and related tools.
    Downloads: 2 This Week
  • 18
    A Swiss Army Knife GUI application for PDF documents: combine, split, rotate, reorder (n-up, booklet), watermark, edit bookmarks/fileinfo/pagetransition, compress, encrypt, decrypt, sign, repair, edit attachments and more.
    Downloads: 75 This Week
  • 19
    Gallop

    A framework for building smooth asynchronous iOS apps

    Gallop is a powerful rich text framework that supports asynchronous display. It encapsulates CoreText's rich text functions and commonly used image processing capabilities. Use an LWTextStorage object instead of a UILabel object and an LWImageStorage object instead of a UIImageView object, and Gallop will make sure your app scrolls smoothly. You can also use Gallop to parse HTML pages and, with custom processing, turn them into native iOS pages.
    Downloads: 1 This Week
  • 20
    Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 2 This Week
  • 21
    Jaxe
    Jaxe is a free Java XML editor with a configurable GUI, using XML schemas for validation and XSL for exports in HTML or XML.
    Downloads: 8 This Week
  • 22
    This is an Eclipse XML editor with several editing capabilities. The main features concern interaction with the classes and resources declared in XML (Open class/resource, Create class), similar to the interaction between classes in the Java editor.
    Downloads: 0 This Week
  • 23
    A Java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language, creating XML output. Also contains a few utility classes for HTML, CSV and text parsing, and additional character sets.
    Downloads: 0 This Week
  • 24
    Simple Java delimited and fixed-width file parser. Handles CSV, Excel CSV, tab, and pipe delimiters, to name a few. Maps column positions in the file to user-friendly names via XML. See "FlatPack Feature List" under News for the complete feature list.
    Downloads: 1 This Week
  • 25
    writeup
    Programming language for converting source documents into HTML or XML. Writeup is a combination of a markup language (similar to markdown) and a macro pre-processing language that enables a formal production system to be set up for documents.
    Downloads: 0 This Week