23 projects for "text processing" with 2 filters applied:

  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well proved XML and text processing techologies in order to easely extract useful data from arbitrary web pages.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    unfluff

    unfluff

    Automatically extract body content (and other cool stuff) from HTML

    unfluff is a Node.js library designed to automatically extract the main content from an HTML document — stripping away navigation bars, ads, footers and other boilerplate to leave you with the “body content”, metadata (title, author, date) and other useful fields. It’s a tool very much aimed at content-analysis, web scraping, building datasets, or repurposing article text for downstream processing (like machine-learning or summarization). The API is simple: you feed in raw HTML and it returns a structured object with the extracted text and other fields. It supports caching internal representations to speed up repeated extractions. While its language support is best for English, it is still widely used in web-content-processing pipelines. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Gallop

    Gallop

    A framework for build smooth asynchronous iOS APP

    Gallop is a powerful rich text framework that supports Asynchronous display. It encapsulates CoreText's rich text functions and commonly used image processing capabilities. just need use LWTextStorage object instead of UILabel object and use LWImageStorage object instead of UIImageView object,Gallop will make sure your app scroll smoothly. You can also use Gallop to parse HTML pages and customize machining to parse HTML pages into iOS native pages.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Jericho HTML Parser is a java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 2 This Week
    Last Update:
    See Project
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • 5
    A java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language, creating XML output. Also contains a few utility classes for HTML, CSV and text parsing, and additional character sets.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    DocFrac is a document converter that can convert between RTF, HTML and ASCII text. This includes RTF to HTML and HTML to RTF. Supports text formatting (e.g. bold); tables; and most European languages. Available for Windows; Linux; ActiveX and DLL.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    A stand-alone editor using Mediawiki markup language to generate HTML code. You can create and preview pages written using Mediawiki markup (i.e. Wikipedia pages) while off-line.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    nanoWIME is a simple, flexible, easy-to-use javascript based WikiMarkup editor.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    RTF2HTML is a name for a cross-platform C++ library (DLL, OCX) and command-line utility, which is intended to convert documents from Rich Text Format (e.g. Word, OO Writer) to HTML. Its features are tiny size, speed, low mem usage and compact output.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 10
    A JavaScript library for parsing Creole 1.0 wiki markup.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    A freely-available Markdown text-to-HTML translator, written in C++, intended for integration into C++ programs rather than for use in web applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    NOTE: unsupported - do you want to maintain this project? contact me! Markdownify is a HTML to Markdown converter written in PHP. See it as the successor to `html2text.php` since it has better design, better performance and less corner cases.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    xBB-code is the PHP library to parse and edit text formatted with BBCode.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    ZML, the Zeitung Markup Language, is a simple CMS for small newspapers. It was specifically designed to publish a student newspaper in print and on the Web. It uses LaTeX and XHTML. So far, it is documented in German only.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Web Content Management Element (WCME) is an editable area on browser page. User can change text and text styles (CSS) then persistently save changed content into web application resources. WCME is written on Javascript with Prototype.js library.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    JLoom is a JSP like template language for text generation - e.g. source code, HTML, XML. JLoom templates are modular encapsulated. Parameters can be any Java type, even Generics or Varargs. There is a plugin for Eclipse and a command line tool.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    RTF to HTML converter for use both with your applications and as a standalone tool. Small and fast. Processes tables better than any other tool I've seen.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    Use Xilize to create XHTML pages or entire websites with just a plain-text editor. The markup is similar to Textile and extensible via BeanShell. Run as a jEdit plugin, from the command line, or embed in a Java program. Small, fast, easy-to-use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Simple plain text layout library. Can be utilized for html-to-text (html2text) conversion with its HTML reading support.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Markout is a pure-Java lightweight wiki markup parser based on John Gruber's Markdown.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Strip out useless tags and other junk from HTML files. Shrink files, enhance readability of HTML source, promote privacy, and clean HTML exported from Microsoft Word (MS-Word). Run HTMLStrip as-is or customize it with your own regular expressions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Lightweight system for running a weblog. Features multiple authors, topics, Trackback, RSS (amongst others). TruBlog comes with easy installation and strong caching mechanisms, it's localisable and produces a valid XHTML. Theming is done through CSS.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    OpenNabu is a standalone personal wiki, featuring fast and simple content creation combined with highly customizable HTML export. All information is stored in plain text files and no additional server or database is needed.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB