30 projects for "data processing" with 2 filters applied:

  • 1
    Lexbor

    Lexbor is an open source HTML renderer library under development

    Lexbor is a web browser engine under development, available as a software library; it ships with a free license and has no extra dependencies. For us, speed is an absolute must-have. In our development process, we focus on the fastest parsing techniques for HTML, CSS, and fonts, the fastest data processing methods, and the fastest ways to serve content to end users. Whether you are building a backend that handles millions of HTML documents or a UI-heavy user app, your software’s response rate always matters to users and developers alike. Lexbor’s code is optimized for ease of access in end-user applications and across programming languages. ...
    Downloads: 8 This Week
  • 2
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 3
    CSSBox

    Pure Java HTML / CSS rendering engine

    CSSBox is an (X)HTML/CSS rendering engine written in pure Java. Its primary purpose is to provide complete information about the rendered page in a form suitable for further processing. However, it can also display the rendered document.
    Downloads: 13 This Week
  • 4
    unfluff

    Automatically extract body content (and other cool stuff) from HTML

    unfluff is a Node.js library designed to automatically extract the main content from an HTML document — stripping away navigation bars, ads, footers and other boilerplate to leave you with the “body content”, metadata (title, author, date) and other useful fields. It’s a tool very much aimed at content-analysis, web scraping, building datasets, or repurposing article text for downstream processing (like machine-learning or summarization). The API is simple: you feed in raw HTML and it...
    Downloads: 0 This Week
  • 5
    Gallop

    A framework for building smooth asynchronous iOS apps

    Gallop is a powerful rich text framework that supports asynchronous display. It encapsulates CoreText's rich text functions and commonly used image processing capabilities. Just use an LWTextStorage object instead of a UILabel object and an LWImageStorage object instead of a UIImageView object, and Gallop will make sure your app scrolls smoothly. You can also use Gallop to parse HTML pages and customize the processing to turn HTML pages into iOS native pages. Use Gallop Building complex rich text...
    Downloads: 0 This Week
  • 6
    Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 1 This Week
  • 7
    A Java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language and producing XML output. Also contains a few utility classes for HTML, CSV and text parsing, and additional character sets.
    Downloads: 0 This Week
  • 8

    jStyleParser

    Java CSS parser and DOM style assignment library

    jStyleParser is a CSS parser written in Java. It has its own application interface, designed to allow efficient CSS processing in Java and to map the values to Java data types. It can also apply the parsed style sheets to a DOM that represents an HTML or XML document and compute the resulting style of the individual document elements. It supports CSS 2.1 and a large subset of CSS3.
    Downloads: 0 This Week
  • 9
    Now hosted at https://github.com/plastex/plastex. plasTeX is a Python-based LaTeX document processing framework. It gives DOM-like access to a LaTeX document, as well as the ability to generate multiple output formats (e.g. HTML, DocBook, tBook, etc.).
    Downloads: 0 This Week
  • 10
    A stand-alone editor that uses the MediaWiki markup language to generate HTML code. You can create and preview pages written in MediaWiki markup (i.e. Wikipedia pages) while offline.
    Downloads: 0 This Week
  • 11
    Storm MVC is a PHP framework based on the model-view-controller design pattern, featuring pretty URLs, site themes via inherited master pages, and easy form processing. It is a mix of the best ideas from Rails, Django and ASP.NET MVC.
    Downloads: 0 This Week
  • 12
    nanoWIME is a simple, flexible, easy-to-use JavaScript-based WikiMarkup editor.
    Downloads: 0 This Week
  • 13
    ServingXML is an open source, Apache 2.0-licensed framework for flat/XML data transformations. It defines an extensible markup vocabulary for expressing flat-XML, XML-flat, flat-flat, and XML-XML processing in pipelines.
    Downloads: 6 This Week
  • 14
    RTF2HTML is a cross-platform C++ library (DLL, OCX) and command-line utility for converting documents from Rich Text Format (e.g. Word, OO Writer) to HTML. Its features are tiny size, speed, low memory usage and compact output.
    Downloads: 3 This Week
  • 15
    A JavaScript library for parsing Creole 1.0 wiki markup.
    Downloads: 1 This Week
  • 16
    A freely-available Markdown text-to-HTML translator, written in C++, intended for integration into C++ programs rather than for use in web applications.
    Downloads: 1 This Week
  • 17
    NOTE: unsupported - do you want to maintain this project? Contact me! Markdownify is an HTML-to-Markdown converter written in PHP. See it as the successor to `html2text.php`, since it has better design, better performance and fewer corner cases.
    Downloads: 0 This Week
  • 18
    Webiyo (pronounced "webby-O") is a small Java 1.5 library containing classes for generating web pages, processing forms, and unit-testing web sites. Since no template files are used, it allows you to take full advantage of your IDE's refactoring tools.
    Downloads: 0 This Week
  • 19
    xBB-code is a PHP library for parsing and editing text formatted with BBCode.
    Downloads: 0 This Week
  • 20
    ZML, the Zeitung Markup Language, is a simple CMS for small newspapers. It was specifically designed to publish a student newspaper in print and on the Web. It uses LaTeX and XHTML. So far, it is documented in German only.
    Downloads: 0 This Week
  • 21
    Web Content Management Element (WCME) is an editable area on a browser page. The user can change text and text styles (CSS), then persistently save the changed content into the web application's resources. WCME is written in JavaScript with the Prototype.js library.
    Downloads: 0 This Week
  • 22
    JLoom is a JSP-like template language for text generation, e.g. source code, HTML, XML. JLoom templates are modular and encapsulated. Parameters can be any Java type, even Generics or Varargs. There is a plugin for Eclipse and a command line tool.
    Downloads: 0 This Week
  • 23
    RTF to HTML converter for use both with your applications and as a standalone tool. Small and fast. Processes tables better than any other tool I've seen.
    Downloads: 3 This Week
  • 24
    Use Xilize to create XHTML pages or entire websites with just a plain-text editor. The markup is similar to Textile and extensible via BeanShell. Run as a jEdit plugin, from the command line, or embed in a Java program. Small, fast, easy-to-use.
    Downloads: 0 This Week
  • 25
    Simple plain text layout library. Its HTML reading support also makes it usable for HTML-to-text (html2text) conversion.
    Downloads: 0 This Week