Showing 62 open source projects for "html source extractor"

  • 1

    Nokogiri

    Tool to work with XML and HTML from Ruby

    Nokogiri (鋸) makes it easy and painless to work with XML and HTML from Ruby. It provides a sensible, easy-to-understand API for reading, writing, modifying, and querying documents. It is fast and standards-compliant because it relies on native parsers like libxml2 (C) and xerces (Java). Its guiding principles are to be secure by default, treating all documents as untrusted, and to be a thin-as-reasonable layer on top of the underlying parsers without attempting to fix behavioral differences between them. "Native...
    Downloads: 5 This Week
  • 2

    Lexbor

    Lexbor is an open source HTML renderer library in development

    Lexbor is a web browser engine in development, available as a software library; it ships with a free license and has no extra dependencies. For us, speed is an absolute must-have. In our development process, we focus on the fastest parsing techniques for HTML, CSS, and fonts, the fastest data processing methods, and the fastest ways to serve content to end users. Whether you are building a backend that handles millions of HTML documents or a UI-heavy user app, your software’s response rate always...
    Downloads: 3 This Week
  • 3

    MultiMarkdown-6

    Lightweight markup processor to produce HTML, LaTeX, and more

    Lightweight markup processor to produce HTML, LaTeX, and more. MultiMarkdown is a superset of the Markdown lightweight markup syntax with support for additional output formats and features. Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).
    Downloads: 0 This Week
  • 4

    Exlibris HTML renderer

    Exlibris is a console (text mode) HTML renderer.
    Downloads: 0 This Week
  • 5

    Tidy

    The granddaddy of HTML tools, with support for modern standards

    The granddaddy of HTML tools. Supports modern standards. Thanks to the efforts of HTACG and prominent contributors, HTML Tidy has a whole new heartbeat and a whole new life. Tidy tidies HTML and XML. It can tidy your documents by itself, and developers can easily integrate its features into even more powerful tools. Tidy is a console application for macOS, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to...
    Downloads: 4 This Week
  • 6
    Flat file extractor (ffe) can be used for reading and parsing different flat file structures and printing them in different formats. ffe is a command line tool developed in a GNU/Linux environment and distributed under the GPL. Project moved to https://github.com/igitur/ffe
    Downloads: 2 This Week
  • 7

    Obscure-Extractor-GTK

    Extract files from unusual archive formats

    Small Gtk program to extract files from (mostly) game archive formats. Currently supports Neverwinter Nights, Homeworld 2, BloodRayne, WC IV and a "generic" module to find RIFF, BMP, PNG and other files. Old Delphi version supports a few others.
    Downloads: 0 This Week
  • 8

    xmlfy

    Convert to XML on the fly

    xmlfy converts text/UTF based output into XML formatted output using schema files and/or options to control its behaviour. By Arthur Gouros.
    Downloads: 1 This Week
  • 9
    A common markup language and a parser to generate documentation in any target format (HTML, LaTeX, Trac, MediaWiki...). The core command relies on a Tcl library, so it is easy to create new target formats. Doc files are parameterizable via a header.
    Downloads: 0 This Week
  • 10

    ftdetector

    File type detector library

    This project is a tool to detect file types by signatures and MIME types. It uses hash tables to make the detection of a file type as fast as possible. The signature and MIME type lists are stored in simple, user-friendly files. This file type detector supports a lot of formats (image, archive, text, documents, audio, video, fonts and others). It also includes Microsoft OLE compound file types. The detector's algorithm has special features to detect text file types like (HTML, XML, JSON,...
    Downloads: 0 This Week
  • 11

    libdropbox

    Small ANSI C lib for Dropbox/Windows Azure communication

    Small ANSI C lib for Dropbox and Windows Azure communication. Built for small platforms. Uses PolarSSL for HTTPS communication. Features a small self-contained HTTPS module and a modified version of the JSMN JSON parser. Originally based on the dropbox_uploader script. Able to do most Dropbox actions, e.g. upload file, download file, list, file info, account info, share link. Also contains a small CLI program that interfaces with the lib. Also capable of Windows Azure service bus...
    Downloads: 0 This Week
  • 12

    MyHTML

    Fast C/C++ HTML 5 Parser

    Fast C/C++ HTML 5 Parser. Using threads. MyHTML is a fast, multithreaded HTML parser implemented as a pure C99 library with no outside dependencies. Asynchronous parsing, tree building, and indexing. Fully conformant to the HTML5 specification. Two APIs: high-level and low-level. Manipulation of elements: add, change, delete, and others. Manipulation of element attributes: add, change, delete, and others. Supports 39 character encodings. Supports detecting character encodings. Support Single Mode...
    Downloads: 0 This Week
  • 13
    Rbmake is both a library of routines and a set of command-line utilities that enables a user to transform content into Rocket Ebook format (.rb) files and back again (unencrypted files only). Compatible with the Rocket Ebook and the REB 1100.
    Downloads: 7 This Week
  • 14

    Z Notation E-Mail Mark-up Tools

    Tools to convert Z mark-up to HTML or text.

    A small library and two command-line tools to parse and convert Z notation from the "e-mail" mark-up into HTML code, or into UTF-8 text with box-drawing graphics, or into the Z Standard text format. See the project's Wiki Home Page for details --- the "Wiki" button in the bar above, or the following link:
    Downloads: 0 This Week
  • 15
    Decodes CGI data from standard input, $QUERY_STRING, and $HTTP_COOKIE. Stores data in lookup table(s) for easy retrieval. Uploads files by copying directly to files created with mkstemp(). Has several handy string conversion functions.
    Downloads: 28 This Week
  • 16
    Phantom (Phantom Has A New Template Oriented Mask) is a PHP/CGI extension, written in PHP/C/C++, for efficiently designing HTML content for PHP-based websites. Phantom has been designed for speed and expandability.
    Downloads: 0 This Week
  • 17

    Gumbo

    An HTML5 parsing library in pure C99

    Gumbo is an implementation of the HTML5 parsing algorithm as a pure C99 library with no outside dependencies. It's designed to serve as a building block for other tools and libraries such as linters, validators, templating languages, and refactoring and analysis tools. Gumbo gains some execution speed by virtue of being written in C, but speed is not an important consideration for the intended use-case, and was not a major design factor. Gumbo is intentionally designed to turn an HTML...
    Downloads: 0 This Week
  • 18

    trafalgar.map

    Open Street Map (OSM) tools, OSM/XML parser, tag extractor

    This is going to be a set of tools intended for use with huge OSM files such as the planet files in XML format. The parser reads directly from packed *.gz files, so there is no need to unpack the OSM/XML data files to the local disk. Now in 0.3.0: osm_tags: tag analyzer (like tag watch); osm_split: splits an OSM file into separate files for nodes, ways and relations and collects some meta information (to be used as input for other tools); osm_cut: create rectangular...
    Downloads: 0 This Week
  • 19
    DocFrac is a document converter that can convert between RTF, HTML and ASCII text. This includes RTF to HTML and HTML to RTF. Supports text formatting (e.g. bold); tables; and most European languages. Available for Windows; Linux; ActiveX and DLL.
    Downloads: 1 This Week
  • 20
    Multi-connection command line tool to download Internet sites. Similar to wget and cURL, but it manages up to 50 parallel links. Main features are: recursive fetching, Metalink retrieving, segmented download and image filtering by width and height.
    Downloads: 5 This Week
  • 21

    entdec

    A simple program that decodes files containing HTML entities

    This program was written to decode files containing HTML entities into UTF-8 encoded files for easier editing. Its main application is decoding HTML files produced by the TeX-to-HTML converter htlatex, which is used for publishing scientific articles and other works on the web. It can also be used by web programmers writing gateway applications, in the same way as the equivalent functions implemented, e.g., in the Perl or PHP programming languages. Technical description: You have a file containing html...
    Downloads: 0 This Week
  • 22
    This is a JavaScriptStream generator. I consider it a professional tool that targets webmasters. It converts an HTML-formatted document into a series of JavaScript text output commands. It is a good tool to help write HTML.
    Downloads: 0 This Week
  • 23
    Read, parse, merge and write RSS (and Atom) feeds. It has some other built-in functions, such as text, HTML, and property file output, or templates with custom tags to insert RSS feeds into pages that can be uploaded to a server that supports only static HTML.
    Downloads: 0 This Week
  • 24

    HXPath

    XPath HTML parser

    HXPath is a command line tool useful to extract data from HTML documents. HXPath can select sub trees, like the standard xpath tool, but is also able to read contents and attributes and output them in a bash friendly format. HTML Tidy and HTTP/HTTPS get are built in too.
    Downloads: 0 This Week
  • 25
    svg2js is an SVG to HTML5 canvas JavaScript converter.
    Downloads: 0 This Week