Showing 20 open source projects for "html parse"

  • 1
    jsoup

    Java library for working with real-world HTML

    ...The parser will make every attempt to create a clean parse from the HTML you provide, regardless of whether the HTML is well-formed or not. You have HTML in a Java String, and you want to parse that HTML to get at its contents, or to make sure it's well formed, or to modify it. The String may have come from user input, a file, or from the web.
    Downloads: 2 This Week
  • 2
    wombat

    Lightweight Ruby DSL for scraping structured data from web pages

    Wombat is a lightweight web crawling and scraping library written in Ruby that focuses on extracting structured data from web pages using a concise domain-specific language (DSL). It is designed to simplify the process of defining how information should be collected from HTML documents without requiring large amounts of scraping boilerplate code. Developers can declare the data fields they want and specify selectors or rules for retrieving them, allowing Wombat to parse and return structured results. The DSL approach helps make scraping definitions more readable and maintainable, especially when dealing with multiple fields or nested data structures. ...
    Downloads: 3 This Week
  • 3
    CyberScraper 2077

    A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 1 This Week
  • 4
    JDynamiTe, Dynamic Template in Java

    Dynamically generate documents from templates

    JDynamiTe is a tool that lets you dynamically create documents in any format from "template" documents, with very few lines of code (or none at all!). Some typical usage domains of JDynamiTe are dynamic Web page creation, text document generation, source code generation... In fact, it can be useful in any case where pre-defined documents (templates) have to be dynamically populated with data. The main benefit of JDynamiTe is to allow a true...
    Downloads: 0 This Week
  • 5
    HyperSQL is like a doxygen plus javadoc for SQL, hypermapping SQL views, packages, procedures, and functions to HTML source code listings and showing all code locations where these are used.
    Downloads: 2 This Week
  • 6

    Z Notation E-Mail Mark-up Tools

    Tools to convert Z mark-up to HTML or text.

    A small library and two command-line tools to parse and convert Z notation from the "e-mail" mark-up into HTML code, or into UTF-8 text with box-drawing graphics, or into the Z Standard text format. See the project's Wiki Home Page for details --- the "Wiki" button in the bar above, or the following link:
    Downloads: 0 This Week
  • 7
    NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces.
    Downloads: 17 This Week
  • 8

    Spondulas

    Spondulas is a browser emulator designed to retrieve web pages for hunti

    Spondulas is a browser emulator and parser designed to retrieve web pages for hunting malware. It supports generation of browser user agents, GET/POST requests, and SOCKS5 proxies. It can be used to parse HTML files sent via e-mail. Monitor mode allows a website to be monitored at intervals to discover changes in DNS or content over time. Autolog mode creates an investigation file that documents redirection chains. The retrieved web pages are parsed for links, which are reported to an output file. More information is available on the wiki.
    Downloads: 0 This Week
  • 9
    Read, parse, merge and write RSS (and Atom) feeds. It has some other built-in functions, such as text, HTML, or property file output, and templates with custom tags to insert RSS feeds into pages that can be uploaded to a server that supports only static HTML.
    Downloads: 0 This Week
  • 10
    Wow Log Parser is a combat log parser for the game World of Warcraft. The purpose of the program is to parse the files generated with the /combatlog command. The source code can be found on: http://www.gurre.eu/wowlogparser/forum
    Downloads: 0 This Week
  • 11
    xBB-code is a PHP library to parse and edit text formatted with BBCode.
    Downloads: 0 This Week
  • 12
    mod_tidy is a TidyLib-based DSO module for the Apache HTTP Server Version 2 to parse, clean up and pretty-print the web server's (X)HTML output.
    Downloads: 0 This Week
  • 13
    Web analyzer for logs in different formats that outputs XML reports. Multi-host log files are supported, an XSL page can be applied to produce HTML output, and SVG is used to make the graphs. The project includes a library to parse HTTP_USER_AGENT.
    Downloads: 0 This Week
  • 14
    GoldSeeker is a small application for extracting formatted data. It can parse information from a text, HTML or other file and export it to a database.
    Downloads: 0 This Week
  • 15
    PHP-based news aggregator that can both load RSS and parse HTML using regular expressions. Very customizable. Can be used from anywhere through a web browser.
    Downloads: 0 This Week
  • 16
    Parse formatted man pages and man page source from most flavors of UNIX. Convert to HTML, ASCII, TkMan, DocBook, and other formats.
    Downloads: 3 This Week
  • 17
    Java API to process or parse HTML documents. If your Java application needs or would like to be able to process some text in HTML format, you'd probably find this API interesting.
    Downloads: 0 This Week
  • 18
    The «Type-o-graph» project is a PHP function that parses simple text and outputs "Rich HTML", e.g. it adds quotes instead of inch signs ('), glues short words together using <nobr> tags, and so on.
    Downloads: 0 This Week
  • 19
    Suite of multi-language templating system classes, loosely based on PEAR/PHPlib template class. The classes may be used to parse any text-based template (HTML, ASCII, text, etc). Already available: JScript, Python, PHP. Planned: VBscript, Ja
    Downloads: 0 This Week
  • 20
    LogAnal is a quick hack to parse Apache Log Files and produce graphical and textual web server statistics. Works in incremental mode only. Supports Templates for the output HTML, as well as localization (defaults to English).
    Downloads: 0 This Week