Showing 186 open source projects for "java html parser"

View related business solutions
  • Cloud-based help desk software with ServoDesk Icon
    Cloud-based help desk software with ServoDesk

    Full access to Enterprise features. No credit card required.

    What if You Could Automate 90% of Your Repetitive Tasks in Under 30 Days? At ServoDesk, we help businesses like yours automate operations with AI, allowing you to cut service times in half and increase productivity by 25% - without hiring more staff.
    Try ServoDesk for free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    LOL HTML

    LOL HTML

    Low output latency streaming HTML parser/rewriter with CSS API

    Low Output Latency streaming HTML rewriter/parser with CSS-selector based API. It is designed to modify HTML on the fly with minimal buffering. It can quickly handle very large documents, and operate in environments with limited memory resources. The crate serves as a back-end for the HTML rewriting functionality of Cloudflare Workers, but can be used as a standalone library with a convenient API for a wide variety of HTML rewriting/analysis tasks. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    html-react-parser

    html-react-parser

    HTML to React parser

    HTML to React parser that works on both the server (Node.js) and the client (browser). The parser converts an HTML string to one or more React elements. Available as part of the Tidelift Subscription. For TypeScript projects, you may need to check that domNode is an instance of domhandler's Element. Make sure to render parsed adjacent elements under a parent element.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    html-loader

    html-loader

    HTML Loader

    ...Filter can also be used to extend the supported elements and attributes. By default, the parser in html-loader interprets content inside noscript tags as #text, so processing of content inside this tag will be ignored. A very common scenario is exporting the HTML into their own .html file, to serve them directly instead of injecting with javascript.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 4
    htmlparser2

    htmlparser2

    The fast & forgiving HTML and XML parser

    The fast & forgiving HTML and XML parser. htmlparser2 is the fastest HTML parser, and takes some shortcuts to get there. If you need strict HTML spec compliance, have a look at parse5. htmlparser2 itself provides a callback interface that allows the consumption of documents with minimal allocations. While the Parser interface closely resembles Node.js streams, it’s not a 100% match.
    Downloads: 3 This Week
    Last Update:
    See Project
  • Comet Backup - Fast, Secure Backup Software for MSPs Icon
    Comet Backup - Fast, Secure Backup Software for MSPs

    Fast, Secure Backup Software for Businesses and IT Providers

    Comet is a flexible backup platform, giving you total control over your backup environment and storage destinations.
    Learn More
  • 5
    parse5

    parse5

    HTML parsing/serialization toolset for Node.js.

    HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant. parse5 provides nearly everything you may need when dealing with HTML. It's the fastest spec-compliant HTML parser for Node to date. It parses HTML the way the latest version of your browser does. It has proven itself reliable in such projects as jsdom, Angular, Lit, Cheerio, rehype and many more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    PostHTML

    PostHTML

    PostHTML is a tool to transform HTML/XML with JS plugins

    ...PostHTML itself is very small. It includes only an HTML parser, an HTML node tree API and a node tree stringified.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Nokogiri

    Nokogiri

    Tool to work with XML and HTML from Ruby

    Nokogiri (鋸) makes it easy and painless to work with XML and HTML from Ruby. It provides a sensible, easy-to-understand API for reading, writing, modifying, and querying documents. It is fast and standards-compliant by relying on native parsers like libxml2 (C) and xerces (Java). Be secure-by-default by treating all documents as untrusted by default. Be a thin-as-reasonable layer on top of the underlying parsers, and don't attempt to fix behavioral differences between the parsers. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    jsoup

    jsoup

    Java library for working with real-world HTML

    ...The parser will make every attempt to create a clean parse from the HTML you provide, regardless of whether the HTML is well-formed or not. You have HTML in a Java String, and you want to parse that HTML to get at its contents, or to make sure it's well formed, or to modify it. The String may have come from user input, a file, or from the web.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Floki

    Floki

    Floki is a simple HTML parser that enables search for nodes using CSS

    Floki is a simple HTML parser that enables search for nodes using CSS selectors. Floki needs the :leex module in order to compile. Normally this module is installed with Erlang in a complete installation. By default, Floki uses a patched version of mochiweb_html for parsing fragments due to its ease of installation (it's written in Erlang and has no outside dependencies). fast_html is generally faster, according to the benchmarks conducted by its developers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-First Supply Chain Management Icon
    AI-First Supply Chain Management

    Supply chain managers, executives, and businesses seeking AI-powered solutions to optimize planning, operations, and decision-making across the supply

    Logility is a market-leading provider of AI-first supply chain management solutions engineered to help organizations build sustainable digital supply chains that improve people’s lives and the world we live in. The company’s approach is designed to reimagine supply chain planning by shifting away from traditional “what happened” processes to an AI-driven strategy that combines the power of humans and machines to predict and be ready for what’s coming. Logility’s fully integrated, end-to-end platform helps clients know faster, turn uncertainty into opportunity, and transform the supply chain from a cost center to an engine for growth.
    Learn More
  • 10
    Sanitize

    Sanitize

    Ruby HTML and CSS sanitizer

    ...You can also allow specific CSS properties, @ rules, and URL protocols in elements or attributes containing CSS. Any HTML or CSS that you don't explicitly allow will be removed. Sanitize is based on the Nokogiri HTML5 parser, which parses HTML the same way modern browsers do, and Crass, which parses CSS the same way modern browsers do. As long as your allowlist config only allows safe markup and CSS, even the most malformed or malicious input will be transformed into safe output.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    HtmlSanitizer

    HtmlSanitizer

    Cleans HTML to avoid XSS attacks

    HtmlSanitizer is a .NET library for cleaning HTML fragments and documents from constructs that can lead to XSS attacks. It uses AngleSharp to parse, manipulate, and render HTML and CSS. Because HtmlSanitizer is based on a robust HTML parser it can also shield you from deliberate or accidental "tag poisoning" where invalid HTML in one fragment can corrupt the whole document leading to broken layout or style.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    sakura

    sakura

    A minimal CSS framework/theme

    ...Don't want to develop using sakura, but instead want to use it on websites with outdated 90's design (i.e. no CSS)? Quick prototyping, especially when working on backend sites and can't yet be bothered to fidget with CSS/HTML. Building a quick (but pretty) site/blog for your best friend or aunt! No need to remember tons of different class names for every other CSS framework. Works amazingly with markdown generated HTML pages (eliminates the need of hacks like including .img img-responsive in markdown-parser generated <img></img> tags).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    Infer

    Infer

    A static analyzer for Java, C, C++, and Objective-C

    Infer is a static analysis tool - if you give Infer some Java or C/C++/Objective-C code it produces a list of potential bugs. Anyone can use Infer to intercept critical bugs before they have shipped to users, and help prevent crashes or poor performance. Infer checks for null pointer exceptions, resource leaks, annotation reachability, missing lock guards, and concurrency race conditions in Android and Java code. Infer checks for null pointer dereferences, memory leaks, coding conventions...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    hyperx

    hyperx

    Tagged template string virtual dom builder

    tagged template string virtual dom builder. This module is similar to JSX, but provided as a standards-compliant ES6 tagged template string function. hyperx works with virtual-dom, react, hyperscript, or any DOM builder with a hyperscript-style API: h(tagName, attrs, children). You might also want to check out the hyperxify browserify transform to statically compile hyperx into javascript expressions to save sending the hyperx parser down the wire. Template strings are available in: node 4+,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Pedestal

    Pedestal

    The Pedestal Server-side Libraries

    Pedestal is a set of libraries that we use to build services and applications. It runs in the back end and can serve up whole HTML pages or handle API requests. There are a lot of tools in that space, so why did we build Pedestal? We had two main reasons. Pedestal is designed for APIs first. Most web app frameworks still focus on the "page model" and server-side rendering. Pedestal lets you start simple and add that if you need it. Pedestal makes it easy to create "live" applications....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    bluemonday

    bluemonday

    Fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer

    bluemonday is an HTML sanitizer implemented in Go. It is fast and highly configurable. bluemonday takes untrusted user-generated content as an input, and will return HTML that has been sanitized against an allowlist of approved HTML elements and attributes so that you can safely include the content in your web page. If you accept user-generated content, and your server uses Go, you need bluemonday. It protects sites from XSS attacks. There are many vectors for an XSS attack and the best way...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well proved XML and text processing techologies in order to easely extract useful data from arbitrary web pages.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 18
    HtmlCleaner is HTML parser written in Java. It transforms dirty HTML to well-formed XML following the same rules that the most web-browsers use.
    Downloads: 30 This Week
    Last Update:
    See Project
  • 19
    Writer2LaTeX and Writer2xhtml is a collection of converters from OpenDocument Format (ODF) to LaTeX/BibTeX, HTML+MathML and EPUB. It is delivered as a standalone java library, as a command line application and as extensions for LibreOffice.
    Leader badge
    Downloads: 36 This Week
    Last Update:
    See Project
  • 20
    DiDOM

    DiDOM

    Simple and fast HTML and XML parser

    Simple and fast HTML and XML parser. DiDom allows loading HTML in several ways.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    FluentLenium

    FluentLenium

    FluentLenium is a web & mobile automation framework

    FluentLenium is a React-ready website automation framework that extends Selenium to write readable, reusable, reliable and resilient UI functional tests. It’s written and maintained by people who are automating browser-based tests on a daily basis. FluentLenium provides a Java-fluent interface to Selenium, and brings some magic to avoid common issues faced by Selenium users. FluentLenium is shipped with adapters for JUnit4, JUnit5, TestNG, Spock, Spring TestNG, Cucumber and Kotest, but it...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    JBake

    JBake

    Java based open source static site/blog generator for developers

    JBake is a Java-based, open source, static site/blog generator for developers & designers. The project uses Gradle 4.9+ as the build system. We configured the gradle check style Plugin to run with the check task. It does not break the build if convention violations are found. But prints a warning and generates a report. Source available on GitHub, licensed under MIT License.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23

    HTML parser in Delphi

    A Delphi class with functions to read and dissect a HTML file

    THTMLdom is a (Delphi) class with functions to read a HTML source file and dissect it into a tree of THTMLelement. The attributes of the HTML tags are stored in the elements. Functions are provided to select elements on the basis of the attribute values or tag names. The structure of the tree can be shown and it can be rendered as plain text. The source is plain Delphi pascal, requiring a version that supports Tdictionary. There is no dependency on 3rd party units. The file to be parsed...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    HTML Article Generator

    HTML Article Generator

    Quickly create custom webpages from your content

    HTML Article Generator is a tool for quickly generating webpages based on content you enter, including both text and images. These webpages can be customised to give a unique appearance, with a selection of 5 different themes. Other features include the ability to save the current values you have entered and restore these values after future changes have been made. Images can have caption text added to them and given alt text to improve accessibility. Each webpage can also be given a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    CSSBox

    CSSBox

    Pure Java HTML / CSS rendering engine

    CSSBox is an (X)HTML/CSS rendering engine written in pure Java. Its primary purpose is to provide a complete information about the rendered page suitable for further processing. However, it also allows displaying the rendered document.
    Leader badge
    Downloads: 18 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next