Search Results for "html parser xpath"

Showing 380 open source projects for "html parser xpath"

1

html-react-parser

HTML to React parser

HTML to React parser that works on both the server (Node.js) and the client (browser). The parser converts an HTML string to one or more React elements. Available as part of the Tidelift Subscription. For TypeScript projects, you may need to check that domNode is an instance of domhandler's Element. Make sure to render parsed adjacent elements under a parent element.

Downloads: 0 This Week

Last Update: 2024-09-11
See Project
2

html-loader

HTML Loader

... and attributes. By default, the parser in html-loader interprets content inside noscript tags as #text, so processing of content inside this tag will be ignored. A very common scenario is exporting HTML into its own .html file, to serve it directly instead of injecting it with JavaScript.

Downloads: 4 This Week

Last Update: 2024-07-25
See Project
3

html-metadata

Metadata HTML scraper and parser for Node.js (supports Promises)

The aim of this library is to be a comprehensive source for extracting all HTML-embedded metadata. Currently, it supports Schema.org microdata using a third-party library, a native BEPress, Dublin Core, Highwire Press, JSON-LD, Open Graph, Twitter, EPrints, PRISM, and COinS implementation, and some general metadata that doesn't belong to a particular standard (for instance, the content of the title tag, or meta description tags). Planned is support for RDFa, AGLS, and other yet unheard...

Downloads: 0 This Week

Last Update: 2024-08-24
See Project
4

LOL HTML

Low output latency streaming HTML parser/rewriter with CSS API

Low Output Latency streaming HTML rewriter/parser with CSS-selector based API. It is designed to modify HTML on the fly with minimal buffering. It can quickly handle very large documents, and operate in environments with limited memory resources. The crate serves as a back-end for the HTML rewriting functionality of Cloudflare Workers, but can be used as a standalone library with a convenient API for a wide variety of HTML rewriting/analysis tasks. The parser switches back to the tag scanner...

Downloads: 2 This Week

Last Update: 2023-07-31
See Project
5

fast-xml-parser

Validate XML, Parse XML and Build XML rapidly

Validate XML, parse XML to a JS object, or build XML from a JS object, without C/C++-based libraries or callbacks.

Downloads: 0 This Week

Last Update: 2023-10-11
See Project
6

Scrapy

A fast, high-level web crawling and web scraping framework

Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...

Downloads: 35 This Week

Last Update: 2024-06-21
See Project
7

Karate

Test automation made simple

Karate is the only open-source tool to combine API test-automation, mocks, performance-testing and even UI automation into a single, unified framework. The BDD syntax popularized by Cucumber is language-neutral, and easy for even non-programmers. Assertions and HTML reports are built-in, and you can run tests in parallel for speed. There’s also a cross-platform stand-alone executable for teams not comfortable with Java. You don’t have to compile code. Just write tests in a simple, readable...

Downloads: 12 This Week

Last Update: 2024-08-05
See Project
8

HtmlSanitizer

Cleans HTML to avoid XSS attacks

HtmlSanitizer is a .NET library for cleaning HTML fragments and documents from constructs that can lead to XSS attacks. It uses AngleSharp to parse, manipulate, and render HTML and CSS. Because HtmlSanitizer is based on a robust HTML parser it can also shield you from deliberate or accidental "tag poisoning" where invalid HTML in one fragment can corrupt the whole document leading to broken layout or style. In order to facilitate different use cases, HtmlSanitizer can be customized at several...

Downloads: 7 This Week

Last Update: 2024-07-26
See Project
9

jsoup

Java library for working with real-world HTML

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make every...

Downloads: 5 This Week

Last Update: 2024-07-10
See Project
10

pdf-extractor

Node.js module for rendering pdf pages to images, svgs and HTML files

Pdf-extractor is a wrapper around pdf.js to generate images, svgs, html files, text files and json files from a pdf on node.js. A DOM Canvas is used to render and export the graphical layer of the pdf. Canvas exports *.png as a default but can be extended to export to other file types like .jpg. Pdf objects are converted to svg using the SVGGraphics parser of pdf.js. Pdf text is converted to HTML. This can be used as a (transparent) layer over the image to enable text selection. Pdf text...

Downloads: 4 This Week

Last Update: 2023-03-23
See Project
11

Nokogiri

Tool to work with XML and HTML from Ruby

Nokogiri (鋸) makes it easy and painless to work with XML and HTML from Ruby. It provides a sensible, easy-to-understand API for reading, writing, modifying, and querying documents. It is fast and standards-compliant by relying on native parsers like libxml2 (C) and xerces (Java). Be secure-by-default by treating all documents as untrusted by default. Be a thin-as-reasonable layer on top of the underlying parsers, and don't attempt to fix behavioral differences between the parsers. "Native gems...

Downloads: 0 This Week

Last Update: 2024-07-27
See Project
12

SafeLine

Serve as a reverse proxy to protect your web services from attacks

SafeLine is a self-hosted WAF (Web Application Firewall) to protect your web apps from attacks and exploits. A web application firewall helps protect web apps by filtering and monitoring HTTP traffic between a web application and the Internet. It typically protects web apps from attacks such as SQL injection, XSS, code injection, OS command injection, CRLF injection, LDAP injection, XPath injection, RCE, XXE, SSRF, path traversal, backdoor, brute force, HTTP-flood, bot abuse, among others...

Downloads: 3 This Week

Last Update: 2024-09-12
See Project
13

EzXML.jl

XML/HTML handling tools for primates

EzXML.jl is a package to handle XML/HTML documents for primates. This package depends on libxml2, which will be automatically installed as an artifact via XML2_jll.jl if you use Julia 1.3 or later. Windows, Linux, macOS, and FreeBSD are currently supported.

Downloads: 1 This Week

Last Update: 2023-12-04
See Project
14

Jupyter Notebook Tools for Sphinx

Sphinx source parser for Jupyter notebooks

nbsphinx is a Sphinx extension that provides a source parser for *.ipynb files. Custom Sphinx directives are used to show Jupyter Notebook code cells (and of course their results) in both HTML and LaTeX output. Un-evaluated notebooks – i.e. notebooks without stored output cells – will be automatically executed during the Sphinx build process.

Downloads: 1 This Week

Last Update: 2024-08-13
See Project
15

parse5

HTML parsing/serialization toolset for Node.js.

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant. parse5 provides nearly everything you may need when dealing with HTML. It's the fastest spec-compliant HTML parser for Node to date. It parses HTML the way the latest version of your browser does. It has proven itself reliable in such projects as jsdom, Angular, Lit, Cheerio, rehype and many more.

Downloads: 0 This Week

Last Update: 2023-04-18
See Project
16

FreshRSS

A free, self-hostable news aggregator

FreshRSS is a self-hosted RSS and Atom feed aggregator. It is lightweight, easy to work with, powerful, and customizable. Follow websites, podcasts, and video channels in a single place. Read your articles directly in FreshRSS. Search and save queries for quick access. Generate feeds by scraping external websites. Generate new feeds based on your filters. Import and export your feeds with OPML. Stay connected to your feeds in real time. Adapt to your needs thanks to a lot of options. Follow...

Downloads: 2 This Week

Last Update: 7 days ago
See Project
17

htmlparser2

The fast & forgiving HTML and XML parser

The fast & forgiving HTML and XML parser. htmlparser2 is the fastest HTML parser, and takes some shortcuts to get there. If you need strict HTML spec compliance, have a look at parse5. htmlparser2 itself provides a callback interface that allows the consumption of documents with minimal allocations. While the Parser interface closely resembles Node.js streams, it’s not a 100% match. Use the WritableStream interface to process a streaming input.

Downloads: 0 This Week

Last Update: 2024-01-05
See Project
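
The callback-style, chunk-by-chunk consumption described for htmlparser2 can be illustrated with a rough analogue. The sketch below does not use htmlparser2 or its API; it uses Python's standard-library `html.parser.HTMLParser` as a stand-in, and the names `LinkCollector` and `collect_links` are hypothetical, introduced only for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values through start-tag callbacks, without building a DOM."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag; attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def collect_links(chunks):
    """Feed the input piece by piece, as a streaming consumer would."""
    parser = LinkCollector()
    for chunk in chunks:
        parser.feed(chunk)  # an incomplete tag is buffered until more data arrives
    parser.close()
    return parser.links


if __name__ == "__main__":
    print(collect_links(['<p><a hr', 'ef="/one">1</a>', '<a href="/two">2</a></p>']))
```

The parser keeps only the callback state (here, a list of links), which is the general idea behind processing documents with few allocations; how htmlparser2 itself achieves this is not shown here.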
18

CssSelector Component

Converts CSS selectors to XPath expressions

XPath expressions are incredibly flexible, so there is almost always an XPath expression that will find the element you need. Unfortunately, they can also become very complicated, and the learning curve is steep. Even common operations (such as finding an element with a particular class) can require long and unwieldy expressions. CSS selectors are less powerful than XPath, but far easier to write, read and understand. Since they are less powerful, almost all CSS selectors can be converted...

Downloads: 0 This Week

Last Update: 2024-06-04
See Project
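
To make the CSS-to-XPath idea above concrete, here is a minimal sketch in Python. It is not the CssSelector component's implementation or API; the function name `css_to_xpath` is hypothetical, and the sketch handles only a single compound selector made of a tag name, `.class` parts, and `#id` parts. It also shows why a class lookup, short in CSS, becomes a long XPath expression.

```python
import re

# One selector part: an optional "." or "#" prefix followed by an identifier.
_TOKEN = re.compile(r"([#.]?)([A-Za-z_][\w-]*)")


def css_to_xpath(selector: str) -> str:
    """Translate a compound selector such as 'div.note#main' into XPath."""
    selector = selector.strip()
    if not selector:
        raise ValueError("empty selector")
    tag = "*"
    predicates = []
    pos = 0
    while pos < len(selector):
        match = _TOKEN.match(selector, pos)
        if match is None:
            # Combinators, attribute selectors, and pseudo-classes are out of scope.
            raise ValueError(f"unsupported selector syntax: {selector[pos:]!r}")
        kind, name = match.groups()
        if kind == "":
            tag = name
        elif kind == ".":
            # Match one whole class token inside a space-separated class attribute.
            predicates.append(
                f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"
            )
        else:
            predicates.append(f"@id='{name}'")
        pos = match.end()
    return "//" + tag + "".join(f"[{p}]" for p in predicates)


if __name__ == "__main__":
    print(css_to_xpath("div.note"))
```

A real converter also has to cover combinators, attribute selectors, and pseudo-classes, which this sketch rejects with a `ValueError`.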
19

DiDOM

Simple and fast HTML and XML parser

Simple and fast HTML and XML parser. DiDom allows loading HTML in several ways.

Downloads: 0 This Week

Last Update: 2023-04-20
See Project
20

sakura

A minimal CSS framework/theme

..., especially when working on backend sites and can't yet be bothered to fidget with CSS/HTML. Building a quick (but pretty) site/blog for your best friend or aunt! No need to remember tons of different class names for every other CSS framework. Works amazingly with markdown-generated HTML pages (eliminates the need for hacks like including .img img-responsive in markdown-parser generated <img></img> tags).

Downloads: 1 This Week

Last Update: 2023-09-01
See Project
21

AngleSharp

The ultimate angle brackets parser library parsing HTML5, MathML, SVG

AngleSharp follows the W3C specifications and gives you the same results as state-of-the-art browsers. Besides the official API, AngleSharp adds some useful extension methods on top. This makes working with the DOM convenient. AngleSharp integrates everything you need to explore and mutate the DOM tree. Node retrieval is straightforward using powerful CSS query selectors. The CSS queries in AngleSharp are super fast and very simple to use. AngleSharp respects the relationship of HTML...

Downloads: 1 This Week

Last Update: 2024-03-07
See Project
22

Floki

Floki is a simple HTML parser that enables search for nodes using CSS

Floki is a simple HTML parser that enables search for nodes using CSS selectors. Floki needs the :leex module in order to compile. Normally this module is installed with Erlang in a complete installation. By default, Floki uses a patched version of mochiweb_html for parsing fragments due to its ease of installation (it's written in Erlang and has no outside dependencies). fast_html is generally faster, according to the benchmarks conducted by its developers.

Downloads: 0 This Week

Last Update: 2024-04-26
See Project
23

mdBook

Create books from markdown files

... documentation and a fine example of what mdBook produces. mdBook includes built in support for both preprocessing your Markdown and alternative renderers for producing formats other than HTML. These facilities also enable other functionality such as validation. Searching Rust's crates.io is a great way to discover more extensions.

Downloads: 1 This Week

Last Update: 2024-05-17
See Project
24

Sanitize

Ruby HTML and CSS sanitizer

... that you don't explicitly allow will be removed. Sanitize is based on the Nokogiri HTML5 parser, which parses HTML the same way modern browsers do, and Crass, which parses CSS the same way modern browsers do. As long as your allowlist config only allows safe markup and CSS, even the most malformed or malicious input will be transformed into safe output.

Downloads: 0 This Week

Last Update: 2024-08-14
See Project
25

goquery

A little like that j-thing, only in Go

goquery brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library Cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), and detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8...

Downloads: 0 This Week

Last Update: 2024-09-06
See Project


© 2024 Slashdot Media. All Rights Reserved.
