web pages free download

Showing 686 open source projects for "web pages"


1

Translate Web Pages

Translate your page in real time using Google or Yandex

...You can choose to translate automatically. To change the translation engine, just touch the Google Translate icon. To translate any website, the extension must access and modify the text of the web pages, and it can only do that with that permission. The pages are translated using the Google or Yandex translation engine (you choose). We do not collect any information. However, to translate, the contents of the web pages will be sent to Google or Yandex servers. You can also install via a crx file; download the file using a download manager or Firefox. ...

1 Review

Downloads: 15 This Week

Last Update: 2025-03-16
See Project
2

Web Archives

Browser extension for viewing archived and cached versions of websites

Browser extension for viewing archived and cached versions of web pages, available for Chrome, Edge and Safari. Web Archives is a browser extension that enables you to find archived and cached versions of web pages, and comes with support for more than 10 search engines. Searches can be initiated from the context menu and the browser toolbar. A diverse set of archive and cache sources are supported, which can be toggled and reordered from the extension's options. ...

Downloads: 0 This Week

Last Update: 2026-02-15
See Project
3

rvest

Simple web scraping for R

rvest helps you scrape (or harvest) data from web pages. It is designed to work with magrittr to make it easy to express common web scraping tasks, inspired by libraries like Beautiful Soup and RoboBrowser. If you’re scraping multiple pages, I highly recommend using rvest in concert with polite. The polite package ensures that you’re respecting robots.txt and not hammering the site with too many requests.

Downloads: 0 This Week

Last Update: 2025-08-29
See Project
4

single-file-cli

CLI tool to save complete web pages as a single self-contained HTML file

SingleFile CLI is an open source command-line tool designed to save complete web pages as a single self-contained HTML file. It captures the rendered page in a headless browser and embeds all required resources directly into the output document, including stylesheets, scripts, images, and fonts. By consolidating every dependency into one file, it allows users to preserve a faithful copy of a web page that can be viewed offline without requiring external assets. ...

Downloads: 6 This Week

Last Update: 2026-03-11
See Project
5

Ferret

Declarative web scraping

A web scraping system aiming to simplify data extraction from the web. Ferret has a declarative query language that makes it easy to focus on the data that you need to get. Ferret can scrape JS-rendered pages, handle all page events, and emulate user interactions. Ferret was designed as a library from the ground up and can be easily embedded into any Go application. Ferret helps you to focus on the data you need using an easy-to-learn declarative language. Ferret uses Chrome/Chromium via the Chrome DevTools Protocol to handle dynamically rendered web pages. Ferret is extremely extensible, and creating custom functions and types is super easy. Ferret allows users to focus on the data. ...

Downloads: 0 This Week

Last Update: 2025-05-07
See Project
6

Spider

High-performance Rust web crawler and scraper for large-scale data

Spider is a high-performance web crawler and web scraping library written in Rust that enables developers to crawl and index websites efficiently. It focuses on speed, concurrency, and reliability by using asynchronous and multi-threaded processing to handle large volumes of web pages. It can rapidly crawl websites to collect links, retrieve page content, and extract structured information from HTML documents.

Downloads: 0 This Week

Last Update: 2026-03-31
See Project
7

Playwright

Node library to automate Chromium, Firefox & WebKit with a single API

Playwright is a Node library for automating Chromium, Firefox and WebKit using a single API. It supports headless execution for all these browsers on Linux, macOS and Windows, providing automated web browser interactions that are fast, capable, reliable and ever-green. Playwright enables a broad spectrum of cross-browser web automation capabilities, which are used by Single Page Apps and Progressive Web Apps. These include scenarios that span multiple pages, domains and iframes; emulation of mobile devices, geolocation, and permissions; uploading and downloading files; and many more.

Downloads: 142 This Week

Last Update: 2026-04-01
See Project
8

wombat

Lightweight Ruby DSL for scraping structured data from web pages

Wombat is a lightweight web crawling and scraping library written in Ruby that focuses on extracting structured data from web pages using a concise domain-specific language (DSL). It is designed to simplify the process of defining how information should be collected from HTML documents without requiring large amounts of scraping boilerplate code. Developers can declare the data fields they want and specify selectors or rules for retrieving them, allowing Wombat to parse and return structured results. ...

Downloads: 0 This Week

Last Update: 2026-04-07
See Project
9

academicpages.github.io

Github Pages template based upon HTML and Markdown for personal

AcademicPages is a ready-made Jekyll theme for academics to build personal websites, blogs, and CV pages. It includes features like publication lists, project showcases, writing blogs, and optimized layouts for easier GitHub Pages deployment. With support for LaTeX rendering, RSS feeds, and responsive design, it's popular among students, researchers, and educators looking to create professional web presences without coding from scratch.

Downloads: 0 This Week

Last Update: 2025-07-04
See Project
10

goclone

Fast CLI tool for cloning entire websites for local browsing offline

goclone is a command-line utility designed to download and mirror complete websites to a local directory for offline access. It retrieves HTML pages, stylesheets, JavaScript files, images, and other assets from a target site and stores them on the user’s computer. It preserves the original site’s structure by maintaining relative links between pages, allowing the mirrored copy to function similarly to the live version when opened locally. Once a site has been cloned, users can browse the pages offline and navigate between them as if they were viewing the site online. goclone is written in Go and leverages concurrency through goroutines to perform downloads efficiently. goclone can also optionally start a local web server to serve the mirrored files for a more realistic browsing experience. ...

Downloads: 2 This Week

Last Update: 2026-03-11
See Project
11

kimuraframework

AI-first Ruby framework for building fast, flexible web scraping spide

Kimurai is an open source web scraping framework written in Ruby that simplifies the process of building automated data extraction tools. It provides a clean domain-specific language that allows developers to define scraping logic and data schemas with minimal boilerplate code. Kimurai can use AI-assisted extraction to identify where data resides in HTML pages, automatically generating selectors that are cached for future use so subsequent scraping runs operate with pure Ruby performance. ...

Downloads: 0 This Week

Last Update: 24 hours ago
See Project
12

Lightpanda Browser

Lightpanda: the headless browser designed for AI and automation

Lightpanda is an open-source headless browser designed specifically for automation, artificial intelligence workflows, and large-scale web interaction tasks. Unlike traditional browsers that include full graphical rendering engines meant for human users, Lightpanda is built from scratch to operate entirely in headless mode, focusing only on the components required for programmatic web interaction. This design allows it to execute JavaScript and interact with web pages while avoiding the overhead associated with rendering images, fonts, and layout elements intended for visual display. ...

Downloads: 31 This Week

Last Update: 2026-04-24
See Project
13

Geziyor

Blazing fast Go framework for web crawling and data scraping tasks

Geziyor is a high-performance web crawling and web scraping framework built for the Go programming language. It is designed to help developers crawl websites and extract structured information from web pages efficiently. It focuses on speed and scalability, allowing large numbers of requests to be processed concurrently. Geziyor supports use cases such as data mining, monitoring web content, and automated testing workflows.

Downloads: 0 This Week

Last Update: 3 days ago
See Project
14

Lighthouse

Automated auditing, performance metrics, & best practices for the web

Lighthouse is an open-source, automated tool that analyzes and audits web apps and web pages in order to improve their quality. Lighthouse collects modern performance metrics and insights on developer best practices, auditing for performance, accessibility, SEO and more. After auditing, it produces a report in either JSON or HTML. Included in the report is a reference doc that explains the importance of the audit and how to fix the problem areas, which you can use to improve the web app or web page. ...

Downloads: 8 This Week

Last Update: 2 days ago
See Project
15

Bili23 Downloader

Cross platform GUI tool for downloading videos from Bilibili sites

Bili23-Downloader is an open source desktop application designed for downloading video content from the Bilibili platform. It provides a graphical interface that allows users to download various types of media including user-uploaded videos, series episodes, movies, and other hosted content. It focuses on ease of use with a zero-configuration setup, making it accessible to both beginners and experienced users. It supports high performance downloads through multi-threading and includes resume...

Downloads: 8 This Week

Last Update: 2026-04-07
See Project
16

Min

A fast, minimal browser that protects your privacy

Tabs in Min take up less space, giving you more room to browse the web. Pages you haven’t looked at in a while fade out, letting you see what’s important, and Focus Mode hides your other tabs to prevent you from getting distracted. See quick definitions and answers with information from DuckDuckGo, including Wikipedia entries and more. Jump to any site quickly with fuzzy search. Or search through the full text of every page you've visited, even if you don't remember the title. ...

Downloads: 24 This Week

Last Update: 2026-04-12
See Project
17

crwlr

Library for Rapid (Web) Crawler and Scraper Development

This library provides a kind of framework and many ready-to-use, so-called steps that you can use as building blocks to build your own crawlers and scrapers. Before diving into the library, let's have a look at the terms crawling and scraping. For most real-world use cases, those two things go hand in hand, which is why this library helps with and combines both. A (web) crawler is a program that (down)loads documents and follows the links in them to load those as well. A crawler...

Downloads: 1 This Week

Last Update: 19 hours ago
See Project
18

Adguard Browser Extension

AdGuard browser extension

AdGuard is a fast and lightweight ad-blocking browser extension that effectively blocks all types of ads and trackers on all web pages. We focus on advanced privacy protection features to not just block known trackers, but prevent web sites from building your shadow profile. Unlike its standalone counterparts (AG for Windows, Mac), the browser extension is completely free and open source. You can learn more about the difference here. AdGuard does not collect any information about you, and does not participate in any acceptable ads program. ...

Downloads: 43 This Week

Last Update: 2026-03-19
See Project
19

Dev Browser

A Claude Skill to give your agent the ability to use a web browser

Dev Browser is a browser automation skill/plugin that enables an AI agent to control a real browser for verification and testing during development. Its purpose is to close the gap between “code was written” and “the UI actually works,” by letting the agent navigate, interact with pages, and validate behavior in a live environment. A key idea is persistence: the browser can keep pages open so the agent can navigate once and then perform multiple interactions across scripts without losing...

Downloads: 3 This Week

Last Update: 2026-04-09
See Project
20

Zenario

Zenario is a web-based content management system (CMS)

Zenario is a web-based content management system (CMS). It can be used for simple sites, with many "wysiwyg" features for making regular web pages, news items, blogs, and so on. It has powerful features for running extranet sites, such as customer portals, and online databases (e.g. of products, documents or videos). It also has multilingual features built in from the core, so that a site can easily be set up to deliver content in multiple languages.

Downloads: 0 This Week

Last Update: 2025-11-12
See Project
21

QueryList

Progressive PHP web crawler framework with jQuery-like DOM parsing

QueryList is an extensible PHP web scraping and crawling framework designed to extract and process data from web pages. It provides a simple and expressive API that allows developers to collect structured information from HTML documents using familiar DOM traversal techniques. It is built on top of phpQuery and uses CSS3 selectors similar to those found in jQuery, making it easy for developers to query and manipulate page elements during scraping tasks.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
22

skycaiji

Open source web scraping system for automated data collection tasks

SkyCaiji is an open source web scraping and data collection system designed to gather information from websites through configurable extraction rules. It focuses on simplifying the process of building crawlers by allowing users to visually define scraping rules rather than writing complex code. It can collect structured or unstructured data from many types of webpages and automate the extraction process for large datasets. SkyCaiji is designed to run on a variety of hosting environments...

Downloads: 2 This Week

Last Update: 17 hours ago
See Project
23

newspaper4k

Python library for scraping and analyzing online news articles easily

...It is a continuation and active fork of the original newspaper3k library, which had stopped receiving updates, with the goal of keeping the ecosystem maintained while adding improvements and bug fixes. It provides developers with tools to automatically download web pages, extract the main article content, and collect associated metadata such as titles, authors, images, and publication dates. Newspaper4k also includes natural language processing capabilities that can generate summaries and identify keywords from extracted article text. Newspaper4k supports both single-article extraction and full news site processing, allowing users to build sources representing entire publications and iterate through their articles. ...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
24

Mink

PHP web browser emulator abstraction

...Mink is commonly used in behavior-driven development workflows, particularly with frameworks like Behat, where it helps simulate real user behavior such as clicking links, filling forms, and navigating pages. The library supports session management, allowing multiple browser sessions to run simultaneously and interact with different pages or environments.

Downloads: 0 This Week

Last Update: 2026-03-17
See Project
25

Dillo

Dillo, a multi-platform graphical web browser

Dillo is a lightweight, minimal graphical web browser, designed for speed, low resource usage, and privacy. It is written in C and C++ using the FLTK (Fast Light Toolkit) GUI library. Its goals include enabling web access on old or constrained hardware, using slow or unreliable network connections, minimizing dependencies, and avoiding many of the complexities and overheads of modern full-featured browsers. It omits many modern features (notably JavaScript), instead focusing on rendering...

Downloads: 26 This Week

Last Update: 2025-09-11
See Project