websites free download

Showing 79 open source projects for "websites"


1

MechanicalSoup

A Python library for automating interaction with websites

A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It doesn't do JavaScript. MechanicalSoup was created by M Hickford, who was a fond user of the Mechanize library. Unfortunately, Mechanize was incompatible with Python 3 until 2019 and its development stalled for several years.

Downloads: 0 This Week

Last Update: 2025-05-30
See Project
2

ScrapeGraphAI

Python scraper based on AI

Extracting content from websites and local documents using LLMs. ScrapeGraphAI is a web scraping Python library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.

Downloads: 4 This Week

Last Update: 5 days ago
See Project
3

django CMS

Easy-to-use and developer-friendly enterprise CMS powered by Django

Create modern websites that content editors love. django CMS was originally conceived by web developers frustrated with the technical and security limitations of other systems. Its lightweight core makes it easy to integrate with other software and put to use immediately, while its ease of use makes it the go-to choice for content managers, content editors and website admins.

Downloads: 2 This Week

Last Update: 2026-03-04
See Project
4

Scrapy

A fast, high-level web crawling and web scraping framework

Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications.

Downloads: 36 This Week

Last Update: 2026-03-18
See Project
5

crawler

Collection of JS reverse engineering examples for web scraping study

crawler is a collection of web scraping and JavaScript reverse engineering examples designed for learning how modern websites protect their data and how those protections can be analyzed. It contains many case studies that demonstrate how to analyze and replicate request parameters, cookies, and encryption logic used by real websites. Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to reproduce API calls programmatically. ...

Downloads: 0 This Week

Last Update: 2 days ago
See Project
6

JobFunnel

Scrape job websites into a single spreadsheet with no duplicates.

Scrape job websites into a single spreadsheet with no duplicates. Automated tool for scraping job postings into a .csv file. You can search for jobs with YAML configuration files or by passing command arguments. By performing regular scraping and reviewing, you can cut through the noise of even the busiest job markets. Run funnel with your settings YAML to populate your master CSV file with jobs from available providers.

Downloads: 0 This Week

Last Update: 2024-09-29
See Project
7

news-please

Python tool for crawling and extracting structured data from news sites

news-please is an open source news crawler and information extraction tool designed to collect and structure articles from online news websites. It provides an integrated pipeline that crawls news sites, retrieves article pages, and extracts structured information such as headlines, authors, publication dates, and article text. news-please can recursively follow internal links and read RSS feeds to gather both recent and archived articles from a news outlet when given only the root URL of a site. ...

Downloads: 3 This Week

Last Update: 4 days ago
See Project
8

LinkChecker

Check links in web documents or full websites

LinkChecker is a free, GPL-licensed website validator. LinkChecker checks links in web documents or full websites. It runs on Python 3 systems and requires Python 3.8 or later. The version in the pip repository may be old; to find out how to get the latest code, plus platform-specific information and other advice, see doc/install.txt in the source code archive. If you do not want to install any additional libraries or dependencies, you can use the Docker image, which is published on GitHub Packages.

Downloads: 0 This Week

Last Update: 2025-07-28
See Project
9

newspaper4k

Python library for scraping and analyzing online news articles easily

Newspaper4k is a Python library designed for extracting, processing, and analyzing news articles from websites. It is a continuation and active fork of the original newspaper3k library, which had stopped receiving updates, with the goal of keeping the ecosystem maintained while adding improvements and bug fixes. It provides developers with tools to automatically download web pages, extract the main article content, and collect associated metadata such as titles, authors, images, and publication dates. ...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
10

img2dataset

Easily turn large sets of image URLs into an image dataset

Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine. Also supports saving captions for URL+caption datasets. Opt-out directives: websites can pass the HTTP headers X-Robots-Tag: noai, X-Robots-Tag: noindex, X-Robots-Tag: noimageai and X-Robots-Tag: noimageindex. By default img2dataset will ignore images with such headers.
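The opt-out rule described above can be illustrated with a small header check. This is only a sketch of the idea, not img2dataset's actual implementation: the function name `is_opted_out` and the set name `OPT_OUT_DIRECTIVES` are hypothetical, and the sketch assumes header values are plain comma-separated directives without a user-agent prefix.

```python
# Directive names taken from the project description; the names below are illustrative.
OPT_OUT_DIRECTIVES = {"noai", "noindex", "noimageai", "noimageindex"}


def is_opted_out(headers):
    """Return True if any X-Robots-Tag header value carries an opt-out directive.

    `headers` is an iterable of (name, value) pairs, so repeated headers work.
    """
    for name, value in headers:
        # Header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        # One header value can hold several comma-separated directives.
        directives = {part.strip().lower() for part in value.split(",")}
        if directives & OPT_OUT_DIRECTIVES:
            return True
    return False


print(is_opted_out([("Content-Type", "image/png"), ("X-Robots-Tag", "noai")]))  # True
print(is_opted_out([("Content-Type", "image/png")]))  # False
```

A downloader following this rule would call such a check on each response and skip the image when it returns True.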

Downloads: 0 This Week

Last Update: 2025-08-09
See Project
11

OnionShare

Securely and anonymously share files of any size

OnionShare is an open source tool that allows you to securely and anonymously share files of any size, host websites, and chat with friends using the Tor network. There's no need for middlemen that could very well violate the privacy and security of the things you share online. With OnionShare, you can share files directly with just an address in Tor Browser. OnionShare works because it is accessible as a Tor Onion Service. All you need to do is open it and drag and drop the files you want to share into it, and start sharing. ...

Downloads: 3 This Week

Last Update: 2025-02-26
See Project
12

CommunityScrapers

This is a public repository containing scrapers

...The repository contains hundreds of scraper definitions written primarily in YAML and Python, each tailored to extract structured metadata such as titles, performers, tags, and media details from specific websites. These scrapers integrate directly into Stash, allowing users to enrich their media libraries with accurate and detailed information without manual entry. The project supports both automatic installation through in-app feeds and manual configuration for advanced use cases. Some scrapers require additional configuration such as API keys or cookies, highlighting its flexibility and adaptability to different sources.

Downloads: 1 This Week

Last Update: 6 days ago
See Project
13

owllook

Vertical novel search engine with unified reading and tracking tools

Owllook is an open source vertical search engine designed for discovering and reading online novels from multiple sources. Instead of redirecting users to different sites, the system parses content from many novel platforms and presents it in a unified reading interface. It focuses on providing a simple and comfortable reading experience with features such as searching for books, following updates, bookmarking chapters, and maintaining a personal bookshelf. It aggregates results from...

Downloads: 2 This Week

Last Update: 4 days ago
See Project
14

OpenWPM

A web privacy measurement framework

OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of websites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection. Check out the instrumentation section below for more details. OpenWPM is tested on Ubuntu 18.04 via TravisCI and is commonly used via the docker container that this repo builds, which is also based on Ubuntu.

Downloads: 1 This Week

Last Update: 2026-03-03
See Project
15

Scrapling

An adaptive Web Scraping framework

Scrapling is an adaptive web scraping framework designed to handle everything from a single HTTP request to large-scale, concurrent crawls. Built for modern websites, it intelligently adapts to structural changes by automatically relocating elements when page layouts update. The framework includes advanced fetchers capable of bypassing anti-bot protections such as Cloudflare Turnstile using stealth and browser automation techniques. Its powerful spider system supports multi-session crawling, pause and resume functionality, and real-time streaming of scraped data. ...

Downloads: 1 This Week

Last Update: 2026-03-08
See Project
16

Wagtail

A Django content management system focused on flexibility & UX

...Designed by developers for developers, Wagtail plays nicely with everything else in your tech stack so you can do more and focus on perfecting your site. Designers will find Wagtail’s simple templating system ideal for building beautiful websites just the way they want, without any CMS constraints. Editors can create beautiful, modular streams of content that they can create once and publish everywhere. Simply put, it’s the CMS that makes everyone happy!

1 Review

Downloads: 1 This Week

Last Update: 2026-03-03
See Project
17

changedetection.io

The best free open source website change detection and restock service

...Monitor out-of-stock products and get alerts when those products are back in stock, get restock alerts via Discord, Slack, email, and many other platforms. Using the browser steps configuration, add basic steps before performing change detection, such as logging into websites, adding a product to a cart, accepting cookie logins, entering dates, and refining searches. Monitor and track PDF file changes, and know when a PDF file has text changes. Know when your favourite product is on sale, or other special deals are announced before anyone else. Detect and monitor changes in JSON API responses.
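One common way to detect a change in a JSON API response is to compare fingerprints of a canonical form of the payload. The sketch below illustrates that general technique under stated assumptions; it is not taken from changedetection.io, and the helper names `fingerprint` and `has_changed` as well as the sample payloads are hypothetical.

```python
import hashlib
import json


def fingerprint(payload: str) -> str:
    """Hash a JSON document in canonical form so key order and spacing do not matter."""
    canonical = json.dumps(json.loads(payload), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous: str, current: str) -> bool:
    """Compare two JSON response bodies by fingerprint."""
    return fingerprint(previous) != fingerprint(current)


# Hypothetical response bodies from two successive checks.
old = '{"sku": "A1", "in_stock": false}'
reordered = '{"in_stock": false, "sku": "A1"}'
restocked = '{"sku": "A1", "in_stock": true}'

print(has_changed(old, reordered))  # False: same data, different key order
print(has_changed(old, restocked))  # True: the in_stock value flipped
```

Hashing the canonical form rather than the raw text avoids false alerts when a server merely reorders keys or changes whitespace.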

Downloads: 1 This Week

Last Update: 3 days ago
See Project
18

spider_collection

Collection of Python web scraping scripts for data extraction tasks

spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages....

Downloads: 0 This Week

Last Update: 4 days ago
See Project
19

Offline HTML Viewer

Fast offline HTML viewer for opening local HTML files on Windows

...Typical use cases:

• Open saved HTML files without using a web browser
• View archived websites offline
• Read documentation stored as HTML files
• Quickly preview local HTML files

Downloads: 45 This Week

Last Update: 2026-03-15
See Project
20

WhakerKit

A seamless toolkit to manage dynamic websites and shared documents

WhakerKit is a versatile toolkit for building websites with both static and dynamic HTML pages, developed by Brigitte Bigi, CNRS. WhakerKit offers seamless management of public and authenticated access, and simplifies document sharing for collaborative environments. It is based on the following technologies:

* Python >= 3.9
* (optional) PyJWT and ldap3 for authentication (install with pip)
* WhakerPy >= 1.3: <https://whakerpy.sourceforge.io> (install with pip)
* Whakerexa >= 0.7: <https://whakerexa.sourceforge.io> (download package and unzip)
* HTML-5, CSS-4 and JS technologies

Downloads: 0 This Week

Last Update: 2026-01-18
See Project
21

YehDown

Yeahdown: Easy-to-use video downloader for Windows

Yeahdown is a straightforward, user-friendly Windows-based application designed to simplify the process of downloading videos and audio from popular websites like YouTube and Vimeo. Perfect for non-technical users, it offers an intuitive interface and fast, reliable downloads. Key features include improved download speeds, support for multiple major video platforms, and real-time updates for new features. Tested on Windows 11.

Downloads: 28 This Week

Last Update: 2025-07-20
See Project
22

Web Link Collector 1000

Automatically collect all links from websites to a clean txt file

## About

Easily and automatically collect all your links into a neat txt list from a particular website or an entire section of a multi-page website network! Web Link Collector 1000 is a simple tool for gathering links from websites with minimal effort. It helps you collect resources for research, create reference lists, or save useful links without manual copying and pasting.

## Features

- Two Collection Modes: Single page or multiple pages of specific website section, or even the entire domain!
- Smart Filtering: Include only same-domain links or gather external links too
- Duplicate Prevention: Automatically removes duplicate links
- Website-Friendly: Uses respectful delays between requests
- Custom File Naming: Save your collections with custom meaningful names
- Modern Interface: Clean design with status updates
- Link Normalization: Standardizes URLs for proper formatting
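Three of the listed features (same-domain filtering, duplicate prevention, and link normalization) can be sketched with the Python standard library alone. This is an illustration of the general approach, not the tool's own code: `LinkParser`, `normalize`, and `collect_links` are hypothetical names, and the normalization rules chosen here (drop fragments, lowercase scheme and host) are an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit, urlunsplit


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def normalize(base_url, href):
    """Resolve a link against the page URL, drop the fragment, lowercase scheme and host."""
    absolute, _fragment = urldefrag(urljoin(base_url, href))
    parts = urlsplit(absolute)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def collect_links(base_url, html, same_domain_only=True):
    """Return normalized, de-duplicated http(s) links in first-seen order."""
    parser = LinkParser()
    parser.feed(html)
    base_host = urlsplit(base_url).netloc.lower()
    seen = {}
    for href in parser.hrefs:
        url = normalize(base_url, href)
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar links
        if same_domain_only and parts.netloc != base_host:
            continue
        seen.setdefault(url, None)  # dict keeps insertion order and drops repeats
    return list(seen)


# Hypothetical page snippet: two links to the same page plus one external link.
page = (
    '<a href="/docs#intro">Docs</a> '
    '<a href="/docs">Docs again</a> '
    '<a href="https://other.example/x">External</a>'
)
print(collect_links("https://example.com/start", page))
# ['https://example.com/docs']
```

Passing `same_domain_only=False` would keep the external link as well; a polite crawler would additionally pause between page requests, which this sketch leaves out.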

Downloads: 0 This Week

Last Update: 2025-07-16
See Project
23

Web Shortcuts

This tool opens websites or links when you press one or more keys on the keyboard, acting as a true shortcut for web pages. When the shortcut keys are pressed, the previously entered site opens in the system's default browser (if the tool does not work after setting the shortcuts, try restarting it).

Downloads: 0 This Week

Last Update: 2025-01-20
See Project
24

Proxy_Pool

Python crawler proxy IP pool (proxy pool)

The main function of the crawler proxy IP pool project is to regularly collect free proxies published on the Internet, verify and store them, and periodically re-verify the stored proxies to ensure they remain available. It provides both an API and a CLI. You can also add proxy sources to increase the quality and quantity of IPs in the pool.
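The collect, verify, store, and re-verify cycle described above can be sketched as a small in-memory class. This is a minimal illustration under stated assumptions, not Proxy_Pool's actual code or API: the `ProxyPool` class, its method names, and the sample addresses are hypothetical, and the connectivity test is replaced by a caller-supplied function.

```python
class ProxyPool:
    """Keep only proxies that pass a caller-supplied check."""

    def __init__(self, check):
        self._check = check      # callable: proxy address -> bool
        self._proxies = set()

    def add_candidates(self, candidates):
        """Verify newly collected proxies and store the ones that work."""
        for proxy in candidates:
            if self._check(proxy):
                self._proxies.add(proxy)

    def revalidate(self):
        """Re-check stored proxies and drop the ones that stopped working."""
        self._proxies = {p for p in self._proxies if self._check(p)}

    def all(self):
        """Return the stored proxies in sorted order."""
        return sorted(self._proxies)


# Stand-in for a real connectivity test; a real check would send a request through the proxy.
alive = {"10.0.0.1:8080", "10.0.0.2:3128"}
pool = ProxyPool(check=lambda proxy: proxy in alive)

pool.add_candidates(["10.0.0.1:8080", "10.0.0.2:3128", "10.0.0.3:80"])
print(pool.all())  # ['10.0.0.1:8080', '10.0.0.2:3128']

alive.discard("10.0.0.2:3128")  # simulate a proxy going offline
pool.revalidate()
print(pool.all())  # ['10.0.0.1:8080']
```

Injecting the check as a function keeps the pool logic testable without network access; a scheduler would call `add_candidates` and `revalidate` at regular intervals.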

Downloads: 0 This Week

Last Update: 2024-01-08
See Project
25

S.I.P.E.R.

Advanced website blocking and productivity tool

A powerful, user-friendly website blocking and productivity application built with modern GTK 4 and Libadwaita. S.I.P.E.R. helps you maintain focus and productivity by blocking distracting websites with advanced features like Pomodoro focus mode, comprehensive statistics, and multi-language support.

Downloads: 0 This Week

Last Update: 2025-11-11
See Project