Open Source Python Internet Software - Page 4

Python Internet Software

Browse free open source Python Internet Software and projects below.

  • 1
    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages. Several scripts also incorporate multi-threading and proxy usage to improve scraping efficiency and help avoid common anti-scraping limitations. In addition to raw data collection, some spiders include basic data processing and analysis using tools such as pandas and simple visualization with matplotlib. It also contains examples of proxy pool integration and encapsulation to support more reliable crawling when working with sites that enforce request limits.
    Downloads: 3 This Week
  • 2
    ImageFap Gallery Downloader

    Minimalistic ImageFap and xHamster gallery downloader (Python)

    Note: work on this project is currently halted. Check out https://sourceforge.net/projects/imagefap-gallery-dl-cli/?source=directory for a more recent version. ImageFap Gallery Downloader is a Python script for full image gallery downloads on ImageFap and xHamster. xHamster support was added in v0.3. To download user folders in v0.4, open the user profile, follow the 'Galleries' link, and copy the folder link from the sidebar to the clipboard. Download everything from a user by copying the user profile link. The Windows executable was created with py2exe: https://sourceforge.net/projects/py2exe/ Please report bugs and problems.
    Downloads: 18 This Week
  • 3
    Echo HTML Viewer

    Fast offline HTML viewer for opening local HTML files on Windows

    Echo HTML Viewer is a lightweight desktop app for viewing local HTML files without a browser or internet connection. Designed for simplicity and privacy, it lets you open saved web pages, documentation, and archived content in a clean, distraction-free interface.

    Key features:
      • Open HTML files instantly
      • Drag & drop support
      • Fast startup and low resource usage
      • Fully offline: no telemetry, no tracking
      • No background services

    Use cases:
      • View saved websites offline
      • Read HTML documentation
      • Preview local HTML files quickly

    FREE version includes core functionality. PRO version available: upgrade directly inside the app using the built-in “Unlock PRO” button. © 2026 Echo Technology
    Downloads: 67 This Week
  • 4
    Azure SDK for Python

    Active development of the Azure SDK for Python

    This repository is for active development of the Azure SDK for Python. For consumers of the SDK, we recommend visiting our public developer docs or our versioned developer docs. For your convenience, each service has a separate set of libraries that you can choose to use instead of one large Azure package. To get started with a specific library, see the README.md (or README.rst) file located in the library's project folder. The last stable versions of packages provided for use with Azure are production-ready. These libraries provide functionality similar to the Preview ones: they allow you to use, consume, and interact with existing resources, for example to upload a blob. They might not implement the guidelines or have the same feature set as the November releases. They do, however, offer wider coverage of services. A new set of management libraries that follow the Azure SDK Design Guidelines for Python is now available.
    Downloads: 2 This Week
  • 5
    Checkov

    Prevent cloud misconfigurations during build-time for Terraform

    Checkov scans cloud infrastructure configurations to find misconfigurations before they're deployed. Checkov uses a common command-line interface to manage and analyze infrastructure as code (IaC) scan results across platforms such as Terraform, CloudFormation, Kubernetes, Helm, ARM Templates, and the Serverless framework. Verify changes to hundreds of supported resource types in all major cloud providers. Checkov supports developers using Terraform, Terraform plan, CloudFormation, Kubernetes, ARM Templates, Serverless, Helm, and AWS CDK. Scan cloud resources at build time for misconfigured attributes with a simple Python policy-as-code framework. Analyze relationships between cloud resources using Checkov’s graph-based YAML policies. Execute, test, and modify runner parameters in the context of a subject repository's CI/CD and version control integrations.
    Downloads: 2 This Week
  • 6
    CommunityScrapers

    This is a public repository containing scrapers

    Stash Community Scrapers is a large open-source collection of metadata extraction tools designed to work with the Stash media management platform, enabling automated scraping of content information from various online sources. The repository contains hundreds of scraper definitions written primarily in YAML and Python, each tailored to extract structured metadata such as titles, performers, tags, and media details from specific websites. These scrapers integrate directly into Stash, allowing users to enrich their media libraries with accurate and detailed information without manual entry. The project supports both automatic installation through in-app feeds and manual configuration for advanced use cases. Some scrapers require additional configuration such as API keys or cookies, highlighting its flexibility and adaptability to different sources.
    Downloads: 2 This Week
  • 7
    DDPM-CD

    Remote sensing change detection using denoising diffusion models

    This is the PyTorch implementation of Remote Sensing Change Detection using Denoising Diffusion Probabilistic Models. The generated images contain objects that we commonly see in real remote sensing images, such as buildings, trees, roads, vegetation, and water surfaces, demonstrating the powerful ability of diffusion models to extract key semantics that can be further used in remote sensing change detection. We fine-tune a lightweight change detection head that takes multi-level feature representations from the pre-trained diffusion model as inputs and outputs a change prediction map.
    Downloads: 2 This Week
  • 8
    DecryptLogin

    Python library providing APIs for automated website login workflows

    DecryptLogin is a Python library designed to simplify automated login processes for many popular websites by providing ready-to-use APIs that simulate authentication behavior. It focuses on implementing login mechanisms through HTTP requests, allowing developers to programmatically authenticate with supported services without manually replicating complex login flows. It includes modules that handle different authentication modes such as PC login, mobile login, and QR code login depending on what the target platform supports. DecryptLogin supports a wide variety of online services and platforms, including social media sites, developer platforms, cloud services, and other web portals. Developers can integrate these login routines into automation scripts, crawlers, or data collection tools that require authenticated sessions. It also provides example utilities and automation scripts demonstrating how the login APIs can be used in practical scenarios.
    Downloads: 2 This Week
  • 9
    FinalRecon

    All-in-one Python web reconnaissance tool for fast target analysis

    FinalRecon is an all-in-one web reconnaissance tool written in Python that helps security professionals gather information about a target website quickly and efficiently. It combines multiple reconnaissance techniques into a single command-line utility so users do not need to run several separate tools to collect similar data. FinalRecon focuses on providing a fast overview of a web target while maintaining accuracy in the collected results. It includes modules for gathering server information, analyzing SSL certificates, performing WHOIS lookups, and crawling website resources. FinalRecon can also enumerate DNS records, discover subdomains, search for directories and files, and scan common network ports. Historical URLs and resources can be retrieved from archived sources to help analyze changes in a website over time. Designed primarily for penetration testers and security researchers, FinalRecon simplifies the reconnaissance phase of security assessments.
    Downloads: 2 This Week
  • 10
    GoogleScraper

    Python tool for scraping search engine results from many providers

    GoogleScraper is a Python-based tool designed to automatically collect and process search engine results from multiple providers. It enables developers and researchers to programmatically query search engines and extract useful information such as links, titles, and result descriptions. GoogleScraper supports several major search engines and can be used to gather structured datasets from search result pages for further analysis. It provides two different scraping approaches: sending direct HTTP requests that simulate browser traffic or controlling real browsers through automation frameworks. By running automated queries and collecting results in bulk, the project can assist with tasks such as SEO research, trend discovery, or building datasets of websites related to specific keywords. GoogleScraper also includes capabilities for running multiple scraping tasks concurrently to improve performance and increase the amount of collected data.
    Downloads: 2 This Week
  • 11
    LocalStack

    Develop and test your cloud apps offline

    LocalStack is a fully functional local AWS cloud stack that enables you to develop and test your cloud and serverless apps offline. It spins up an easy-to-use testing environment on your local machine that has the same APIs and works the same way as the real AWS cloud environment. It can spin up a number of different core Cloud APIs on your local machine, including API Gateway, Kinesis, DynamoDB, Firehose, Lambda, and many others. LocalStack was built on some of today’s best-of-breed mocking/testing tools, combining them, making them interoperable, and adding important functionality such as error injection and pluggable services. All of this happens locally, without ever talking to the cloud.
    Downloads: 2 This Week
  • 12
    Open-CD

    A Change Detection Repo Standing on the Shoulders of Giants

    Open-CD is an open source change detection toolbox based on a series of open source general vision task tools.
    Downloads: 2 This Week
  • 13
    PostHog

    PostHog provides open-source web & product analytics

    PostHog is an all‑in‑one open‑source platform for product and web analytics—offering event-based analytics, session recording, feature flagging, A/B testing, cohorts, and more—that you can self‑host, with full support for data privacy and enterprise compliance. Sync data from external tools like Stripe, Hubspot, your data warehouse, and more. Query it alongside your product data. Run custom filters and transformations on your incoming data. Send it to 25+ tools or any webhook in real time or batch export large amounts to your warehouse. Capture traces, generations, latency, and cost for your LLM-powered app.
    Downloads: 2 This Week
  • 14
    Pydoll

    Async Python library for automating Chromium browsers without WebDriver

    Pydoll is a Python library designed for automating Chromium-based web browsers such as Chrome and Edge without relying on a traditional WebDriver layer. Instead of using external drivers, it connects directly to the Chrome DevTools Protocol through WebSocket, allowing scripts to control browser behavior more efficiently and with fewer compatibility issues. It provides a high-level API that simplifies common browser automation tasks while still offering access to low-level protocol features for advanced control. Its architecture is built around asynchronous programming using Python’s asyncio framework, enabling concurrent automation of multiple tabs and browser contexts. Pydoll also includes tools for monitoring and intercepting network traffic, allowing developers to analyze or modify requests and responses during automation workflows. It emphasizes realistic interactions and fingerprint management to reduce the likelihood that automated actions are detected.
    Downloads: 2 This Week
  • 15
    Python API for JMComic

    Python crawler and API for downloading JMComic albums and images

    JMComic-Crawler-Python is a Python library and crawler framework designed to programmatically access and download comic content from the JMComic platform. It provides a structured API that allows developers to retrieve albums, chapters, and images using simple Python code while handling the necessary network requests and data processing behind the scenes. It supports both web-based and mobile API interfaces, enabling flexible interaction with the platform depending on the available endpoints. Its architecture includes components for configuration management, download orchestration, and client communication, allowing users to automate the retrieval of manga chapters or entire albums. It includes command-line functionality and configuration files so users can customize download behavior, directory structures, and performance settings without modifying code. It also supports plugin-based extensions that allow additional processing.
    Downloads: 2 This Week
  • 16
    SMTP Tunnel Proxy

    A high-speed covert tunnel that disguises TCP traffic as SMTP email

    SMTP Tunnel Proxy is a high-speed covert tunneling proxy that disguises regular TCP traffic as legitimate SMTP email communication to evade deep packet inspection (DPI) firewalls and censorship systems. It implements a SOCKS5 proxy interface on the client that wraps outbound traffic into an SMTP-like handshake (EHLO, STARTTLS, AUTH) and encrypted payload, making the session appear to DPI systems as a normal email exchange. The tool supports modern TLS encryption (STARTTLS) with HMAC-SHA256 authentication, per-user secrets, IP whitelisting, and multiplexed connections over a single tunnel. With a simple installer and systemd service setup, users can quickly deploy it on a Linux VPS as a tunneled access point, then connect from Windows, macOS, or Linux clients using auto-generated client packages. The project is useful for environments where standard VPNs or proxies are blocked, offering a combination of performance and stealth.
    Downloads: 2 This Week
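
    The HMAC-SHA256 authentication with per-user secrets mentioned above can be illustrated with a minimal sketch. This is not SMTP Tunnel Proxy's actual wire format: the token layout (username:timestamp:digest), the function names, and the 300-second skew window are all assumptions made for illustration.

```python
import hashlib
import hmac


def make_auth_token(username: str, secret: bytes, timestamp: int) -> str:
    """Build a hypothetical 'username:timestamp:digest' token signed with HMAC-SHA256."""
    message = f"{username}:{timestamp}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{username}:{timestamp}:{digest}"


def verify_auth_token(token: str, secrets: dict, now: int, max_skew: int = 300) -> bool:
    """Check a token against the per-user secret table (username -> secret bytes)."""
    try:
        username, ts_text, digest = token.split(":")
        timestamp = int(ts_text)
    except ValueError:
        return False  # wrong number of fields or non-numeric timestamp
    secret = secrets.get(username)
    if secret is None:
        return False  # unknown user
    if abs(now - timestamp) > max_skew:
        return False  # token is too old or too far in the future
    message = f"{username}:{timestamp}".encode("utf-8")
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), digest.encode("utf-8"))
```

    Including a timestamp in the signed message is one simple way to limit replay of captured tokens; whether the project does this is not stated in the description.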
  • 17
    Weibo Crawler

    Python crawler for collecting and downloading Sina Weibo user data

    weibo-crawler is a Python-based data collection tool designed to retrieve information from Sina Weibo user accounts. It automates the process of gathering posts, user profile details, and engagement metrics from one or more target accounts. weibo-crawler can extract comprehensive information about users, including profile attributes such as nickname, follower count, following count, and account metadata. It also captures detailed data about each post, including the content, publishing time, topics, mentions, likes, reposts, and comments. In addition to textual data, the project can download original media from posts, such as images, videos, and Live Photo content. Collected data can be exported to structured formats such as CSV or JSON or stored in databases for further analysis and research. It supports incremental crawling so users can periodically collect only newly published posts, making it useful for ongoing monitoring or dataset updates.
    Downloads: 2 This Week
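
    Incremental crawling, as described above, amounts to remembering the newest post already collected and keeping only posts newer than it. The sketch below is not weibo-crawler's code: the `id` field, the function name, and the assumption that post IDs increase over time are hypothetical.

```python
def select_new_posts(posts, last_seen_id):
    """Return (new_posts, updated_last_seen_id) for one incremental crawl pass.

    `posts` is a list of dicts with a numeric 'id' key (a hypothetical schema);
    IDs are assumed to grow with publishing time.
    """
    new_posts = [post for post in posts if post["id"] > last_seen_id]
    # If nothing new arrived, keep the previous marker unchanged.
    updated = max((post["id"] for post in new_posts), default=last_seen_id)
    return new_posts, updated
```

    Persisting the returned marker between runs (for example in a small JSON file) lets each run fetch only newly published posts.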
  • 18
    autocrawler

    Multiprocess Selenium crawler for downloading images by keywords

    AutoCrawler is a Python-based image crawling tool designed to automatically download large numbers of images from search engines using automated browser interaction. It uses Selenium and a Chrome browser driver to navigate image search pages and collect image sources based on keywords provided by the user. AutoCrawler supports multiprocess and multithreaded downloading, which allows it to retrieve images faster by running several tasks simultaneously. Users provide search terms through a simple keyword file, and the crawler organizes downloaded images into directories for each keyword. It can download either thumbnails or full resolution images and supports multiple image formats such as JPG, GIF, and PNG. It also includes configuration options such as headless mode, download limits, proxy usage, and thread count to customize crawling behavior.
    Downloads: 2 This Week
  • 19
    bilibili-manga-downloader

    Download and manage Bilibili Manga chapters with GUI downloader

    BiliBili-Manga-Downloader is an open source desktop application designed to download manga chapters from the Bilibili Manga platform for offline reading and local management. It was created to address limitations of the web reading experience, such as intrusive advertisements, inconvenient image zooming, and inconsistent navigation during reading sessions. It provides a graphical user interface that allows users to search for manga titles using keywords, view detailed information about available series, and select chapters to download. BiliBili-Manga-Downloader supports multi-threaded downloading to improve performance and includes progress tracking with estimated time remaining for active downloads. It also offers multiple output formats, allowing chapters to be saved as image folders or compressed comic archive formats suitable for local manga readers.
    Downloads: 2 This Week
  • 20
    crawler

    Collection of JS reverse engineering examples for web scraping study

    crawler is a collection of web scraping and JavaScript reverse engineering examples designed for learning how modern websites protect their data and how those protections can be analyzed. It contains many case studies that demonstrate how to analyze and replicate request parameters, cookies, and encryption logic used by real websites. Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to reproduce API calls programmatically. Many examples illustrate techniques such as debugging scripts, intercepting requests, analyzing encrypted parameters, and understanding authentication flows. crawler also explores common anti-scraping defenses and demonstrates how developers can examine them through debugging tools and reverse engineering techniques.
    Downloads: 2 This Week
  • 21
    django CMS

    Easy-to-use and developer-friendly enterprise CMS powered by Django

    Create modern websites that content editors love. django CMS was originally conceived by web developers frustrated with the technical and security limitations of other systems. Its lightweight core makes it easy to integrate with other software and put to use immediately, while its ease of use makes it the go-to choice for content managers, content editors and website admins. Developers can integrate other existing Django applications rapidly, or build brand new compatible apps that take advantage of django CMS's publishing and editing features. django CMS is user-friendly and has a very intuitive drag-and-drop interface. It's built around the needs of multi-lingual publishing by default, not as an afterthought: all websites, pages and content can exist in multiple language versions.
    Downloads: 2 This Week
  • 22
    galacteek

    Multi-platform browser for the distributed web

    galacteek is a multi-platform Qt5-based browser and semantic agent for the distributed web. Be sure to install all the GStreamer packages on your system to be able to use the media player. On macOS, after opening/mounting the DMG image, hold Control and click on the galacteek icon, then select Open and accept. You probably need to allow the system to install applications from anywhere in the security settings. Docker images are available. They run the full GUI inside a virtual Xorg server (using Xvfb). A VNC server runs on TCP port 5900 of the container; use a regular VNC client to access the interface. The password to access the VNC service is printed to the console when starting the container.
    Downloads: 2 This Week
  • 23
    grab-site

    Web crawler for archiving and backing up sites into WARC archives

    grab-site is an open source web crawling tool designed to archive and back up websites by recursively downloading their content. It works by taking a starting URL and systematically following links across the site, capturing pages and resources and saving them into WARC archive files for long-term preservation. Internally, the crawler uses a fork of the wpull engine to fetch and process web pages efficiently during large-scale crawls. grab-site includes a built-in dashboard that displays real-time crawl activity, including which URLs are currently being processed and how many remain in the queue. Users can dynamically apply ignore patterns during an active crawl, allowing them to skip problematic or unnecessary URLs that could slow down or block the archiving process. grab-site also provides predefined ignore sets for common site structures such as forums and other complex web platforms. Additional mechanisms like duplicate page detection help avoid re-crawling identical content.
    Downloads: 2 This Week
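
    The idea of applying ignore patterns while a crawl is already running can be sketched as a URL queue that checks each URL against the current pattern list at dequeue time. This is an illustration only, not grab-site's or wpull's implementation; the `CrawlQueue` class and its method names are hypothetical.

```python
import re
from collections import deque


class CrawlQueue:
    """A FIFO URL queue whose ignore patterns can be extended mid-crawl."""

    def __init__(self, urls):
        self.pending = deque(urls)
        self.ignores = []  # compiled regular expressions

    def add_ignore(self, pattern: str) -> None:
        """Register a regex; it applies to every URL still in the queue."""
        self.ignores.append(re.compile(pattern))

    def next_url(self):
        """Return the next URL that matches no ignore pattern, or None when done."""
        while self.pending:
            url = self.pending.popleft()
            if not any(rx.search(url) for rx in self.ignores):
                return url
        return None
```

    Because patterns are checked when a URL is dequeued rather than when it is queued, a pattern added partway through still filters URLs that were discovered earlier.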
  • 24
    jd-autobuy

    Python tool that automates JD.com login and product purchase tasks

    jd-autobuy is an open source Python-based automation tool designed to simulate the purchasing process on the JD e-commerce platform. It uses web scraping and HTTP request techniques to log into an account, check product availability, and attempt to purchase specified items automatically. It supports login through methods such as QR code authentication, allowing users to sign in through the platform’s mobile application. Once authenticated, the script can retrieve product details including price, stock status, and item information. It can automatically add items to the shopping cart and prepare an order submission workflow for faster purchasing during high-demand sales or limited stock releases. Users can configure parameters such as the product ID, quantity, refresh interval, and purchase behavior using command-line options. jd-autobuy is intended primarily for learning purposes and demonstrates how automated scripts can interact with web services and online shopping systems.
    Downloads: 2 This Week
  • 25
    mlscraper

    ML-based HTML scraper that learns extraction rules from examples

    mlscraper is a Python library designed to automatically extract structured data from HTML pages without requiring developers to manually write CSS selectors or XPath rules. Instead of defining extraction logic by hand, users provide a few examples of the data they want to retrieve from a webpage. It analyzes those examples within the HTML document and determines patterns or rules that can be used to extract the same type of information from similar pages. Once trained, the generated scraper can process new pages and return the extracted data in structured formats such as dictionaries or lists. This approach simplifies web scraping tasks by shifting the focus from rule-writing to example-based training. Internally, the project processes HTML documents, identifies relevant elements in the DOM, and builds extraction logic based on statistical or heuristic analysis of the training samples. The result is a developer-oriented tool that aims to automate common scraping workflows.
    Downloads: 2 This Week
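
    Example-based extraction can be illustrated with a heavily simplified sketch using only the standard library: find which element holds the example value on a training page, then reuse that element's tag and class as the rule on similar pages. This is not mlscraper's API or algorithm; `learn_rule`, `apply_rule`, and the (tag, class) rule format are hypothetical.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Record ((tag, class), text) for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements as (tag, class) pairs
        self.found = []   # ((tag, class), text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.stack.append((tag, dict(attrs).get("class")))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating unclosed elements.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.found.append((self.stack[-1], text))


def _text_nodes(html):
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return parser.found


def learn_rule(html, example):
    """Return the (tag, class) of the first element whose text equals the example."""
    for rule, text in _text_nodes(html):
        if text == example:
            return rule
    raise ValueError(f"example {example!r} not found in training page")


def apply_rule(html, rule):
    """Extract the text of every element matching the learned (tag, class) rule."""
    return [text for found_rule, text in _text_nodes(html) if found_rule == rule]
```

    For instance, learning from a page where the example "$12" sits in a span with class "price" yields a rule that pulls the price from other pages with the same markup. A real tool would need more robust rules than a single tag and class.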