Showing 83 open source projects for "linux proxy scraper"

  • 1

    Scrapling

    An adaptive Web Scraping framework

    Scrapling is an adaptive web scraping framework designed to handle everything from a single HTTP request to large-scale, concurrent crawls. Built for modern websites, it intelligently adapts to structural changes by automatically relocating elements when page layouts update. The framework includes advanced fetchers capable of bypassing anti-bot protections such as Cloudflare Turnstile using stealth and browser automation techniques. Its powerful spider system supports multi-session crawling,...
    Downloads: 0 This Week
  • 2

    python-proxy-headers

    Handle custom proxy headers when making HTTPS requests in Python

    The python-proxy-headers package provides support for handling custom proxy headers when making HTTPS requests in various Python modules. We currently provide extensions to the following packages: urllib3, requests, aiohttp, and httpx. None of these modules provide good support for parsing custom response headers from proxy servers, and some of them make it hard to send custom headers to proxy servers. So we at ProxyMesh made these extension modules to support our customers that use...
    Downloads: 0 This Week
  • 3

    scrapy-proxy-headers

    Handle custom proxy headers when making HTTPS requests in Scrapy

    The scrapy-proxy-headers package is designed for adding proxy headers to HTTPS requests in Scrapy. In normal usage, custom headers put in request.headers cannot be read by a proxy when you make an HTTPS request, because the headers are encrypted and passed through the proxy tunnel along with the rest of the request body. You can read more about this at Proxy Server Requests over HTTPS. Because Scrapy does not have a good way to pass custom headers to a proxy when you make HTTPS...
    Downloads: 0 This Week
  • 4

    http-proxy-tunnel

    Create nested tunnels through HTTP proxies

    Http-proxy-tunnel creates TCP tunnels through HTTP proxies that permit the CONNECT method. It differs from other proxy tunnelling programs in that it can tunnel through multiple proxies and can use SSL tunnels. These abilities mean that, in combination with a web server that can proxy (such as Apache), you can serve normal web pages from ports 80 and 443 and connect to the server (using SSH, say) via those ports at the same time. All available documentation can be read online at...
    Downloads: 0 This Week
  • 5

    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from the terminal by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.
    Downloads: 0 This Week
  • 6

    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages....
    Downloads: 0 This Week
  • 7

    autocrawler

    Multiprocess Selenium crawler for downloading images by keywords

    AutoCrawler is a Python-based image crawling tool designed to automatically download large numbers of images from search engines using automated browser interaction. It uses Selenium and a Chrome browser driver to navigate image search pages and collect image sources based on keywords provided by the user. AutoCrawler supports multiprocess and multithreaded downloading, which allows it to retrieve images faster by running several tasks simultaneously. Users provide search terms through a...
    Downloads: 0 This Week
  • 8

    CacheGuard Gateway

    Free UTM appliance: firewall, VPN, WAF and antivirus in one ISO.

    Securing your network should not require an enterprise budget. CacheGuard is a free open-source network security appliance for startups and growing businesses that need serious protection without the complexity. Install CacheGuard-OS on any x86 machine or VM and get a complete security gateway in under an hour. No plug-ins, no compatibility issues. Everything works out of the box. CacheGuard-OS is not an app, it IS the OS. A fully custom network appliance operating system built from...
    Downloads: 9 This Week
  • 9

    python-proxy

    HTTP/HTTP2/HTTP3/Socks4/Socks5/Shadowsocks/ShadowsocksR/SSH

    python-proxy, also known as pproxy, is a lightweight proxy tool written in Python for flexible local and remote traffic forwarding. It supports multiple proxy protocols, making it useful for developers, testers, and network administrators who need a compact proxy layer without a heavy service stack. The project can operate as a client, server, forward proxy, reverse proxy, or protocol bridge depending on how it is configured. It supports HTTP, SOCKS4, SOCKS5, Shadowsocks, and newer transport...
    Downloads: 0 This Week
  • 10

    vido

    Video/Audio Downloader frontend for youtube-dl

    Vido is a Video/Audio Downloader frontend for the popular YouTube downloader youtube-dl, a rewrite of ytd-gtk by the same team, updated to Python 3 using PyGObject and GTK+. Vido now uses yt-dlp, an updated fork of youtube-dl, for downloading videos. yt-dlp is faster and supports more sites than the original youtube-dl. The program has been tested only on Linux, and installation instructions for Linux are provided on our wiki. We do not provide support for Windows/Mac...
    Downloads: 22 This Week
  • 11

    Proxy_Pool

    Python crawler proxy IP pool (proxy pool)

    The crawler proxy IP pool project periodically collects free proxies published on the Internet, verifies them, and stores them. It re-verifies the stored proxies on a schedule to ensure they remain available, and provides an API and a CLI. You can also add proxy sources to increase the quality and quantity of IPs in the pool.
    Downloads: 0 This Week
  • 12

    ddgr

    DuckDuckGo from the terminal

    ddgr is a cmdline utility to search DuckDuckGo from the terminal. While googler is highly popular among cmdline users, in many forums the need for a similar utility for privacy-aware DuckDuckGo came up. DuckDuckGo Bangs are super-cool too! So here's ddgr for you! Unlike the web interface, you can specify the number of search results you would like to see per page. It's more convenient than skimming through 30-odd search results per page. The default interface is carefully designed to use...
    Downloads: 1 This Week
  • 13

    SCFProxy

    A proxy tool based on cloud function

    SCFProxy is a tool to implement HTTP proxy, SOCKS proxy, and reverse proxy based on cloud function and API gateway provided by several cloud service providers.
    Downloads: 0 This Week
  • 14

    AutoScraper

    A Smart, Automatic, Fast and Lightweight Web Scraper for Python

    This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.
    Downloads: 0 This Week
  • 15

    mlscraper

    ML-based HTML scraper that learns extraction rules from examples

    mlscraper is a Python library designed to automatically extract structured data from HTML pages without requiring developers to manually write CSS selectors or XPath rules. Instead of defining extraction logic by hand, users provide a few examples of the data they want to retrieve from a webpage. It analyzes those examples within the HTML document and determines patterns or rules that can be used to extract the same type of information from similar pages. Once trained, the generated scraper...
    Downloads: 5 This Week
  • 16

    pspider

    Simple Python framework for building multithreaded web crawlers

    PSpider is a lightweight web crawling framework written in Python designed to simplify the development of custom web spiders. It focuses on providing an easy-to-understand architecture while still supporting concurrent crawling for improved performance. It uses a multithreaded model that separates the crawling workflow into several components responsible for fetching, parsing, and saving data. Tasks are managed through queues, allowing different parts of the crawler to process work...
    Downloads: 0 This Week
  • 17

    Scylla

    Intelligent proxy pool for collecting and managing public proxies

    Scylla is an open source proxy pool system designed to collect, validate, and manage large numbers of public proxy servers for use in web scraping and data extraction workflows. It automatically crawls the internet to discover proxy IP addresses and evaluates their availability and reliability before adding them to a usable pool. It includes a JSON API that allows developers and applications to retrieve proxy information programmatically, making it easier to integrate proxy rotation into...
    Downloads: 22 This Week
  • 18

    googler

    Google Search, Google Site Search, Google News from the terminal

    googler is a power tool to Google (Web & News) and Google Site Search from the command-line. It shows the title, URL and abstract for each result, which can be directly opened in a browser from the terminal. Results are fetched in pages (with page navigation). Supports sequential searches in a single googler instance. googler was initially written to cater to headless servers without X. You can integrate it with a text-based browser. However, it has grown into a very handy and flexible...
    Downloads: 0 This Week
  • 19

    GoogleScraper

    Python tool for scraping search engine results from many providers

    GoogleScraper is a Python-based tool designed to automatically collect and process search engine results from multiple providers. It enables developers and researchers to programmatically query search engines and extract useful information such as links, titles, and result descriptions. GoogleScraper supports several major search engines and can be used to gather structured datasets from search result pages for further analysis. It provides two different scraping approaches: sending direct...
    Downloads: 1 This Week
  • 20

    Streisand

    Streisand sets up a new server running your choice

    Streisand is a tool that automates the deployment of censorship-resistant VPN services on cloud servers. It was created to help users bypass internet censorship and surveillance by quickly setting up secure communication channels without needing deep system administration expertise. With just a cloud provider account and basic Unix command-line knowledge, Streisand can provision a server and configure multiple VPN and proxy protocols almost automatically. This includes OpenVPN, WireGuard,...
    Downloads: 182 This Week
  • 21

    v2rayL

    v2ray linux GUI

    V2Ray is a tool under Project V. Project V includes a series of tools to help you create your own customized network system, and V2Ray is the core one. Simply put, V2Ray is proxy software similar to Shadowsocks, but with more advantages. v2rayL is a V2Ray Linux client with a GUI written in PyQt5; the core is based on v2ray-core (v2ray-linux-64). vmess supports websocket, mKcp, and tcp. The current program may contain bugs that testing has not yet uncovered. If you find bugs during use, please submit them as an issue so they can be fixed. ...
    Downloads: 0 This Week
  • 22

    django-dynamic-scraper

    Creating Scrapy scrapers via the Django admin interface

    Django Dynamic Scraper (DDS) is an app for Django built on top of the scraping framework Scrapy. While preserving many of the features of Scrapy, it lets you dynamically create and manage spiders via the Django admin interface. With Django Dynamic Scraper (DDS) you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project. Since it simplifies things, DDS is not usable for all kinds of scrapers, but...
    Downloads: 0 This Week
  • 23

    PivotSuite

    Network Pivoting Toolkit

    PivotSuite is a portable, platform-independent, and powerful network pivoting toolkit that helps red teamers and penetration testers use a compromised system to move around inside a network. It is a standalone utility that can be used as a server or as a client. If the compromised host is directly accessible (forward connection) from our pentest machine, then we can run pivotsuite as a server on the compromised machine and access the different subnet hosts from our pentest machine, which was...
    Downloads: 0 This Week
  • 24

    ProxyBroker

    Asynchronous tool for finding and checking public proxy servers

    ProxyBroker is an open source Python tool designed to automatically discover and verify public proxy servers from many online sources. It operates asynchronously, allowing it to gather and test large numbers of proxies efficiently while performing multiple checks concurrently. It collects proxy addresses from dozens of providers and evaluates whether they are functional and suitable for use. It supports several proxy protocols, including HTTP, HTTPS, SOCKS4, and SOCKS5, making it flexible...
    Downloads: 1 This Week
  • 25

    Jupyter Server Proxy

    Jupyter notebook server extension to proxy web services.

    Jupyter Server Proxy lets you run arbitrary external processes (such as RStudio, Shiny Server, Syncthing, PostgreSQL, Code Server, etc) alongside your notebook server and provide authenticated web access to them using a path like /rstudio next to others like /lab. Alongside the Python package that provides the main functionality, the JupyterLab extension (@jupyterhub/jupyter-server-proxy) provides buttons in the JupyterLab launcher window to get to RStudio for example.
    Downloads: 0 This Week