Web Scrapers for Windows

View 34 business solutions
  • Our Free Plans just got better! | Auth0 by Okta Icon
    Our Free Plans just got better! | Auth0 by Okta

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your secuirty. Auth0 now, thank yourself later.
    Try free now
  • Bright Data - All in One Platform for Proxies and Web Scraping Icon
    Bright Data - All in One Platform for Proxies and Web Scraping

    Say goodbye to blocks, restrictions, and CAPTCHAs

    Bright Data offers the highest quality proxies with automated session management, IP rotation, and advanced web unlocking technology. Enjoy reliable, fast performance with easy integration, a user-friendly dashboard, and enterprise-grade scaling. Powered by ethically-sourced residential IPs for seamless web scraping.
    Get Started
  • 1

    Rapid Reference

    An extension that allows for hassle-free website citation/referencing.

    Please do not distribute with the goal of selling my program. How to attach to your Chrome/Edge/Brave etc Browser: 1. Download the extension(rapidreference.zip) 2. Extract 3. Go to chrome://extensions if on Chrome, or navigate to your extension management setting in your browser 4. Enable developer mode (usually top right) 5. Add unpacked extension 6. Choose the extracted extension's folder 7. There you go! How to use: 1. Start a session in the panel of the extension 2. Do required research/website navigation 3. You can check the preview of the cited references in the panel 4. You can copy the citation list to the clipboard through simply clicking the "Copy Citation" button 5. You're Done! Thanks for using my software 😊
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    YouTube video web scraper 2 [ISA]

    YouTube video web scraper 2 [ISA]

    YouTube video web scraper 2 [Improved.Simplified.Alternative]

    'YouTube video web scraper 2' is an desktop application developed using python 3.11.4 and other add-on libaries. Finds YouTube video based on user request and view as table. Export the table as excel. Compatible only for windows OS.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    This 5 generation selenium web crawler crawl through web page of a host website searching for static and dynamic links and able to detect honeypot links.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4

    twitch-batch-downloader

    Automate the download of entire Twitch.tv channels

    Automate the download of entire Twitch.tv channels with its metadata. Save each Twitch video into its own folder, with date and time values, video ID, stream metadata, frame screenshot, .ts parts list and sha256 hash. Keep the original ts files and generate mp4 files from them. It requires a shell and some command line utilities. See README.md for details in the Code/git section.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Nectar: Employee Recognition Software to Build Great Culture Icon
    Nectar: Employee Recognition Software to Build Great Culture

    Nectar is an employee recognition software built for the modern workforce.

    Our 360 recognition & rewards platform enables everyone (peer to peer & manager to employees alike) to send meaningful recognition rooted in core values. Nectar has the most extensive rewards catalog so users can choose from company branded swag, Amazon products, gift cards or custom reward types. Integrate with your other tools like Slack and Teams to make sending recognition easy. We support top organizations like MLB, SHRM, Redfin, Heineken and more.
    Learn More
  • 5
    PHP, mySQL and AJAX web crawler
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Es un software diseñado para suplir la necesidad de algunas personas de tener un Web Crawler o Spider duro, navega de forma automática por los diferentes sitios o paginas Web, extrayendo los enlaces a otras paginas.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    APC Anti Crawler is a php5 class based on APC which can be used to limit the amount of http request per IP. It stop web crawler to download your entire website.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Robust featureful multi-threaded CLI web spider using apache commons httpclient v3.0 written in java. ASpider downloads any files matching your given mime-types from a website. Tries to reg.exp. match emails by default, logging all results using log4j.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    ApeSmit is a very simple Python module to create XML sitemaps as defined at http://www.sitemaps.org. ApeSmit doesn’t contain any web spider or something like that, it just writes the data you provide to a file using the proper syntax.
    Downloads: 0 This Week
    Last Update:
    See Project
  • ManageEngine Endpoint Central for IT Professionals Icon
    ManageEngine Endpoint Central for IT Professionals

    A one-stop Unified Endpoint Management (UEM) solution

    ManageEngine's Endpoint Central is a Unified Endpoint Management Solution, that takes care of enterprise mobility management (including all features of mobile application management and mobile device management), as well as client management for a diversified range of endpoints - mobile devices, laptops, computers, tablets, server machines etc. With ManageEngine Endpoint Central, users can automate their regular desktop management routines like distributing software, installing patches, managing IT assets, imaging and deploying OS, and more.
    Learn More
  • 10
    Arachnid is a Java-based web spider framework. It includes a simple HTML parser object that parses an input stream containing HTML content. Simple Web spiders can be created by sub-classing Arachnid and adding a few lines of code called after each page
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Aracnis is a Java based framework for building distributed web spiders. These spiders can be used to accomplish a variety of tasks, for example, screen-scraping and link integrity checking.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Ayakashi

    Ayakashi

    The next generation web scraping framework

    The next-generation web scraping framework. The web has changed. Gone are the days when raw HTML parsing scripts were the proper tool for the job. Javascript and single-page applications are now the norms. Demand for data scraping and automation is higher than ever, from business needs to data science and machine learning. Our tools need to evolve. Ayakashi helps you build scraping and automation systems that are easy to build simple or sophisticated, highly performant, maintainable, and built for change. Ayakashi's way of finding things in the page and using them is done with props and domQL. Directly inspired by the relational database world (and SQL), domQL makes DOM access easy and readable no matter how obscure the page's structure is. Props are the way to package domQL expressions as re-usable structures which can then be passed around to actions or to be used as models for data extraction.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    BTV Rename
    The goal of this project is 100% TVDB recognition and SxxExx renaming of all files generated by BeyondTV while maintaining the BeyondTV database. Currently the project relies on TVrage and scans the entire folder which is selected by the user in the conf
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Blackfire Player

    Blackfire Player

    Web Crawling, Web Testing, and Web Scraping application

    Blackfire Player is a powerful Web Crawling, Web Testing, and Web Scraper application. It provides a nice DSL to crawl HTTP services, assert responses, and extract data from HTML/XML/JSON responses. Some Blackfire Player use cases: Crawl a website/API and check expectations -- aka Acceptance Tests; Scrape a website/API and extract values; Monitor a website; Test code with unit test integration (PHPUnit, Behat, Codeception, ...); Test code behavior from the outside thanks to the native Blackfire Profiler integration -- aka Unit Tests from the HTTP layer (tm). Blackfire Player executes scenarios written in a special DSL (files should end with .bkf).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    This is simple link checker. It can crawl any site and help to find broken links. It also having download CSV report option.The CSV file includes url ,parent page url and status of page [broken or ok]. It is be very useful for search engine optimization.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    CEF Python

    CEF Python

    Python bindings for the Chromium Embedded Framework (CEF)

    Python bindings for the Chromium Embedded Framework (CEF). CEF Python is an open source project founded by Czarek Tomczak in 2012 to provide Python bindings for the Chromium Embedded Framework (CEF). The Chromium project focuses mainly on Google Chrome application development while CEF focuses on facilitating embedded browser use cases in third-party applications. Lots of applications use CEF control, there are more than 100 million CEF instances installed around the world. There are numerous use cases for CEF. Use it as a modern HTML5 based rendering engine that can act as a replacement for classic desktop GUI frameworks. Think of it as Electron for Python. Embed a web browser widget in a classic Qt / GTK / wxPython desktop application. Use it for automated testing of web applications with more advanced capabilities than Selenium web browser automation due to CEF low level programming APIs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Constellio Enterprise Search engine

    Constellio Enterprise Search engine

    Open source Search Engine and Enterprise Search

    Constellio is an enterprise search engine that allows companies to search all their organization's information through a single interface (Web, CRM, ERP, ECM, Mail etc.). Constellio is Based on Apache Solr and Google Search Appliance's connector. Constellio has a powerful web crawler.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Crawler.NET is a component-based distributed framework for web traversal intended for the .NET platform. It comprises of loosely coupled units each realizing a specific web crawler task. The main design goals are efficiency and flexibility.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    A web crawler for crawling news from user specific websites and keywords using PHP and third party software
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is an opensource free software for statistical analysis, data visualization, text analysis, and predictive analytics. Newer version and smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straight forward and easy to use, and familar to SPSS user. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. Of course you may specify JASP for advanced data editing and RapidMiner for advanced prediction modeling. DSTK is written in C#, Java and Python to interface with R, NLTK, and Weka. It can be expanded with plugins using R Scripts. We have also created plugins for more statistical functions, and Big Data Analytics with Microsoft Azure HDInsights (Spark Server) with Livy. License: R, RStudio, NLTK, SciPy, SKLearn, MatPlotLib, Weka, ... each has their own licenses.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    The DeDuplicator is an add-on module (plug-in) for the web crawler Heritrix. It offers a means to reduce the amount of duplicate data collected in a series of snapshot crawls.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    DirIndexFaker is a PHP script designed to produce fake apache directory listings for the purpose of slowing down, and overloading with false positives the web spiders used by the RIAA, MPAA, and other Copyright Cartel members.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Webhunter is a distributed, multi-threaded web crawler designed for both general indexing and crawling the web for focused content.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    FWebSpider is a web crawler application written on Perl. It performs chosen site crawl, featuring response cache, URL storage, URL exclusion rules and more. It is developed to function as a local/global site search engine core.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Ferret

    Ferret

    Declarative web scraping

    A web scraping system aiming to simplify data extraction from the web. ferret has a declarative query language that makes it easy to focus on the data that you need to get. ferret has the ability to scrape JS rendered pages, handle all page events, and emulate user interactions. the ferret was designed as a library from the ground up. it can be easily embedded into any Go application. ferret helps you to focus on the data you need using an easy-to-learn declarative language. ferret uses Chrome/Chromium via Chrome Devtools Protocol to handle dynamically rendered web pages. ferret is extremely extensible, and creating custom functions and types is super easy. ferret allows users to focus on the data. It abstracts away the technical details and complexity of underlying technologies using its own declarative language. It is extremely portable, extensible, and fast.
    Downloads: 0 This Week
    Last Update:
    See Project