Showing 129 open source projects for "spider"

  • 1
    Spider

    High-performance Rust web crawler and scraper for large-scale data

    Spider can operate concurrently across many pages, allowing it to gather large datasets in a short period of time. Spider also provides mechanisms for subscribing to crawl events so developers can process page data such as URLs, status codes, or HTML content as it is discovered. It supports advanced capabilities such as headless browser rendering, background crawling tasks, and configurable rules that control crawl depth or ignored paths.
    Downloads: 2 This Week
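
The crawl rules and event subscription described above can be illustrated with a small sketch. This is not Spider's Rust API; it is a plain Python model over a hypothetical in-memory site (`SITE`), with assumed names `crawl`, `max_depth`, `ignored_prefixes`, and `on_page`, showing how a depth limit, ignored paths, and a per-page callback might interact.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory "site": each path maps to the links found on that page.
SITE: Dict[str, List[str]] = {
    "/": ["/docs", "/blog", "/admin"],
    "/docs": ["/docs/install", "/"],
    "/blog": ["/blog/post-1"],
    "/admin": ["/admin/users"],
    "/docs/install": [],
    "/blog/post-1": [],
    "/admin/users": [],
}


def crawl(start: str, max_depth: int, ignored_prefixes: Tuple[str, ...],
          on_page: Callable[[str, int], None]) -> List[str]:
    """Breadth-first crawl that honours a depth limit and ignored path prefixes."""
    seen = {start}
    queue = deque([(start, 0)])
    visited: List[str] = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        on_page(url, depth)  # notify the subscriber as each page is discovered
        if depth == max_depth:
            continue  # do not follow links beyond the configured depth
        for link in SITE.get(url, []):
            if link in seen or link.startswith(ignored_prefixes):
                continue  # skip duplicates and ignored paths
            seen.add(link)
            queue.append((link, depth + 1))
    return visited


events: List[Tuple[str, int]] = []
pages = crawl("/", max_depth=1, ignored_prefixes=("/admin",),
              on_page=lambda url, depth: events.append((url, depth)))
```

Here the subscriber only records `(url, depth)` pairs; a real consumer would more likely inspect status codes or HTML as pages arrive.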
  • 2
    xhs-spider

    Desktop tool for collecting and exporting Xiaohongshu post data

    XHS-Spider is a desktop data collection tool designed to gather content and metadata from the Xiaohongshu platform. It provides a graphical interface that allows users to explore posts, collect information, and download media such as images and videos from individual notes or search results. It was developed primarily as a learning project to demonstrate approaches to building web crawlers and experimenting with technologies such as WebView2 and WPF UI.
    Downloads: 3 This Week
  • 3
    Python-Spider

    Python3 web crawler practice

    Python-Spider is a repository intended to teach or provide examples for writing web spiders / crawlers in Python — part of a broader learning and resource collection by its author. The code and documentation are oriented toward beginners or intermediate learners who want to learn how to fetch, parse, and extract data from websites programmatically.
    Downloads: 0 This Week
  • 4
    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages....
    Downloads: 2 This Week
  • 5
    EasySpider

    A visual no-code/code-free web crawler/spider

    A visual code-free/no-code web crawler/spider, supporting both Chinese and English.
    Downloads: 8 This Week
  • 6
    FEAPDER

    Powerful Python crawler framework for scalable web scraping tasks

    ...It focuses on providing a developer-friendly environment that makes it easier to create, run, and manage crawlers for a variety of data collection tasks. It includes several built-in spider types, such as AirSpider, Spider, TaskSpider, and BatchSpider, which address different crawling scenarios ranging from lightweight scraping to distributed and batch-based jobs. feapder supports features such as breakpoint resume, allowing crawlers to continue from where they stopped without losing progress. It also integrates monitoring and alerting capabilities to help developers track crawler performance and detect issues during execution. feapder includes browser rendering support for handling dynamic web pages and provides mechanisms for large-scale data deduplication during crawling.
    Downloads: 0 This Week
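
Breakpoint resume and deduplication, as mentioned above, can be sketched in a few lines. This is not feapder's API; it is a standard-library model in which the function `crawl`, the JSON state file, and the `stop_after` parameter (used to simulate an interruption) are all assumptions made for illustration.

```python
import json
import os
import tempfile
from typing import List, Optional


def crawl(urls: List[str], state_path: str,
          stop_after: Optional[int] = None) -> List[str]:
    """Process each URL once, checkpointing progress so a rerun can resume."""
    done: List[str] = []
    if os.path.exists(state_path):
        with open(state_path) as fh:
            done = json.load(fh)  # restore progress from the previous run
    seen = set(done)
    handled_now: List[str] = []
    for url in urls:
        if url in seen:
            continue  # deduplicate: skip repeats and already-finished URLs
        if stop_after is not None and len(handled_now) >= stop_after:
            break  # simulate the crawler being interrupted
        seen.add(url)
        done.append(url)
        handled_now.append(url)
        with open(state_path, "w") as fh:
            json.dump(done, fh)  # checkpoint after every page
    return handled_now


state = os.path.join(tempfile.mkdtemp(), "progress.json")
queue = ["u1", "u2", "u1", "u3", "u4"]
first_run = crawl(queue, state, stop_after=2)   # interrupted after two pages
second_run = crawl(queue, state)                # resumes with the remainder
```

An in-memory set is enough for a sketch; large-scale deduplication would typically need a more compact or shared store.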
  • 7
    Downloads: 16 This Week
  • 8
    Scrapy-Redis

    Redis-based components for Scrapy

    ...Version 0.3 changed the request serialization from marshal to cPickle, so requests persisted with version 0.2 will not work on 0.3. The class scrapy_redis.spiders.RedisSpider enables a spider to read URLs from Redis. The URLs in the Redis queue are processed one after another: if the first request yields more requests, the spider processes those requests before fetching another URL from Redis.
    Downloads: 0 This Week
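
The processing order described above can be modelled without Redis or Scrapy. The sketch below is an assumption-laden illustration, not scrapy_redis code: a `deque` stands in for the Redis list, and `run_spider`, `parse`, and `FOLLOW_UPS` are hypothetical names. It shows follow-up requests being drained before the next start URL is fetched.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, List


def run_spider(redis_queue: Deque[str],
               parse: Callable[[str], Iterable[str]]) -> List[str]:
    """Process start URLs one at a time, draining follow-up requests first."""
    processed: List[str] = []
    pending: Deque[str] = deque()  # requests yielded by parse callbacks
    while redis_queue or pending:
        if not pending:
            # Only fetch the next start URL once local requests are exhausted.
            pending.append(redis_queue.popleft())
        url = pending.popleft()
        processed.append(url)
        pending.extend(parse(url))
    return processed


# Hypothetical link structure: only the first start URL yields more requests.
FOLLOW_UPS: Dict[str, List[str]] = {"a": ["a/1", "a/2"], "a/1": ["a/1/x"]}

order = run_spider(deque(["a", "b"]), lambda url: FOLLOW_UPS.get(url, []))
```

In this model, `b` is fetched from the queue only after every request descending from `a` has been handled.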
  • 9
    Scrapling

    An adaptive Web Scraping framework

    ...The framework includes advanced fetchers capable of bypassing anti-bot protections such as Cloudflare Turnstile using stealth and browser automation techniques. Its powerful spider system supports multi-session crawling, pause and resume functionality, and real-time streaming of scraped data. Scrapling combines high performance, memory efficiency, and extensive async support to deliver blazing-fast scraping workflows. With a developer-friendly API, CLI tools, MCP server integration for AI-assisted extraction, and Docker support, it offers a complete solution for modern web scrapers.
    Downloads: 5 This Week
  • 10
    Grab Framework Project

    Web Scraping Framework

    ...The single request/response API lets you build a network request, perform it, and work with the received content. The API is built on top of the urllib3 and lxml libraries. The Spider API is for building asynchronous web crawlers. You write classes that define handlers for each type of network request. Each handler can spawn new network requests. Network requests are processed concurrently with a pool of asynchronous web sockets. Grab provides an interface called Spider for developing multithreaded website scrapers.
    Downloads: 0 This Week
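
The handler-per-request-type pattern described above can be sketched as follows. This is not Grab's actual Spider API: the `Task` and `MiniSpider` classes and the `handle_<name>` naming convention are hypothetical, and a thread pool stands in for the pool of asynchronous sockets. No network access is performed.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Task:
    name: str  # selects the handler method
    url: str


class MiniSpider:
    """Dispatches each task to handle_<name>; handlers may return new tasks."""

    def handle_index(self, task: Task) -> Iterable[Task]:
        # Pretend the index page links to two item pages.
        return [Task("item", f"{task.url}item/{i}") for i in (1, 2)]

    def handle_item(self, task: Task) -> Iterable[Task]:
        return []  # leaf pages spawn nothing further

    def _dispatch(self, task: Task) -> List[Task]:
        return list(getattr(self, f"handle_{task.name}")(task))

    def run(self, initial: List[Task], workers: int = 4) -> List[str]:
        done: List[str] = []
        batch = list(initial)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            while batch:
                # Process the current batch concurrently, then collect spawned tasks.
                results = list(pool.map(self._dispatch, batch))
                done.extend(t.url for t in batch)
                batch = [new for spawned in results for new in spawned]
        return done


urls = MiniSpider().run([Task("index", "https://example.com/")])
```

Processing in batches keeps the result order deterministic here; a production crawler would more likely feed spawned tasks back into a shared queue as they appear.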
  • 11
    DotnetSpider

    Lightweight .NET framework for fast web crawling and data scraping

    DotnetSpider is a web crawling and data extraction framework built on the .NET Standard platform. It is designed to help developers create efficient and scalable crawlers for collecting structured data from websites. It provides a high-level API that simplifies the process of defining spiders, managing requests, and extracting content from web pages. Developers can create custom spiders by extending base classes and configuring pipelines that handle downloading, parsing, and storing...
    Downloads: 0 This Week
  • 12
    Colly

    Elegant Scraper and Crawler Framework for Golang

    Colly provides a clean interface to write any kind of crawler/scraper/spider. With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving. Features: clean API; fast (>1k requests/sec on a single core); manages request delays and maximum concurrency per domain; automatic cookie and session handling; sync/async/parallel scraping.
    Downloads: 2 This Week
  • 13
    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    A free web spider and crawler that extracts information from the web by parsing millions of pages. Data is stored in a Derby database and is not lost if the spider is force-closed. The Files section includes WebCrawlerMySQL.jar, which supports MySQL connections. Features: free web spider, parser, extractor, and crawler; extraction of emails, phone numbers, and custom text from the web; export to Excel files; data saved to Derby and MySQL databases; written in Java, cross-platform. See also the free email sender: https://sourceforge.net/projects/gitst-free-email-ender/ To start the application, please install Microsoft OpenJDK: https://www.microsoft.com/openjdk
    Downloads: 13 This Week
  • 14

    pico-web-database

    Web spider/database/indexer system programmed in the Pico language

    Downloads: 1 This Week
  • 15
    Wurst Hacked Client

    Wurst Minecraft Hacked Client 1.21.10

    Wurst is a hacked client for Minecraft Java Edition. The mod lets you fly in survival mode, gain speed, and use X-ray, auto bridge, kill aura, and many more features. ⚠️ Use at your own risk: this repository is provided as-is, and using it may lead to a ban or other consequences depending on how and where it is used. You have been warned. This project is licensed under the MIT License.
    Downloads: 100 This Week
  • 16
    python-fxxk-spider

    Collection of 100+ Python web scraping projects and crawler examples

    python-fxxk-spider is a curated collection of Python web scraping and crawler projects gathered in a single repository for reference and learning. It aggregates many independent scraping examples that target a wide range of websites, online services, and public data sources. Instead of being a single crawler tool, it functions as a catalog of ready-made Python spider implementations that demonstrate different scraping techniques. python-fxxk-spider includes scrapers for social media, e-commerce platforms, job listings, music services, video platforms, and various content sites. ...
    Downloads: 3 This Week
  • 17
    Tkgames is a site for games written in the powerful Tcl/Tk language. These include my original tessellation puzzle Polypuzzle, and the recent additions Hearts, Spider, Yahtzee, and the too-addictive Tktk.
    Downloads: 10 This Week
  • 18

    ahCrawler

    A PHP search engine for your website and web analytics tool. GNU GPL3

    ...The spider is a CLI tool and must be added as a cron job. A web-based backend lets you control and analyze all of your data, and you can handle multiple websites in the same backend. Requires PHP 7 or 8 with PDO (MySQL/SQLite).
    Downloads: 4 This Week
  • 19

    solitaire

    Elegant Solitaire with custom card decks, smooth animations, and keybo

    The Most Over-Engineered Solitaire Game You'll Ever Love is a meticulously crafted card game that goes beyond the basics. Play in Draw-One or Draw-Three modes with full keyboard support for mouseless gaming. Enjoy smooth drag-and-drop mechanics and auto-complete detection that knows when you're headed for victory. Customize your experience with different card backs or load entirely custom deck designs via ZIP files - play with cats, dinosaurs, or whatever themed cards you prefer! The game...
    Downloads: 13 This Week
  • 20
    XM Solitaire
    200 card games for Windows (Freecell, Klondike, Fan, Spider, Pyramid, Gaps, ...). Game layouts and rules are declared in XML format. Users can add their own cards and background images.
    Downloads: 62 This Week
  • 21
    Crawlab

    Distributed web crawler admin platform for spiders management

    ...Tasks are scheduled by the task scheduler module in the master node, and received by the task handler module in worker nodes, which executes these tasks in task runners. Task runners are actually processes running spider or crawler programs, and can also send data through gRPC (integrated in SDK) to other data sources, e.g. MongoDB.
    Downloads: 1 This Week
  • 22
    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider is a distributed Perl web crawler project from 2006. It features code for crawling webpages, distributing the results to a server, and generating XML files from them. The client side can be any computer (Windows or Linux), and the server stores all data. Websites that use EasySpider crawling for article-writing software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://easyperlspider.sourceforge.io/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/opensource/ It is fun to look at code written a few years ago and see how one has improved. ...
    Downloads: 0 This Week
  • 23
    crawly

    High-level web crawling and scraping framework for Elixir apps

    Crawly is a high-level application framework for crawling websites and extracting structured data using the Elixir programming language. It provides a complete environment for building web crawlers that systematically visit pages, collect information, and transform that data into structured formats for further processing. Crawly is designed for tasks such as data mining, information processing, and building historical archives of web content. Crawly follows the Elixir and OTP architecture...
    Downloads: 0 This Week
  • 24
    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    A free web spider and crawler that extracts information from the web by parsing millions of pages. Data is stored in a Derby or MySQL database and is not lost if the spider is force-closed. The Files section includes WebCrawlerMySQL.jar, which supports MySQL connections. The latest version is available at https://sourceforge.net/projects/web-spider-web-crawler-extract/ Features: free web spider, parser, extractor, and crawler; extraction of emails, phone numbers, and custom text from the web; export to Excel files; data saved to a Derby database; written in Java, cross-platform. See also the free email sender: https://sourceforge.net/projects/gitst-free-email-ender/ To start the application, please install Microsoft OpenJDK: https://www.microsoft.com/openjdk
    Downloads: 0 This Week
  • 25
    A suite of games written in C++. Games include spider, landlord, solidate, and so on. If you have any requirements, please leave a message at http://groups.google.com/group/myopensoft
    Downloads: 0 This Week