spider free download - SourceForge

Showing 97 open source projects for "spider"

View related business solutions

Mac Clear Filters & Widen Search

Enterprise-grade ITSM, for every business
Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.

Try it Free
Try Google Cloud Risk-Free With $300 in Credit
No hidden charges. No surprise bills. Cancel anytime.

Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.

Start Free
1

Spider

High-performance Rust web crawler and scraper for large-scale data

Spider can operate concurrently across many pages, allowing it to gather large datasets in a short period of time. Spider also provides mechanisms for subscribing to crawl events so developers can process page data such as URLs, status codes, or HTML content as it is discovered. It supports advanced capabilities such as headless browser rendering, background crawling tasks, and configurable rules that control crawl depth or ignored paths.

Downloads: 2 This Week

Last Update: 2026-03-31
See Project
2

spider_collection

Collection of Python web scraping scripts for data extraction tasks

spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages....

Downloads: 2 This Week

Last Update: 22 hours ago
See Project
3

Python-Spider

Python3 web crawler practice

Python-Spider is a repository intended to teach or provide examples for writing web spiders / crawlers in Python — part of a broader learning and resource collection by its author. The code and documentation are oriented toward beginners or intermediate learners who want to learn how to fetch, parse, and extract data from websites programmatically.

Downloads: 0 This Week

Last Update: 2025-12-08
See Project
4

EasySpider

A visual no-code/code-free web crawler/spider

A visual code-free/no-code web crawler/spider, supporting both Chinese and English.

Downloads: 8 This Week

Last Update: 2025-01-01
See Project
MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
5

FEAPDER

Powerful Python crawler framework for scalable web scraping tasks

...It focuses on providing a developer-friendly environment that makes it easier to create, run, and manage crawlers for a variety of data collection tasks. It includes several built-in spider types, such as AirSpider, Spider, TaskSpider, and BatchSpider, which address different crawling scenarios ranging from lightweight scraping to distributed and batch-based jobs. feapder supports features such as breakpoint resume, allowing crawlers to continue from where they stopped without losing progress. It also integrates monitoring and alerting capabilities to help developers track crawler performance and detect issues during execution. feapder includes browser rendering support for handling dynamic web pages and provides mechanisms for large-scale data deduplication during crawling.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
6

Spider-Team

Downloads: 16 This Week

Last Update: 2025-11-26
See Project
7

Scrapy-Redis

Redis-based components for Scrapy

...Version 0.3 changed the requests serialization from marshal to cPickle, therefore persisted requests using version 0.2 will not able to work on 0.3. The class scrapy_redis.spiders.RedisSpider enables a spider to read the urls from redis. The urls in the redis queue will be processed one after another, if the first request yields more requests, the spider will process those requests before fetching another url from redis.

Downloads: 0 This Week

Last Update: 2024-07-06
See Project
8

Scrapling

An adaptive Web Scraping framework

...The framework includes advanced fetchers capable of bypassing anti-bot protections such as Cloudflare Turnstile using stealth and browser automation techniques. Its powerful spider system supports multi-session crawling, pause and resume functionality, and real-time streaming of scraped data. Scrapling combines high performance, memory efficiency, and extensive async support to deliver blazing-fast scraping workflows. With a developer-friendly API, CLI tools, MCP server integration for AI-assisted extraction, and Docker support, it offers a complete solution for modern web scrapers.

Downloads: 5 This Week

Last Update: 2026-04-17
See Project
9

DotnetSpider

Lightweight .NET framework for fast web crawling and data scraping

DotnetSpider is a web crawling and data extraction framework built on the .NET Standard platform. It is designed to help developers create efficient and scalable crawlers for collecting structured data from websites. It provides a high-level API that simplifies the process of defining spiders, managing requests, and extracting content from web pages. Developers can create custom spiders by extending base classes and configuring pipelines that handle downloading, parsing, and storing...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
Go From AI Idea to AI App Fast
One platform to build, fine-tune, and deploy ML models. No MLOps team required.

Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.

Try Free
10

Colly

Elegant Scraper and Crawler Framework for Golang

Colly provides a clean interface to write any kind of crawler/scraper/spider. With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving. Clean API. Fast (>1k request/sec on a single core) Manages request delays and maximum concurrency per domain. Automatic cookie and session handling. Sync/async/parallel scraping.

Downloads: 2 This Week

Last Update: 2025-03-27
See Project
11

Web Spider, Web Crawler, Email Extractor

Free Extracts Emails, Phones and custom text from Web using JAVA Regex

In Files there is WebCrawlerMySQL.jar which supports MySql Connection Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby and MySQL Database - Written in Java Cross Platform Also See Free email Sender : https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application https://www.microsoft.com/openjdk

Downloads: 13 This Week

Last Update: 2025-11-23
See Project
12

pico-web-database

Web spider/database/indexer system programmed in the Pico language

Downloads: 1 This Week

Last Update: 2025-05-28
See Project
13

python-fxxk-spider

Collection of 100+ Python web scraping projects and crawler examples

python-fxxk-spider is a curated collection of Python web scraping and crawler projects gathered in a single repository for reference and learning. It aggregates many independent scraping examples that target a wide range of websites, online services, and public data sources. Instead of being a single crawler tool, it functions as a catalog of ready-made Python spider implementations that demonstrate different scraping techniques. python-fxxk-spider includes scrapers for social media, e-commerce platforms, job listings, music services, video platforms, and various content sites. ...

Downloads: 3 This Week

Last Update: 22 hours ago
See Project
14

Tk Games

Tkgames is a site for games written using the powerful tcl/tk language. These include my original tesselation puzzle Polypuzzle, and recent additions Hearts, Spider, Yahtzee and the tooo addictive, Tktk.

Downloads: 10 This Week

Last Update: 2024-11-15
See Project
15

Crawlab

Distributed web crawler admin platform for spiders management

...Tasks are scheduled by the task scheduler module in the master node, and received by the task handler module in worker nodes, which executes these tasks in task runners. Task runners are actually processes running spider or crawler programs, and can also send data through gRPC (integrated in SDK) to other data sources, e.g. MongoDB.

Downloads: 1 This Week

Last Update: 2023-07-26
See Project
16

Easyspider - Distributed Web Crawler

Easy Spider is a distributed Perl Web Crawler Project from 2006

Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code from crawling webpages, distributing it to a server and generating xml files from it. The client site can be any computer (Windows or Linux) and the Server stores all data. Websites that use EasySpider Crawling for Article Writing Software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://easyperlspider.sourceforge.io/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/opensource/ It is fun to look at some code that is few years ago and to see how one has improved himself. ...

1 Review

Downloads: 0 This Week

Last Update: 2025-03-16
See Project
17

crawly

High-level web crawling and scraping framework for Elixir apps

Crawly is a high-level application framework for crawling websites and extracting structured data using the Elixir programming language. It provides a complete environment for building web crawlers that systematically visit pages, collect information, and transform that data into structured formats for further processing. Crawly is designed for tasks such as data mining, information processing, and building historical archives of web content. Crawly follows the Elixir and OTP architecture...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
18

Web Spider, Web Crawler, Email Extractor

Free Extracts Emails, Phones and custom text from Web using JAVA Regex

In Files there is WebCrawlerMySQL.jar which supports MySql Connection Please follow this link to get latest version https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby OR MySQL Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby Database - Written in Java Cross Platform See also Free Email Sender in this link: https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application https://www.microsoft.com/openjdk

3 Reviews

Downloads: 0 This Week

Last Update: 2022-12-24
See Project
19

rubywebcrawler

web spider software written in ruby

Downloads: 0 This Week

Last Update: 2021-12-24
See Project
20

ReconSpider

Most Advanced Open Source Intelligence (OSINT) Framework

...ReconSpider can be used by Infosec Researchers, Penetration Testers, Bug Hunters, and Cyber Crime Investigators to find deep information about their target. ReconSpider aggregate all the raw data, visualize it on a dashboard, and facilitate alerting and monitoring on the data. Recon Spider also combines the capabilities of Wave, Photon and Recon Dog to do a comprehensive enumeration of attack surfaces. Reconnaissance is a mission to obtain information by various detection methods, about the activities and resources of an enemy or potential enemy, or geographic characteristics of a particular area. A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web, typically for the purpose of Web indexing (web spidering).

Downloads: 2 This Week

Last Update: 2022-11-25
See Project
21

GoSpider

Gospider - Fast web spider written in Go

GoSpider - Fast web spider written in Go. Fast web crawling. Brute force and parse sitemap.xml. Parse robots.txt. Generate and verify link from JavaScript files. Link Finder. Find AWS-S3 from response source. Find subdomains from the response source. Get URLs from Wayback Machine, Common Crawl, Virus Total, Alien Vault. Format output easy to Grep. Support Burp input.

Downloads: 1 This Week

Last Update: 2023-01-27
See Project
22

ruia

Async Python framework for fast and flexible web scraping spiders

...Ruia is powered by Python’s asyncio library along with aiohttp, enabling developers to perform concurrent network requests efficiently and scrape data from websites with minimal overhead. Ruia follows a “write less, run faster” philosophy, emphasizing concise code and streamlined spider development. It provides a structured approach to building scraping projects through components such as data items, spiders, middleware, and plugins. Developers can define structured fields to extract information from HTML content and process responses asynchronously to improve crawling performance. It also supports middleware and plugin systems that allow customization of request handling, response processing, and additional functionality.

Downloads: 2 This Week

Last Update: 2026-03-11
See Project
23

yari

YARI is a comprehensive tool suite to debug (layouts), spy, spider, inspect and navigate SWT and Eclipse based application GUIs (Workbench or RCP) at runtime.

1 Review

Downloads: 0 This Week

Last Update: 2020-05-07
See Project
24

Photon

Incredibly fast crawler designed for OSINT

...Despite its speed focus, the tool still provides useful filtering and extraction capabilities for analysts who need structured results. Overall, Photon functions as a lightweight yet powerful reconnaissance spider for web intelligence gathering.

Downloads: 2 This Week

Last Update: 2026-03-03
See Project
25

GOPA

GOPA, a spider written in Golang, for Elasticsearch

GOPA, a spider written in Golang, for Elasticsearch. Lightweight, low footprint, memory requirement should, be 100MB. Easy to deploy, no runtime or dependency required. Easy to use, no programming or script ability needed, out-of-box features. First of all, get it, two opinions: download the pre-built package or compile it yourself. Besides Elasticsearch, Gopa doesn't require any other dependencies, just simply run .

Downloads: 0 This Week

Last Update: 2023-04-12
See Project