Showing 22 open source projects for "web crawler"

  • 1
    Gerapy

    Distributed Crawler Management Framework Based on Scrapy

    Distributed crawler management framework based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js. Anyone who has written crawlers in Python has probably used Scrapy: it is a very powerful crawler framework with high crawling efficiency and good scalability, and practically a must-have tool for developing crawlers in Python. If you use Scrapy, you can of course crawl from your own host, but when the crawl is very large, we can’t run... A minimal sketch of the kind of Scrapy spider such a platform manages follows this entry.
    Downloads: 0 This Week
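    The spiders Gerapy deploys through Scrapyd are ordinary Scrapy code. A minimal sketch, using Scrapy's public demo site; the CSS selectors are specific to that site and nothing here is Gerapy's own code:

        import scrapy

        class QuotesSpider(scrapy.Spider):
            """Toy spider; quotes.toscrape.com is a public Scrapy demo site."""
            name = "quotes"
            start_urls = ["https://quotes.toscrape.com/"]

            def parse(self, response):
                # Emit one item per quote block on the page.
                for quote in response.css("div.quote"):
                    yield {
                        "text": quote.css("span.text::text").get(),
                        "author": quote.css("small.author::text").get(),
                    }
                # Follow the pagination link until the site runs out of pages.
                next_page = response.css("li.next a::attr(href)").get()
                if next_page is not None:
                    yield response.follow(next_page, callback=self.parse)

    Standalone, this runs with "scrapy runspider quotes_spider.py -o quotes.json"; under Gerapy the same code would live in a Scrapy project deployed to a Scrapyd node.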
  • 2
    TorBot

    Dark Web OSINT Tool

    Contributions to this project are always welcome. To add a new feature, fork the dev branch and open a pull request when your new feature is tested and complete. If it's a new module, it should be put inside the modules directory. The branch name should be your new feature name in the format <Feature_featurename_version(optional)>. On Linux platforms, you can build an executable for TorBot by using the install.sh script. You will need to give the script the correct permissions using chmod +x...
    Downloads: 3 This Week
  • 3
    Crawlab

    Distributed web crawler admin platform for spiders management

    Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java and PHP, and various web crawler frameworks including Scrapy, Puppeteer and Selenium. Use docker-compose for a one-click start-up; that way you don't even have to configure the MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS and worker nodes. Master node and worker nodes communicate...
    Downloads: 0 This Week
  • 4
    ReconSpider

    Most Advanced Open Source Intelligence (OSINT) Framework

    ... the capabilities of Wave, Photon and Recon Dog to do a comprehensive enumeration of attack surfaces. Reconnaissance is a mission to obtain information, by various detection methods, about the activities and resources of an enemy or potential enemy, or the geographic characteristics of a particular area. A web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web, typically for the purpose of web indexing (web spidering). A toy crawler illustrating that idea follows this entry.
    Downloads: 6 This Week
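    The "systematically browses" part of that definition is essentially a breadth-first traversal of links. A standard-library-only sketch (not ReconSpider's code) to illustrate:

        from collections import deque
        from html.parser import HTMLParser
        from urllib.parse import urljoin, urlparse
        from urllib.request import urlopen

        class LinkParser(HTMLParser):
            """Collects the href value of every <a> tag."""
            def __init__(self):
                super().__init__()
                self.links = []

            def handle_starttag(self, tag, attrs):
                if tag == "a":
                    for name, value in attrs:
                        if name == "href" and value:
                            self.links.append(value)

        def crawl(seed, max_pages=20):
            """Breadth-first crawl, restricted to the seed URL's host."""
            host = urlparse(seed).netloc
            seen, queue, fetched = {seed}, deque([seed]), 0
            while queue and fetched < max_pages:
                url = queue.popleft()
                try:
                    html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
                except (OSError, ValueError):
                    continue  # unreachable or malformed URL; move on
                fetched += 1
                parser = LinkParser()
                parser.feed(html)
                for link in parser.links:
                    absolute = urljoin(url, link)
                    if urlparse(absolute).netloc == host and absolute not in seen:
                        seen.add(absolute)  # enqueue each URL at most once
                        queue.append(absolute)
            return seen

    A real crawler would additionally honor robots.txt, throttle its requests and persist what it finds.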
  • 5
    BotSlayer

    BotSlayer Community Edition

    BotSlayer is an application that helps track and detect potential manipulation of information spreading on Twitter. The tool is developed by the Observatory on Social Media at Indiana University, the same lab that brought you Botometer and Hoaxy. BotSlayer is not a tool to detect and remove likely social bots from your list of Twitter followers or friends; for that purpose, check out Botometer. If you just want to visualize the spread of some piece of information, consider Hoaxy....
    Downloads: 0 This Week
  • 6
    ShadowSocksShare

    Python ShadowSocks framework

    This project crawls ss(r) account-sharing websites for shared ss(r) accounts, then redistributes the accounts and generates a subscription link after parsing them and verifying their connectivity. Since Google Plus was closed on April 2, 2019, almost all of the available accounts crawled before that came from Google Plus. So if you are building your own site, please keep an eye on the updates of this project and redeploy using the latest source code.
    Downloads: 0 This Week
  • 7
    pyspider

    A powerful Spider(Web Crawler) system in Python

    pyspider is a powerful spider (web crawler) system in Python. Components are connected by a message queue, and every component, including the message queue, runs in its own process or thread and is replaceable. That means that when one stage is slow, you can run many instances of the processor and make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast. Since pyspider has various components, you can just run pyspider to start a standalone... A handler in the flavor of pyspider's quickstart is sketched after this entry.
    Downloads: 1 This Week
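    A handler in the flavor of pyspider's documented quickstart looks like the following; the seed URL and the re-crawl/freshness intervals are arbitrary example values.

        from pyspider.libs.base_handler import *

        class Handler(BaseHandler):
            crawl_config = {}

            @every(minutes=24 * 60)          # re-run the seed once a day
            def on_start(self):
                self.crawl('http://scrapy.org/', callback=self.index_page)

            @config(age=10 * 24 * 60 * 60)   # treat fetched pages as fresh for 10 days
            def index_page(self, response):
                # Queue every outbound absolute link for a detail fetch.
                for each in response.doc('a[href^="http"]').items():
                    self.crawl(each.attr.href, callback=self.detail_page)

            def detail_page(self, response):
                return {"url": response.url, "title": response.doc('title').text()}

    Running pyspider and opening its web UI lets you edit, debug and schedule such handlers per project.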
  • 8
    diskover

    File system crawler and disk space usage software

    diskover is file system crawler and disk space usage software that uses Elasticsearch to index your file metadata. diskover crawls and indexes your files on a local computer or on a remote storage server over network mounts. diskover helps manage your storage by identifying old and unused files and gives better insight into data change ("hotfiles"), file duplication ("dupes") and wasted space. It is designed to help deal with managing large amounts of data growth and provide detailed storage... A sketch of the underlying idea, indexing file metadata into Elasticsearch, follows this entry.
    Downloads: 0 This Week
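    diskover's own implementation aside, the core technique, walking a tree and indexing per-file metadata into Elasticsearch, can be sketched in a few lines. This assumes a local Elasticsearch at localhost:9200 and the 8.x Python client; the index name "files" is arbitrary.

        import os
        from elasticsearch import Elasticsearch  # pip install elasticsearch

        es = Elasticsearch("http://localhost:9200")  # assumed local instance

        def index_tree(root, index="files"):
            """Walk a directory tree and index basic metadata for each file."""
            for dirpath, _dirnames, filenames in os.walk(root):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    try:
                        st = os.stat(path)
                    except OSError:
                        continue  # skip unreadable or vanished files
                    es.index(index=index, document={
                        "path": path,
                        "size_bytes": st.st_size,
                        "mtime": st.st_mtime,  # last modified, epoch seconds
                        "atime": st.st_atime,  # last accessed, epoch seconds
                    })

    Questions like "files not accessed in a year, sorted by size" then become ordinary Elasticsearch queries against that index.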
  • 9

    sitecheck

    Modular web site spider for web developers.

    More than just a link checker, sitecheck is a website spider (also known as a crawler) that can assist with SEO by testing an entire site, plus both inbound links from search engines and outbound links to other sites, for the following issues: looping redirects (HTTP 301/302), broken links (HTTP 404), server errors (HTTP 500), spelling mistakes, low readability scores (using the Flesch Reading Ease test), missing/empty/duplicate meta tags, duplicate content, slow page speed, W3C validation... A minimal status probe illustrating the broken-link and server-error checks follows this entry.
    Downloads: 0 This Week
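    As an illustration of just the broken-link (404) and server-error (500) checks, not sitecheck's own code, a standard-library-only status probe:

        from urllib.error import HTTPError, URLError
        from urllib.request import Request, urlopen

        def check_links(urls):
            """Print the HTTP status of each URL; 404 means a broken link."""
            for url in urls:
                req = Request(url, method="HEAD")  # HEAD skips the body
                try:
                    status = urlopen(req, timeout=5).status
                except HTTPError as err:
                    status = err.code              # e.g. 404 or 500
                except URLError as err:
                    status = f"unreachable ({err.reason})"
                print(status, url)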
  • 10

    Domain Analyzer Security Tool

    Finds all the security information for a given domain name

    Domain analyzer is a security analysis tool which automatically discovers and reports information about the given domain. Its main purpose is to analyze domains in an unattended way.
    Downloads: 6 This Week
  • 11

    SauceWalk Proxy Helper

    Enumeration and automation of file discovery for your sec tools.

    SauceWalk is a freeware (.exe) / open source (.py) tool for aiding the enumeration of web application structure. It consists of two parts: a local executable (walk.exe) and a remote agent. walk.exe iterates through the local files and folders of your target web application (for example, a local copy of WordPress) and generates requests via your favourite proxy (for example, Burp Suite) against a given target URL. The remote agent can be used to identify target files and folders on a live system...
    Downloads: 0 This Week
  • 12

    Web Crawler Security Tool

    A web crawler oriented to information security.

    Last updated on Tue Mar 26 16:25 UTC 2012. The Web Crawler Security Tool is a Python-based tool to automatically crawl a web site. It is a web crawler oriented to helping in penetration testing tasks. The main task of this tool is to search for and list all the links (pages and files) in a web site. The crawler was completely rewritten in v1.0, bringing a lot of improvements: improved data visualization, an interactive option to download files, increased crawling speed, export of the list of found...
    Downloads: 0 This Week
  • 13

    Python Crawler Library

    Python Web Crawler Library

    A simple library for crawling the web. This library gives you the ability to create macros for crawling web sites and performing simple actions, such as logging in, on those sites. A sketch of such a log-in macro follows this entry.
    Downloads: 0 This Week
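    The "log in" macro idea, sketched here with the requests package rather than this library's own API; the /login path and the form field names are hypothetical and must match the real form.

        import requests  # pip install requests

        def login_and_fetch(base_url, username, password):
            """Log in via a form POST, then reuse the session's cookies."""
            session = requests.Session()
            # Hypothetical endpoint and field names; inspect the target form.
            session.post(f"{base_url}/login",
                         data={"username": username, "password": password})
            # Later requests on the same session carry the auth cookie.
            return session.get(f"{base_url}/dashboard").text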
  • 14
    elk is a powerful open source, Python-based command-line web crawler that can recursively search for files and text on websites.
    Downloads: 0 This Week
  • 15

    Monkey-Spider

    Moved to https://github.com/aikinci/monkeyspider

    The Monkey-Spider is a crawler-based low-interaction honeyclient project. It is not restricted to this use, but that is what it is developed as. The Monkey-Spider crawls web sites to expose their threats to web clients.
    Downloads: 0 This Week
  • 16
    FTP crawler is designed to provide an easy web interface for searching files on FTP servers, together with a crawler to index the files on those servers.
    Downloads: 0 This Week
  • 17
    Universal information crawler (Uicrawler) is a fast, precise and reliable Internet crawler: a program/automated script that browses the World Wide Web in a methodical, automated manner and builds an index of the documents it accesses.
    Downloads: 0 This Week
  • 18
    zSearch is a simple Python-based crawler and search engine. Raw HTML is stored in bzip2 archives, the index is created using PyLucene, and Twisted is used to provide an internal HTTP server. Results are sent back as XML over HTTP. A sketch of the bzip2 storage idea follows this entry.
    Downloads: 0 This Week
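    The bzip2 storage side is easy to sketch with the standard library (illustrative only, not zSearch's code; the record format is an assumption):

        import bz2

        def store_page(url, html, archive_path="pages.bz2"):
            """Append one crawled page to a bzip2-compressed text archive."""
            record = f"{url}\n{html}\n"
            with bz2.open(archive_path, "at", encoding="utf-8") as archive:
                archive.write(record)

    Appending produces a multi-stream bzip2 file, which Python's bz2 module reads back transparently.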
  • 19
    A configurable knowledge management framework. It works out of the box, but it is meant mainly as a framework for building complex information retrieval and analysis systems. The three major components, Crawler, Analyzer and Indexer, can also be used separately.
    Downloads: 0 This Week
  • 20
    Nomad is a tiny but efficient search engine and web crawler. It works very well for searching within a set of corporate websites on the Internet, and/or an intranet's HTML documents or knowledge repositories.
    Downloads: 0 This Week
  • 21
    PySMBSearch is a crawler and search engine for SMB shares. It consists of a crawler script, which creates an index and stores it in an SQL database, and a CGI script that can be used to run queries against the database.
    Downloads: 0 This Week
  • 22
    Webhunter is a distributed, multi-threaded web crawler designed for both general indexing and crawling the web for focused content.
    Downloads: 0 This Week