Showing 24 open source projects for "crawler url"

  • 1
    SiteOne Crawler

    SiteOne Crawler is a website analyzer and exporter

    SiteOne Crawler is a very useful and easy-to-use tool you'll ♥ as a Dev/DevOps engineer, website owner, or consultant. It works on all popular platforms: Windows, macOS, and Linux (x64 and arm64). It will crawl your entire website in depth, analyze and report problems, show useful statistics and reports, generate an offline version of the website, generate sitemaps, or send reports via email. Watch a detailed video with a sample report for the Astro.build website. This crawler can be used as a command...
    Downloads: 0 This Week
  • 2
    WebMagic

    A scalable web crawler framework for Java

    WebMagic is a scalable crawler framework. It covers the whole lifecycle of a crawler: downloading, URL management, content extraction, and persistence. It can simplify the development of a specific crawler, and you can develop a crawler easily based on it. WebMagic has a simple core with high flexibility and a simple API for HTML extraction. It also provides annotation-based configuration with POJOs to customize a crawler, so no configuration files are needed. Some other features...
    Downloads: 0 This Week
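The lifecycle described above (download, extract content, manage URLs, persist) is framework-independent. As a rough illustration only, here is that pipeline sketched in Python with the standard library; the names (`LinkExtractor`, `process_page`) are ours, not WebMagic's Java API:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Content-extraction stage: collect absolute link targets from <a href>."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

def process_page(url, html, store):
    """Persistence + URL-management stages: record something about the page,
    then return the outgoing links so a scheduler can enqueue them."""
    extractor = LinkExtractor(url)
    extractor.feed(html)
    store[url] = len(html)   # illustrative "persistence": record page size
    return extractor.links
```

A real downloader stage would sit in front of `process_page` and fetch the HTML over HTTP; here the HTML is passed in directly to keep the sketch self-contained.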
  • 3
    crwlr

    Library for Rapid (Web) Crawler and Scraper Development

    ... if it just loaded all the links it finds (and is allowed to load according to the robots.txt file), it would load the whole internet (provided the URL(s) it starts with are not a dead end). Instead, it can be restricted to load only links matching certain criteria (on the same domain/host, URL path starts with "/foo", ...) or only to a certain depth. A depth of 3 means three levels deep: links found on the initial URLs provided to the crawler are level 1, and so on.
    Downloads: 0 This Week
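The depth rule described above (links on the start URLs are level 1, their links level 2, and so on) plus a same-host restriction can be sketched language-neutrally. This is a hypothetical Python illustration, not crwlr's PHP API; `get_links` is a stand-in for fetching and parsing a page:

```python
from urllib.parse import urlparse

def crawl(start_urls, get_links, max_depth=3):
    """Depth-limited, same-host crawl. get_links(url) returns the absolute
    URLs found on a page (a real version would download and parse it).
    Links on the start URLs are depth 1, their links depth 2, and so on."""
    hosts = {urlparse(u).netloc for u in start_urls}
    seen = set(start_urls)
    frontier = list(start_urls)
    for _depth in range(1, max_depth + 1):
        next_frontier = []
        for url in frontier:
            for link in get_links(url):
                # stay on the start hosts and never revisit a URL
                if link not in seen and urlparse(link).netloc in hosts:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier   # next level of the crawl
    return seen
```

A production crawler would also consult robots.txt (e.g. via `urllib.robotparser`) before fetching each URL; that check is omitted here for brevity.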
  • 4
    Abdal Web Traffic Generator

    create useful statistics and traffic on your site

    This tool generates traffic and useful visit statistics for your site, which can help improve your ranking on services such as Alexa.
    Downloads: 9 This Week
  • 5

    Universal Proxy Software 2.0

    Universal Proxy Software 2.0 is the most advanced proxy software.

    Universal SEO Software 2.0 is back with new features and bug fixes. Features: Proxy Grabber, Proxy Checker, Proxy (IP) Changer, Mega Proxy Grabber, Mega Proxy Checker, Auto Proxy Changer, Proxy Editor, Proxy Leecher, Proxy Scraper, Mega Proxy Editor, Mega Proxy Leecher, Mega Proxy Scraper, Proxy Combiner, Proxy Lookup, Text Proxy Leecher, Mega Proxy Combiner, Mega Proxy Lookup, Mega Text Proxy Leecher, Mor Crawler, Proxy Viewer, Proxy URL Grabber. Proxy Grabber: Proxy Grabber is one...
    Downloads: 6 This Week
  • 6
    ShadowSocksShare

    Python ShadowSocks framework

    This project crawls shared ss(r) accounts from ss(r) sharing websites, redistributes the accounts, and generates a subscription link after parsing the accounts and verifying their connectivity. Since Google Plus was closed on April 2, 2019, and almost all previously crawled accounts came from Google Plus, if you are running your own instance, keep an eye on updates to this project and redeploy with the latest source code.
    Downloads: 0 This Week
  • 7
    crawler4j

    Open source web crawler for Java

    crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can set up a multi-threaded web crawler in a few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles the downloaded page. The shouldVisit function decides whether a given URL should be crawled or not. In the example, the crawler skips .css, .js, and media files and only allows pages within...
    Downloads: 1 This Week
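crawler4j itself is Java, but the shouldVisit idea (skip stylesheets, scripts, and media; stay within an allowed site) translates directly to other languages. A minimal sketch in Python, where the extension list and prefix are illustrative assumptions, not crawler4j's actual defaults:

```python
import re

# Illustrative asset/media filter; crawler4j's real example list may differ.
SKIP = re.compile(r".*\.(css|js|gif|jpe?g|png|mp3|mp4|zip|gz)$", re.IGNORECASE)

def should_visit(url, allowed_prefix="https://www.example.com/"):
    """Python analogue of crawler4j's shouldVisit: reject asset/media URLs
    and anything outside the allowed site prefix."""
    return SKIP.match(url) is None and url.startswith(allowed_prefix)
```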
  • 8
    Catberry

    Catberry

    Catberry is an isomorphic framework

    ... dependencies and create plugins, and Flux for the data layer. Search crawlers receive a full page from the server. The whole state of the application is restored from the URL. Server-side progressive rendering is based on Node.js streams, with parallel rendering of components in the browser. The framework is well tested (code coverage is about 90%) and is already used in production.
    Downloads: 0 This Week
  • 9

    frsi

    Fast Remote SVN Info

    A fast remote-repository information tool for Subversion. Need recursive svn info and the log for each file (with only relevant changed-paths), along with any svn:externals properties, quickly and all in a single XML output? frsi info -R --log file-relevant --propget svn:externals --xml <URL> (The first run with the --log option will be slow as it needs to cache the entire repository log.) Supports the standard SVN authentication options. Windows Users: This tool requires...
    Downloads: 0 This Week
  • 10
    go_spider

    An awesome Go concurrent Crawler(spider) framework

    An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can easily be extended into an individualized crawler, or you can use just the default crawl components. The Spider gets a Request from the Scheduler containing a URL to be crawled. The Downloader then downloads the result (HTML, JSON, JSONP, or text) of the Request. The result is saved in a Page for parsing in the PageProcesser. HTML parsing is based on the goquery package; JSON parsing is based on a simple JSON package. Jsonp...
    Downloads: 0 This Week
  • 11
    ... Fuzzer 6) Web Scanner: RFI/LFI URL Scanner, Web Extractor, Open Port Scanner, URL Crawler, SQLi Scanner
    Downloads: 7 This Week
  • 12

    A simple Crawler

    We can make a simple crawler using Java Servlets & JSP. A crawl

    ... - HelloResult.class - Bfs.class - Queue.class - WebSource.class - [hw5] - [WEB-INF] - [classes] - [mvc]
    - index.html (first page for the crawler)
    - web.xml (the configuration of all servlets)
    - HelloController.java (processes the HTTP request and response)
    - HelloModel.java (main processing: crawling and URL matching)
    - HelloView.java (shows the results of crawling and searching)
    - HelloResult.java (shows the search result)
    Downloads: 0 This Week
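The Bfs and Queue classes listed above suggest the standard breadth-first crawl structure: a FIFO queue of URLs to visit plus a visited set. A minimal, framework-free sketch (in Python rather than the project's Java, with `get_links` standing in for fetching and parsing a page):

```python
from collections import deque

def bfs_crawl(start, get_links, limit=100):
    """Breadth-first crawl: FIFO queue plus a visited set. get_links(url)
    stands in for fetching a page and extracting its links. Returns the
    order in which pages were visited, stopping after `limit` pages."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < limit:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in visited:   # enqueue each URL at most once
                visited.add(link)
                queue.append(link)
    return order
```

Because the queue is FIFO, pages closer to the start URL are always visited before deeper ones, which is why BFS is the usual choice for crawlers.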
  • 13

    SauceWalk Proxy Helper

    Enumeration and automation of file discovery for your sec tools.

    SauceWalk is a freeware (.exe) / open source (.py) tool for aiding in the enumeration of web application structure. It consists of two parts: a local executable (walk.exe) and a remote agent. Walk.exe iterates through the local files and folders of your target web application (for example, a local copy of WordPress) and generates requests via your favourite proxy (for example, Burp Suite) against a given target URL. The remote agent can be used to identify target files and folders on a live system...
    Downloads: 0 This Week
  • 14
    TumblOne

    TumblOne is a Tumblr Blog Image Crawler

    With this little tool you can crawl the images of various blogs on the blog host Tumblr.com. It searches a given Tumblr blog URL for all types of image formats and can download the found images automatically. Tutorial video: https://www.youtube.com/watch?v=Ns2bkvF_ht4 A manual can be found in the SourceForge download area. BE AWARE: THERE IS A NEW DOMAIN, www.TumblOne.COM, WHICH POINTS TO MY PROJECT. IT HAS NOTHING TO DO WITH THIS PROJECT AND IS NOT FROM ME. STAY AWAY, MAYBE...
    Downloads: 20 This Week
  • 15

    Web Crawler Security Tool

    A web crawler oriented to information security.

    Last update on Tue Mar 26 16:25 UTC 2012. The Web Crawler Security Tool is a Python-based tool to automatically crawl a web site. It is a web crawler oriented to help in penetration testing tasks. The main task of this tool is to search and list all the links (pages and files) in a web site. The crawler was completely rewritten in v1.0, bringing a lot of improvements: improved data visualization, an interactive option to download files, increased crawling speed, and export of the list of found...
    Downloads: 2 This Week
  • 16
    This software will "crawl" pages for a given URL.
    Downloads: 0 This Week
  • 17
    ** Guys, I have built a much more powerful, fully featured CMS system at: https://github.com/MacdonaldRobinson/FlexDotnetCMS ** Macs CMS is a flat file (XML and SQLite) based AJAX Content Management System. It focuses mainly on the edit-in-place editing concept. It comes with a built-in blog with moderation support, a user manager section, a roles manager section, SEO / SEF URL
    Downloads: 0 This Week
  • 18
    This is a simple link checker. It can crawl any site and help find broken links. It also has an option to download a CSV report. The CSV file includes the URL, the parent page URL, and the status of the page [broken or ok]. It can be very useful for search engine optimization.
    Downloads: 0 This Week
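A report with the three columns described (URL, parent page URL, status) is easy to sketch. In this hypothetical Python version, `check` stands in for an HTTP request returning a status code, and any code of 400 or above is treated as broken:

```python
import csv
import io

def report_rows(links, check):
    """links: iterable of (url, parent_page_url); check(url) -> HTTP status
    code (a real version would issue a request). Status 400+ counts as broken."""
    for url, parent in links:
        status = check(url)
        yield url, parent, "ok" if status < 400 else "broken"

def write_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["url", "parent_page_url", "status"])
    writer.writerows(rows)
    return buf.getvalue()
```

Recording the parent page is what makes such a report actionable for SEO: it tells you which page to edit to fix each broken link.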
  • 19
    A toolkit for crawling information from web pages by combining different kinds of "actions". Actions are simple operations such as navigating to a specified URL or extracting text from the HTML. A graphical user interface is also available.
    Downloads: 0 This Week
  • 20
    Web Crawler Engine: jsrCRAW is an intelligent Java crawler engine for Internet content monitoring: it periodically reads the content of a URL, retrieves links, applies rules (Crawlets), and alerts the user of changes.
    Downloads: 0 This Week
  • 21
    It is basically a program that lets you build a search engine. It is a web crawler and includes all the web site source code (in ASP, soon in PHP as well) and a MySQL database.
    Downloads: 0 This Week
  • 22
    WebSPHINX is a web crawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. It offers multithreading, tolerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.
    Downloads: 0 This Week
  • 23

    bet365-websocket-crawler

    bet365 bot

    1. Introduction: Monitors the scores of bet365 in-play football matches. 2. Getting Started: Requires PyExecJS, requests, autobahn, and twisted: pip install PyExecJS requests autobahn twisted. Run bet365_websocket_spider.py and watch the output logs. 3. Note: If it can't get data after running, try using the following API. Get all live football matches, url: http://106.52.68.20/b365/soccer/test/allEv?lang=en Get odds of all matches for goal line...
    Downloads: 0 This Week
  • 24
    FWebSpider is a web crawler application written in Perl. It performs a crawl of a chosen site, featuring a response cache, URL storage, URL exclusion rules, and more. It is developed to function as the core of a local/global site search engine.
    Downloads: 0 This Week