A function-testing, performance-measuring, site-mirroring web spider that is widely portable and can use scenarios to process a wide range of web transactions, including SSL and forms.
webspider provides a mechanism for fetching content from the web. With the extended classes, you can: 1. grab URLs from a specified base URL 2. analyze the contents of a list of URLs 3. get specific files from the web 4. and so on
XGreen Picture Gallery is a .NET component that lets developers add a picture gallery from the toolbox. Now with drag and drop! The project is still in development. It will not run without the JS file, so download that too.
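As a rough illustration of the first task in that list (grabbing URLs from a base URL), here is a minimal Python sketch. It does not use webspider's own classes, whose API is not described here; `LinkCollector` and `extract_links` are hypothetical names, and the HTML is passed in as a string so that no network access is needed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return absolute URLs for all links found in the given HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


sample = '<a href="/docs">Docs</a> <a href="http://example.org/x">X</a>'
print(extract_links("http://example.com/start", sample))
```

A real spider would fetch each page first and feed the downloaded text to `extract_links`; that step is left out here on purpose.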
Aura Link Spirit - Link/Domain Checker
The free Aura Link Spirit lets you check: Google PageRank, Google Index, Bing Index, Total # of Active Backlinks, Total # of Unique Active Backlinks, Total # of NoFollow Links that Point to Link/Domain, Alexa Traffic Rank, Alexa Links, Dmoz List, Total # of Facebook Likes, Page Title… It has proxy support and is multi-threaded.
arachne is a C++ library for HTTP crawling, link, text and metadata extraction designed to run in a distributed environment.
Command-line HTML parser for use in scripts to extract data from an HTML page according to a supplied path and options. Useful for systematic, periodic parsing of pages with a known structure whose information keeps changing, such as watching for an item on eBay.
A web crawler which uses regular expressions on text downloaded from a site.
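The general idea of applying regular expressions to downloaded text can be sketched in a few lines of Python. This is not this crawler's code; `PRICE_PATTERN`, `find_matches`, and the sample text are assumptions chosen only for illustration.

```python
import re

# Hypothetical pattern: dollar amounts such as $5 or $19.99.
PRICE_PATTERN = re.compile(r"\$(\d+(?:\.\d{2})?)")


def find_matches(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return pattern.findall(text)


# Stand-in for text that a crawler would have downloaded from a site.
page_text = "Widget $19.99, gadget $5, gizmo free"
print(find_matches(PRICE_PATTERN, page_text))
```

Because the pattern has a single capturing group, `findall` returns only the captured amounts, without the dollar sign.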
open-search is a framework to build a p2p web search engine, whereby people mutually form a search engine without the intervention of central servers or a central actor.
SF FTP Search Engine: high speed, open source based, and no database required. Demo: http://sf.hit.edu.cn/
Web Code Reader is a tool for viewing the source code of a web address. It requires .NET Framework 4.0.
Larbin is a web crawler intended to fetch a large number of web pages; it should be able to fetch more than 100 million pages on a standard PC with much u/d. This set of PHP and Perl scripts, called webtools4larbin, can handle the output of Larbin and p
CMS-Bandits is a set of PHP scripts with an online HTML editor, calendar, search engine, RSS reader, revision log, personal nickpage, comment system, web crawler, and more.
Scalable and configurable C++ spider library
This is a Python script that scrapes familysearch.org: you select one of your ancestors, and it grabs all of their ancestors and downloads them into a single GEDCOM file.
getrss is a shell script that displays and interacts with RSS feeds. It accepts URLs and supports bookmarks.