Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.
A Web crawler prototype designed to index pages of certain resource sharing platforms based on folksonomy tags. The results are displayed in an Excel spreadsheet.
Arachnid is a Java-based web spider framework. It includes a simple HTML parser object that parses an input stream containing HTML content. Simple Web spiders can be created by sub-classing Arachnid and adding a few lines of code called after each page