An open source search engine with a RESTful API and crawlers
WebCollector is an open source web crawler framework based on Java.
XML bindings and a GUI for creating and editing XBMC Scrapers
Auto Rescanning - Search Terms - Regularly Updated With New Features
Analyze the a, img, h1, and h2 tags on your site.
Mail ARchiver
Keyword search engine for semi-structured data (tables, lists, ...)