ahCrawler
A PHP search engine for your website and a web analytics tool. Licensed under GNU GPL 3.
ahCrawler is a set of tools for implementing your own search on your website and for analyzing your web content. It can be used on shared hosting.
It consists of:
* crawler (spider) and indexer
* search for your website(s)
* search statistics
* website analyzer (HTTP headers, short titles and keywords, link checker, ...)
You install it on your own server, so all crawled data stays in your environment.
You never know when an external web spider last re-indexed your content. Trigger a rescan...