Aracnis is a Java-based framework for building distributed web spiders. These spiders can be used for a variety of tasks, such as screen scraping and link integrity checking.
InSite is a Web site management tool written in Perl. It checks link integrity and does some basic content monitoring by reading your site's files directly from the local disk, which gives it a large speed advantage over similar tools.