NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces.
Aracnis is a Java-based framework for building distributed web spiders. These spiders can be used for a variety of tasks, such as screen scraping and link integrity checking.
JLinkCheck is an Ant task written in Java for checking links in websites. Rather than checking a single page, it crawls a whole site like a spider and generates a report in XML and (X)HTML. JReptator will be its successor, with many more features.
Eccles is a programmable web client built on HttpUnit, with input and output files in XML. It includes the ability to create a GUI to control and monitor the processing, and can be used for website testing as well as for automating web transactions.
Bugkilla is a set of Java tools for the functional testing of J2EE web applications.
Specification and execution of tests will be automated for the web front end and the business logic layer.
One goal is to integrate with existing frameworks and tools.
Checks links in HTML files. Checks almost every tag attribute known to contain references to other resources. Supports multithreading. Written in Java. Frontends for console and AWT are available; a Swing frontend is in development.
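As a rough illustration of what collecting references from tag attributes can involve, the sketch below pulls attribute values out of an HTML string using only the Java standard library. It is not taken from any of the tools listed here: the class name LinkExtractor, the method extractLinks, and the attribute list (href, src, action, background) are assumptions made for the example, and a real link checker would use a proper HTML parser rather than a regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class LinkExtractor {
    // Attributes assumed to carry references; an illustrative subset, not a complete list.
    // Only quoted attribute values are matched.
    private static final Pattern REF = Pattern.compile(
        "\\b(?:href|src|action|background)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')",
        Pattern.CASE_INSENSITIVE);

    // Returns the referenced values in the order they appear in the input.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = REF.matcher(html);
        while (m.find()) {
            // Group 1 holds a double-quoted value, group 2 a single-quoted one.
            String value = m.group(1) != null ? m.group(1) : m.group(2);
            links.add(value);
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"a.html\">x</a><img src='b.png'>";
        System.out.println(extractLinks(html)); // prints [a.html, b.png]
    }
}
```

Each extracted value would still need to be resolved against the page's base URL and requested before a link could be reported as valid or broken; that step is left out here.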
Orome is a tool for automating system or acceptance tests for web-based systems (it can also be used for unit tests, though this is not its focus). Orome takes a set of static HTML pages defining a walkthrough of all or part of the system and tests them against the running system.