WebSPHINX is a web crawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. Multithreaded, tollerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.
JoBo is a web site mirroring tool. It has a graphical UI but there is a also command line version. Supports robot exclusion protocol (but this can be disabled)