From: Aaron B. <aa...@ar...> - 2012-05-16 20:42:44
Erik Hetzner <eri...@uc...> writes:

> A quick question. UURI [1] is located in Heritrix Commons. HandyUrl is
> located in archive-commons. Which should I use?

Hmmm, it might depend on your needs.

AFAIK, the UURI is geared towards Heritrix's needs, which include a pretty light "normalization" of the URL. From an archival capture point of view, I think the idea is that Heritrix shouldn't munge the URL very much.

However, HandyUrl is geared for access/playback/Wayback needs, and as such incorporates stronger URL normalization/canonicalization.

I haven't spent much time in the code for either; the above is just my thoughts based on informal discussions with Gordon and Brad.

Aaron
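
To make the light-versus-strong distinction concrete, here is a rough sketch in Python. It does not reproduce what UURI or HandyUrl actually do; the function names, the list of session parameters, and the specific rules (dropping "www.", sorting the query, trimming a trailing slash) are all assumptions chosen only to illustrate the two styles.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

DEFAULT_PORTS = {"http": 80, "https": 443}
# Hypothetical list of session-style parameters, for illustration only.
SESSION_PARAMS = {"jsessionid", "phpsessid", "sid"}


def light_normalize(url):
    """Capture-style cleanup: change as little of the URL as possible."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # .hostname is already lowercased; note that it also drops any userinfo.
    host = parts.hostname or ""
    port = parts.port
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    # Keep the query untouched; drop only the fragment.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def strong_canonicalize(url):
    """Playback-style key: collapse URLs that likely name the same page."""
    parts = urlsplit(light_normalize(url))
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    # Drop session-style parameters, then sort what is left.
    pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    query = urlencode(sorted(pairs))
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, host, path, query, ""))
```

With these assumed rules, the light version only tidies case, the default port, and the fragment, so the captured URL stays close to what was fetched. The strong version maps more variants onto one lookup key, which is convenient for access but loses detail that a crawler may want to keep.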