From: Erik H. <eri...@uc...> - 2012-05-16 21:21:38
At Wed, 16 May 2012 13:44:11 -0700,
Aaron Binns wrote:
>
> Erik Hetzner <eri...@uc...> writes:
>
> > A quick question. UURI [1] is located in Heritrix Commons. HandyUrl is
> > located in archive-commons. Which should I use?
>
> Hmmm, it might depend on your needs. AFAIK, the UURI is geared towards
> Heritrix's needs, which includes a pretty light "normalization" of the
> URL. From an archival capture point of view, I think the idea is that
> Heritrix shouldn't munge the URL very much.
>
> However, HandyUrl is geared for access/playback/Wayback needs, and as
> such incorporates stronger URL normalization/canonicalization.
>
> I haven't spent much time in the code for either, the above is just my
> thoughts based on informal discussions with Gordon and Brad.

Thanks, Aaron. It sounds like HandyUrl is more appropriate for my
current task.

best, Erik