From: Brad T. <br...@ar...> - 2007-11-27 18:59:26
Have you considered proxy mode? We're working on some software called
"browser monkeys", which helps automate large-scale processing like you're
describing (link checking, for example) using a Firefox plugin, among other
components. There may be a version available in the next one to two months.
For this testing, we use proxy mode, and if that's an option for your link
checking system, I'd recommend it. We'll announce the release of the browser
monkey software on this forum.

Re: the domain prefix, I'll try to get some rough documentation online in
the near term, but it may be a couple of weeks. For now, you'll need to look
at the wayback.xml and wayback-templates.xml files, and the source code, but
feel free to post specific issues, observations, and questions here.

Brad

> Hi,
> we have almost completed development of an HTTrack archive to ARC
> conversion tool and are in the middle of testing. To ensure that our
> conversion process is successful, I've installed wayback 1.0 (which seems to
> be working well) and have loaded some of our converted ARC files into the
> system. From here I intended to run a link checker over the harvested
> website to check how well the conversion process worked. Because the link
> checker won't process the JavaScript that internalizes the links, I was
> hoping that the domain-prefix replay mode would allow us to get around
> this.
>
> Is this the case? And where can we get configuration information for
> domain-prefix?
>
> Thanks,
>
> This e-mail is intended for the addressee only and may contain information
> which is subject to legal privilege. The contents are not necessarily the
> official view or communication of the National Library of New Zealand. If
> you are not the intended recipient you must not use, disclose, copy or
> distribute this e-mail or any information in, or attached to it.
> If you have received this e-mail in error, please contact the sender
> immediately or return the original message to the National Library by
> e-mail, and destroy any copies. The National Library does not accept any
> liability for changes made to this e-mail or attachments after sending.
>
> All e-mails have been scanned for viruses and content by security
> software. The National Library reserves the right to monitor all e-mail
> communications through its network.
>
> _______________________________________________
> Archive-access-discuss mailing list
> Arc...@li...
> https://lists.sourceforge.net/lists/listinfo/archive-access-discuss
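
For anyone wanting to try the proxy-mode suggestion from the reply, here is a minimal sketch of a link checker that sends its requests through an HTTP proxy. It is not part of wayback or of the browser monkey software. The proxy address (localhost:8090) and the names LinkExtractor, extract_links, make_proxy_opener and check_link are made up for illustration; it assumes a wayback instance is already running in proxy mode at that address. As far as I understand it, in proxy mode the archived pages keep their original URLs, so a checker that cannot run JavaScript can still follow them.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request

# Hypothetical address of a wayback instance running in proxy mode.
WAYBACK_PROXY = "http://localhost:8090"


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every <a href> in html as an absolute URL, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def make_proxy_opener(proxy_url=WAYBACK_PROXY):
    """Build an opener that sends every http:// request through the proxy."""
    return urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url})
    )


def check_link(opener, url, timeout=10):
    """Return the HTTP status for url, or None if the request failed outright."""
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The proxy answered, but with an error status (e.g. 404 for a
        # page that is not in the archive).
        return err.code
    except (urllib.error.URLError, OSError):
        return None
```

A run would fetch a harvested page through the opener, pass the HTML to extract_links, and call check_link on each result; any link that comes back as None or with a 4xx/5xx status is a candidate for a page that did not survive the conversion.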