From: Jackson, A. <And...@bl...> - 2013-06-06 16:13:25
It's not just the indexer. The front-end logic and the coupling to H3 have all been problematic recently. We have suffered a range of problems deploying recent Wayback versions, due to unintended consequences of recent changes that break functionality that we require. As well as the de-duplication problems I mentioned in a separate email, we've also had issues with Memento access points (which don't return link-format timemaps as they should/used to) and the XML query endpoint failing under certain conditions (due to changes in URL handling/'cleaning').

In my opinion, one of the critical jobs for the future Wayback OS project is to set up proper, automated integration tests that exercise all the functionality the IIPC partners need, and will therefore detect if changes to the source code have unintentionally altered critical behaviour.

It is technically fairly straightforward to make an integration test that, say, indexes a few WARCs, fires up a Wayback instance, and checks the responses to some queries. It does, of course, require some investment of time and effort. However, that investment would enable future modifications to the code base to be carried out with far more confidence.

I've started doing some work in this area, but would appreciate knowing if anyone else is willing to put some effort into building up the testing framework.

Thanks,
Andy

> -----Original Message-----
> From: Jones, Gina [mailto:gj...@lo...]
> Sent: 06 June 2013 13:13
> To: arc...@li...
> Subject: [Archive-access-discuss] Wayback Indexer
>
> I believe that the wayback indexer is the weakest link to longterm access to
> our collections. And it isn't obvious sometimes what is going on when you
> index content until you actually access that content.
>
> One of the projects I want to do this year (or next) is to take the available
> indexers and index a set of content that we have (2000-now) and review the
> output.
>
> gina
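
As a rough illustration of the "checks the responses to some queries" step described above, here is a minimal Python sketch that validates a link-format timemap response. It is only a sketch under stated assumptions: the base URL, the timemap path in the usage comment, and the function names are all hypothetical, and the steps of indexing WARCs and starting Wayback are left out because they depend on the local deployment.

```python
import re
import urllib.request

# Hypothetical base URL of a locally started Wayback instance (assumption).
WAYBACK_BASE = "http://localhost:8080/wayback"

# One link-format entry: <uri> followed by zero or more ;name="value" params.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def parse_link_format(body):
    """Parse an application/link-format body into (uri, params) pairs."""
    links = []
    for match in LINK_RE.finditer(body):
        params = dict(PARAM_RE.findall(match.group(2)))
        links.append((match.group(1), params))
    return links


def check_timemap(content_type, body):
    """Return a list of problems found in a timemap response (empty if none)."""
    problems = []
    if not content_type.startswith("application/link-format"):
        problems.append("unexpected content type: " + content_type)
    # A rel value may hold several space-separated relation types.
    rels = [rel for _, params in parse_link_format(body)
            for rel in params.get("rel", "").split()]
    if "original" not in rels:
        problems.append("no link with rel=original")
    if "memento" not in rels:
        problems.append("no link with a memento relation")
    return problems


def fetch(path):
    """Fetch a path from the Wayback instance; return (content_type, body)."""
    with urllib.request.urlopen(WAYBACK_BASE + path, timeout=30) as response:
        content_type = response.headers.get("Content-Type", "")
        body = response.read().decode("utf-8", errors="replace")
    return content_type, body


# Usage against a running instance (the path is an assumed example, not a
# confirmed Wayback route):
#   problems = check_timemap(*fetch("/timemap/link/http://example.org/"))
#   assert problems == [], problems
```

Keeping the check separate from the fetch means the same assertions can be run against saved responses from a known-good version, which is one way to notice when a code change has altered the behaviour.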