As promised (and faster than expected), I have added
regular-expression-based ID manipulation to VuFind’s OAI-PMH harvester tool, so
it should now be possible to munge IDs however you like and get consistent
results from the harvest, import and delete tools. See the new
idSearch/idReplace parameters in harvest/oai.ini of VuFind after r3060.
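To illustrate the general idea, here is a rough Python sketch of applying ordered search/replace pairs to a record ID. This is not VuFind's actual code (the harvester is written in PHP), and the rule list, patterns, prefix and function name are all made up for the example; check the comments in harvest/oai.ini for the real setting syntax.

```python
import re

# Hypothetical search/replace pairs, applied in order (illustration only;
# these are not taken from VuFind's configuration).
ID_RULES = [
    (r"^oai:example\.org:", "src1-"),  # swap the OAI prefix for a source prefix
    (r"/", "-"),                       # replace slashes, which are awkward in IDs
]

def normalize_id(raw_id, rules=ID_RULES):
    """Apply each regex search/replace pair to the ID, in order."""
    for pattern, replacement in rules:
        raw_id = re.sub(pattern, replacement, raw_id)
    return raw_id

if __name__ == "__main__":
    print(normalize_id("oai:example.org:123/45"))
```

The point is that the same rules run on every ID the harvester sees, whether the record is new, updated or deleted, so all three tools end up agreeing on the final ID.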
From: Demian Katz
Sent: Thursday, October 21, 2010 8:36 AM
To: 'mikan.d.dspace listmail'; firstname.lastname@example.org
Subject: RE: [VuFind-Tech] Automated importing
Eoghan covered most of the details here, but just one more
thing: I’m soon going to be adding some features to the OAI-PMH harvester to
allow more ID manipulation at the harvest stage. Hopefully this will
simplify the process of distinguishing between MARC records from different
sources. I was initially resistant to adding extra complexity to the
harvester, but Fang Peng convinced me that it’s worthwhile, since if you
normalize IDs within the harvester, it makes it easier to deal with incremental
updates and deletes. I’ll post more details when this is ready.
Sent: Thursday, October 21, 2010 4:21 AM
Subject: [VuFind-Tech] Automated importing
We're planning to import a large amount of data from different
sources into VuFind. Some items have IDs which may overlap, and they might need
an additional prefix in order to fit into Solr nicely. What would be
the preferred way of doing this kind of conversion? Do the importer scripts have
tools for this, or should I write a script of my own to do it?
Since the other data sources are still active, I need to run these batch
imports nightly to keep VuFind up to date. Does anyone have experience arranging
this kind of automation? Are there any considerations, problems or good
practices I should be aware of?
Thanks for the tips,