From: Demian K. <dem...@vi...> - 2012-02-21 18:23:25
Dave Lacy told me about your XPath-based indexing - it definitely sounds like a good alternative for people who don't want to deal with VuFind's current XSLT-based XML indexing, although I'm not sure it will allow the level of flexibility needed to cover all use cases. Have you thought about supporting custom functions/scripts that get passed the entire XML document for more complex analysis? That might help close the gap, assuming you haven't already implemented it. Of course, I'm not sure that building custom Java functions is a significant time saver over building XSLT... but I hate XSLT just enough that I'd be happy to have additional options in the toolkit!

And yes, I'm sorry I missed you at Code4Lib, especially since I didn't see you last year either - lousy virus! Hopefully next year will work out better!

- Demian

From: Robert Haschart [mailto:rh...@vi...]
Sent: Tuesday, February 21, 2012 12:07 PM
To: Demian Katz; vuf...@li...
Subject: Re: [VuFind-General] Searching catalogue and respository

Demian,

Based on some discussions at Code4Lib, I have an initial implementation of SolrMarc that is designed to read non-MARC XML records and extract fields from those records to put in the index based on XPath expressions. It's not yet checked in anywhere, but it will allow you to have one index specification for MARC records and another for non-MARC XML records, with a similar format. You can start from the MARC index specification and change each existing rule to get the data from the appropriate part of the XML record via XPath, including mapping the values found via a SolrMarc translation map.
For example:

MARC-based index specification:

id = 001, first
language_facet = 008[35-37]:041a:041d, language_map.properties
# format is for facet, display, and selecting partial for display in show view
format = 007[0]:000[6-7]:000[6], (map.format), first
isbn_t = 020a, (pattern_map.isbn_clean)
material_type_display = custom, removeTrailingPunct(300aa)
# Title fields
# primary title
title_t = custom, getLinkedFieldCombined(245a)
title_display = custom, removeTrailingPunct(245a)
title_vern_display = custom, getLinkedField(245a)
...

Corresponding non-MARC-based index specification:

id = xpath, .//dc[1]/identifier[1], (pattern_map.id)
language_facet = xpath, .//dc/language, language_map.properties
language2_facet = xpath, .//dc/language[@usage='display']
# format is for facet, display, and selecting partial for display in show view
format = xpath, .//dc[1]/type[1]
# Title fields
# primary title
title_t = xpath, .//dc/title
title_display = xpath, .//dc/title
...

I'll keep you and the VuFind community apprised of this, and may seek use cases and test data from those who might be interested.

-Bob Haschart

P.S. It was too bad you couldn't make it to Code4Lib; I was looking forward to seeing you there.

On 2/21/2012 10:08 AM, Demian Katz wrote:

The OAI harvest is a two-step process -- first you harvest the records to a directory, and then you need to import the files from that directory into the Solr index. I'm guessing that you haven't done the second step. There are some details here:

http://vufind.org/wiki/importing_records#xml_records

The hard part of this is that you need to set up an XSLT to transform your harvested records into documents that can be loaded into the Solr index. I don't think we have an existing sample configuration for Eprints (though if somebody is already doing this, please speak up), so you may need to adapt one of the existing examples and make some changes to match the Eprints format.
XSLT is not the easiest thing to work with, so feel free to ask questions if you need help. Also feel free to share a sample harvested record if you would like recommendations on the best way to proceed.

- Demian

-----Original Message-----
From: Tim Fletcher [mailto:T.F...@bb...]
Sent: Tuesday, February 21, 2012 9:40 AM
To: vuf...@li...<mailto:vuf...@li...>
Subject: [VuFind-General] Searching catalogue and respository

Hi,

We are making some good progress with a second test installation - this time on Ubuntu - and although there are a lot of things still to sort out I was tempted to try to see if I could search our Eprints institutional repository as well as the library catalogue. I feel this would be a good selling point in trying to get support to really work on an implementation.

I edited OAI.INI and seem to have harvested the records from our repository but nothing appears when I try to do a search.

I'm just wondering if what I want to do is possible or if I am missing something and there is a setting that needs amending.

Any advice welcome - even if it is to tell me that this isn't possible!

Many thanks,
Tim

-----------------------
Tim Fletcher
Library IT Development Manager
Birkbeck College
University of London
Malet Street
London WC1E 7HX
t.f...@bb...<mailto:t.f...@bb...>
Tel: 020 7631 6060
Fax: 020 7631 6066
http://www.bbk.ac.uk/lib/

_______________________________________________
VuFind-General mailing list
VuF...@li...<mailto:VuF...@li...>
https://lists.sourceforge.net/lists/listinfo/vufind-general
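
For readers who want to experiment with the idea discussed in this thread, the XPath-based index specification can be approximated with a short Python sketch. This is not SolrMarc code and makes no claim about how Bob's implementation works. The sample record, the LANGUAGE_MAP dictionary, and the names extract, title_sort, and index_record are all hypothetical, and real Dublin Core records would normally be namespaced. The sketch only illustrates two points from the discussion: each field is filled from an XPath expression with an optional translation map, and a rule may instead be a custom function that is passed the entire XML document, as Demian suggested.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record (no namespaces, to keep the sketch short).
RECORD = """<record>
  <dc>
    <identifier>oai:example.org:123</identifier>
    <title>An Example Title</title>
    <language>eng</language>
    <language usage="display">fre</language>
    <type>Article</type>
  </dc>
</record>"""

# Hypothetical stand-in for a translation map such as language_map.properties.
LANGUAGE_MAP = {"eng": "English", "fre": "French"}


def extract(doc, xpath, value_map=None):
    """Return the non-empty text of every element matched by xpath,
    optionally translated through value_map (unmapped values pass through)."""
    values = [el.text.strip() for el in doc.findall(xpath)
              if el.text and el.text.strip()]
    if value_map is not None:
        values = [value_map.get(v, v) for v in values]
    return values


def title_sort(doc):
    """Hypothetical custom rule: it receives the whole parsed record and
    returns a lower-cased title with a leading English article removed."""
    title = doc.findtext(".//dc/title", default="").strip().lower()
    for article in ("a ", "an ", "the "):
        if title.startswith(article):
            title = title[len(article):]
            break
    return [title] if title else []


# field name -> (XPath string or custom function, optional translation map)
RULES = {
    "id": (".//dc[1]/identifier[1]", None),
    "language_facet": (".//dc/language", LANGUAGE_MAP),
    "language2_facet": (".//dc/language[@usage='display']", None),
    "format": (".//dc[1]/type[1]", None),
    "title_t": (".//dc/title", None),
    "title_sort": (title_sort, None),
}


def index_record(xml_text, rules=RULES):
    """Build a dict of field -> list of values for one XML record."""
    doc = ET.fromstring(xml_text)
    solr_doc = {}
    for field, (source, value_map) in rules.items():
        if callable(source):
            values = source(doc)  # custom function sees the whole document
        else:
            values = extract(doc, source, value_map)
        if values:
            solr_doc[field] = values
    return solr_doc


if __name__ == "__main__":
    for field, values in index_record(RECORD).items():
        print(field, "=", values)
```

Keeping custom functions in the same rule table as the XPath strings means a specification can start out purely declarative and only drop into code for the fields that need it. Note that XPath positions are 1-based, so the first matching element is selected with [1].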