From: Demian K. <dem...@vi...> - 2018-05-16 18:38:37
The VuFind XSLT indexer supports a custom VuFind::harvestWithParser function that you can use to populate the full-text field of the Solr index for XML documents, just as you can use the SolrMarc functions for MARC records. You could probably add something like:

    <field name="fulltext">
        <xsl:value-of select="php:function('VuFind::harvestWithParser', string(//dc:identifier))"/>
    </field>

Of course, this assumes that there is just one dc:identifier and that it points directly to the full-text document. If you need to pick one dc:identifier from among many, or if the URL points to an intermediate page that contains a link to a full-text document, things get more complicated, since you have to add logic somewhere to deal with those complications.

I also can’t promise that the example syntax above is exactly correct, so some adjustments may be needed. Looking at the other existing .xsl examples may prove helpful.

Please let me know if you need further guidance!

- Demian

From: P. S. Mukhopadhyay [mailto:psm...@gm...]
Sent: Wednesday, May 16, 2018 2:14 PM
To: vufind-tech <vuf...@li...>
Subject: [VuFind-Tech] Full-text indexing

Dear All,

We are using VuFind (4.0) as the front-end discovery layer, where records are indexed from Koha, DSpace, Greenstone and Omeka. A good number of books catalogued in Koha have full-text PDF files linked via tag 856u. We are using Apache Tika (in VuFind, through the marc_local.properties settings) to index the full-text PDF files available in Koha (Koha does not have a full-text indexing mechanism of its own).

Similarly, we have full-text objects in DSpace and Greenstone. Both of these systems have their own full-text indexing mechanisms. But what is the mechanism for indexing the full-text objects available in DSpace and Greenstone by using the combination of Tika and VuFind, so that end users can search both metadata and full-text objects from all three systems (Koha, DSpace, Greenstone) from the VuFind search interface?
The full-text objects in both of these systems (DSpace and Greenstone) are linked via the DC.Identifier metadata element.

Thanks and regards

--
-----------------------------------------------------------------------
Dr. Parthasarathi Mukhopadhyay
Associate Professor, Department of Library and Information Science,
University of Kalyani, Kalyani - 741 235 (WB), India
-----------------------------------------------------------------------
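
For the case raised in the reply where a record carries several dc:identifier values, one possible approach is to choose the identifier inside the XSLT before handing it to VuFind::harvestWithParser. The fragment below is only a sketch under stated assumptions: it assumes the full-text link is the first dc:identifier whose value ends in ".pdf", that the dc and php namespace prefixes are already declared in the stylesheet (as the snippet in the reply also requires), and that the fragment sits inside the template that emits the Solr document. The variable name fulltextUrl is hypothetical. This has not been tested against DSpace or Greenstone output.

```xml
<!-- Sketch only: assumes the full-text link is the first dc:identifier ending in ".pdf". -->
<!-- XPath 1.0 has no ends-with(), so compare the last four characters with substring(). -->
<xsl:variable name="fulltextUrl"
    select="string((//dc:identifier[substring(., string-length(.) - 3) = '.pdf'])[1])"/>

<!-- Only emit the field when a matching identifier was found. -->
<xsl:if test="$fulltextUrl != ''">
    <field name="fulltext">
        <xsl:value-of select="php:function('VuFind::harvestWithParser', $fulltextUrl)"/>
    </field>
</xsl:if>
```

The comparison is case-sensitive and does not trim whitespace, so identifiers ending in ".PDF" or padded with spaces would be skipped. A dc:identifier that points to an intermediate landing page is not handled here and would still need extra logic, as the reply notes.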