From: Giulia H. <gh...@li...> - 2006-01-13 21:51:47
Here is a scenario I find interesting and am trying to solve.

Scenario:
I have two sets of files, mets and pdf, which are related in pairs. I need to index both, but the pdf files will only provide full-text access for searches. A hit on a term found in a pdf needs to link to the related mets file rather than to the pdf itself.

My approaches:
I thought of two possible solutions; the first is the one I think might be simpler.

1) In the preFilter.xml for the pdf file, I change what is indexed as the id of the file, replacing it with the related mets fileName. I would compute that name by calling, from within the xsl, an external Java class that analyzes the files in the mets directory. I like this approach because it leaves XTF with the task of indexing and doing all of the work, with minimal intervention on my side.
My question: how do I change the $id of the file?

2) In the preFilter.xml for the mets file, I make a call to an external Java class that finds the right pdf and returns its content, which I would put in an indexing snippet like this:

    <myPdf xtf:meta="true">
      <xsl:value-of select="$resultOfJavaCall"/>
    </myPdf>

For this I would somehow have to convert the pdf into text beforehand. What concerns me about this approach is the huge entries that I might get.

Do you have any suggestions on the easiest, cleanest way to solve this problem?

Thanks,
Giulia

----------------------------
Giulia Hill
Programmer/Analyst
Library Systems Office
University of California at Berkeley
386 Doe Annex
Berkeley, CA 94720
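
For illustration, the lookup that approach 1 relies on could be sketched in Java roughly as follows. This is only a sketch under an assumption the message does not state: that a pdf and its mets file share the same base name (for example report.pdf and report.xml). The class name MetsLocator and the method names are made up for the example, and how the class would be called from the stylesheet is left out.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: maps a pdf file name to the name of its paired mets file. */
public class MetsLocator {

    /** Strips the last extension, e.g. "report.pdf" becomes "report". */
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    /**
     * Returns the first candidate whose base name equals the pdf's base name,
     * or an empty string when no candidate matches.
     * ASSUMPTION: pairs are identified by an identical base name.
     */
    public static String metsNameFor(String pdfName, List<String> metsNames) {
        String wanted = baseName(pdfName);
        for (String candidate : metsNames) {
            if (baseName(candidate).equals(wanted)) {
                return candidate;
            }
        }
        return "";
    }

    /** Same lookup, run against the file names found in a mets directory. */
    public static String metsNameFor(String pdfName, String metsDir) {
        String[] names = new File(metsDir).list(); // null if metsDir is not a directory
        return names == null ? "" : metsNameFor(pdfName, Arrays.asList(names));
    }
}
```

Returning an empty string for "no match" keeps the result easy to test from a stylesheet; if the real pairing rule is different (an identifier inside the mets file, say), only the comparison in the loop would need to change.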