Hi Demian,

This sounds very interesting indeed. Can Aperture index full-text files via URL, or do they need to be present locally? And if full texts are being indexed, why not offer the ability to store the actual PDF file in VuFind as well? Any plans to make this happen? :)

Good work!
Mika


2010/10/25 Demian Katz <demian.katz@villanova.edu>

Hello,

Just a quick update – I have just built upon my XSLT work from last week by integrating the Java Aperture library with VuFind. This makes it possible to harvest documents like PDFs or Word files and extract their text contents directly into the Solr index. It was easier to get working than I expected, though I did run into one apparent bug in Aperture’s shell scripts under Linux! See notes here:

http://vufind.org/wiki/importing_records#full_text

It may be useful to do something similar for SolrMarc-based imports – see http://vufind.org/jira/browse/VUFIND-274 for details.

Let me know if you have questions about this – I’m sure that if anyone starts using this in earnest, we’ll need to make some further adjustments for improved stability… but as a proof of concept, it seems to work quite nicely!

- Demian

