Implement both a BibTeX crawler AND a BibTeX subcrawler to convert BibTeX files to RDF.
Provide BOTH, because users may want all the BibTeX files within their folders crawled (subcrawler), but may also opt to crawl only a single selected file (crawler with datasource).
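The two modes could look roughly like the sketch below. It is only an illustration in Python, not Aperture's Java crawler/subcrawler API; the function names find_bibtex_files and select_bibtex_file are made up for this example, and matching on the ".bib" extension is an assumption.

```python
from pathlib import Path


def find_bibtex_files(root):
    """Subcrawler-like mode: collect every .bib file below a folder.

    Hypothetical helper; the result is sorted so the order is stable.
    """
    return sorted(p for p in Path(root).rglob("*.bib") if p.is_file())


def select_bibtex_file(path):
    """Crawler-with-datasource mode: accept exactly one configured file.

    Hypothetical helper; raises ValueError for anything that is not an
    existing file with a .bib extension.
    """
    p = Path(path)
    if p.suffix.lower() != ".bib" or not p.is_file():
        raise ValueError(f"not a BibTeX file: {p}")
    return [p]
```

Both helpers return a list of paths, so the same BibTeX-to-RDF conversion step could be applied to either result.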
Also, the datasource could take configuration values, for example an option to support JabRef's proprietary pdf= syntax.
Example of a JabRef entry with a pdf= field:
@INPROCEEDINGS{Sauermann2005b,
title = {{The Semantic Desktop---a Basis for Personal Knowledge Management}},
pdf = {Sauermann2005b.pdf}
}
The pdf= link is relative either to the path of the BibTeX file or to some configurable path. This "absolute path for relative PDF links" must therefore be configurable per BibTeX file, which means it belongs in a datasource, and hence we need the crawler.
This kind of worked in gnowsis; we must reactivate the code from here:
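A minimal sketch of the path resolution described above, in Python for illustration only (not the gnowsis or Aperture code). The function name resolve_pdf_link and the pdf_base_dir setting are hypothetical; the assumption is that an unset base directory means "relative to the folder of the BibTeX file".

```python
from pathlib import Path


def resolve_pdf_link(bibtex_file, pdf_value, pdf_base_dir=None):
    """Resolve the value of a JabRef-style pdf = {...} field to a path.

    pdf_base_dir stands for a hypothetical datasource setting. If it is
    not set, relative links are resolved against the folder that
    contains the BibTeX file. Absolute links are returned unchanged.
    """
    pdf = Path(pdf_value)
    if pdf.is_absolute():
        return pdf
    base = Path(pdf_base_dir) if pdf_base_dir else Path(bibtex_file).parent
    return base / pdf
```

For the entry above, a BibTeX file at /home/leo/papers/leo.bib with no base directory configured would give /home/leo/papers/Sauermann2005b.pdf (the paths here are made-up examples).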
https://gnowsis.opendfki.de/repos/gnowsis/tags/0.8.3-alpha/gnowsis_adapterpack/WEB-INF/src/org/gnowsis/adapters/bibtex/
Christian K. at L3S implemented a BibTeX extractor; see
http://dev.nepomuk.semanticdesktop.org/repos/trunk/java/de.l3s.aperture/src/java/org/semanticdesktop/aperture/extractor/bibtex/
It is quite self-contained, so it should be easy to migrate it fully into the Aperture source. Please contact Christian if you are interested.
This task will be a test task for an internship applicant at DFKI.
OK, so Enrico Minack said the code from L3S should be ported to Aperture.
Take http://dev.nepomuk.semanticdesktop.org/repos/trunk/java/de.l3s.aperture/src/java/org/semanticdesktop/aperture/extractor/bibtex/ and find its dependencies.
Check out the aperture-addons project from https://aperture.svn.sourceforge.net/svnroot/aperture/trunk/aperture-addons/ using SVN, add the L3S BibTeX extractor to this project, and verify that it generates good BibTeX/RDF.
Good BibTeX/RDF can be found here:
* http://www.dfki.uni-kl.de/~sauermann/leo_sauermann.bib converts to http://www.dfki.uni-kl.de/~sauermann/rdf/leo_bib.rdf
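To make the BibTeX-to-RDF step a bit more concrete, here is a rough sketch in Python that parses one entry and prints it as N-Triples. It is not the L3S extractor and does not reproduce the vocabulary used in leo_bib.rdf: the namespace http://example.org/bibtex#, the subject URI scheme and the function names are all placeholders. It only handles brace-delimited field values (no quoted values, no @string macros).

```python
import re

# Placeholder namespace and subject base URI, for illustration only.
BIB_NS = "http://example.org/bibtex#"
PUB_BASE = "http://example.org/pub/"

FIELD_START = re.compile(r"\s*(\w+)\s*=\s*\{")
COMMA = re.compile(r"\s*,")


def parse_entry(text):
    """Parse one BibTeX entry into (type, key, fields).

    Only brace-delimited values are supported; nested braces are kept.
    """
    head = re.match(r"\s*@(\w+)\s*\{\s*([^,\s]+)\s*,", text)
    if head is None:
        raise ValueError("not a BibTeX entry")
    entry_type, key = head.group(1).lower(), head.group(2)
    fields, pos = {}, head.end()
    while True:
        m = FIELD_START.match(text, pos)
        if m is None:
            break
        # Scan forward to the brace that closes this field value.
        depth, i = 1, m.end()
        while depth > 0:
            if i >= len(text):
                raise ValueError("unbalanced braces")
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
            i += 1
        fields[m.group(1).lower()] = text[m.end():i - 1]
        comma = COMMA.match(text, i)
        pos = comma.end() if comma else i
    return entry_type, key, fields


def to_ntriples(key, fields):
    """Emit one N-Triples line per field, sorted by field name."""
    subject = f"<{PUB_BASE}{key}>"
    lines = []
    for name, value in sorted(fields.items()):
        # Drop BibTeX protection braces, then escape the literal.
        plain = value.replace("{", "").replace("}", "")
        literal = (plain.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        lines.append(f'{subject} <{BIB_NS}{name}> "{literal}" .')
    return lines


if __name__ == "__main__":
    entry = """@INPROCEEDINGS{Sauermann2005b,
    title = {{The Semantic Desktop---a Basis for Personal Knowledge Management}},
    pdf = {Sauermann2005b.pdf}
    }"""
    _, key, fields = parse_entry(entry)
    print("\n".join(to_ntriples(key, fields)))
```

A real port would reuse the L3S extractor and whatever vocabulary Aperture settles on; the sketch is only meant to show the kind of output that can be compared against the reference RDF.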