Thanks, Leo, I will take a look at these resources.  I am doing more than just Lucene, so I will most likely want a persistent store that I can access  to feed to various libraries, Lucene being one of them.  RDF can do that, right?  I guess I was expecting the RDF stuff to be more behind the scenes in Aperture.  I know you guys are coming from the Semantic Web world, so it makes sense for what you are doing, but for someone like me who just needs a single interface to the various extraction libraries, I don't really want to worry about the underlying implementation and setup.  I think, however, that I can handle this by wrapping the appropriate pieces for my application and making it transparent to my users.


On Jul 24, 2007, at 10:43 AM, Leo Sauermann wrote:

Hi Grant,

hey, before you continue:
did you look at our experimental Aperture LuceneHandler?
(btw: we switched to SVN)
perhaps it's exactly what you need, and if not, perhaps you could improve/extend it?

we use RDF2Go, which is slightly documented here:


It was Grant Ingersoll who said at the right time 23.07.2007 20:15 the following words:
This doesn't return any results for me.  When I do
RDFContainer metadata = ...

...//Add stuff to the model

metadata.getModel().writeTo(new PrintWriter(System.out));

it prints out the properties, but when I call the method below it  
doesn't return any results.  So, I believe there are entries in my  
model, just not sure how to get at them.
Is there a good primer somewhere on how all this RDF stuff works?   
All I really want to be able to do is get out the metadata and the  
content for indexing in Lucene and use in other applications.

Is there any way to interrogate the Extractor to see what properties
it could produce?

Thanks so much for all your help,

On Jul 23, 2007, at 10:20 AM, Antoni Mylka wrote:

Grant Ingersoll wrote: covers how to put
things into the RDF container, can someone point me to a tutorial or
docs on how to get stuff out of the Container?  Where can I get a
collection of all the URIs that are in the RDFContainer after
extraction has taken place?

You could do it like this

     private static List getPropertyURIs(RDFContainer container) {
         List result = new LinkedList();
         Model model = container.getModel();
         // the described URI, already formatted for use inside a SPARQL query
         String uriString = container.getDescribedUri().toSPARQL();
         String query =
             "SELECT DISTINCT ?property WHERE { " +
              uriString + " ?property ?o }";
         QueryResultTable queryResult = null;
         ClosableIterator iterator = null;
         try {
             queryResult = model.sparqlSelect(query);
             iterator = queryResult.iterator();
             while (iterator.hasNext()) {
                 QueryRow row = (QueryRow) iterator.next();
                 URI uri = (URI) row.getValue("property");
                 result.add(uri);
             }
         } catch (Exception e) {
             // do not swallow query errors silently
             e.printStackTrace();
         } finally {
             if (iterator != null) { iterator.close(); }
         }
         return result;
     }
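
To make it clearer what that query asks for, here is a small sketch in plain Java. It does not use RDF2Go or Aperture at all: the triples are just String arrays of {subject, property, object}, and the class name, method name and sample data are made up for illustration. It only mimics the effect of SELECT DISTINCT ?property over one subject.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustration only: a plain-Java stand-in for the SPARQL query above.
// No RDF2Go or Aperture classes are used; all names here are hypothetical.
class PropertyListingSketch {

    // Each triple is a String[] of {subject, property, object}.
    // Returns every property used on the given subject, without duplicates.
    static List<String> distinctProperties(List<String[]> triples, String subject) {
        Set<String> seen = new LinkedHashSet<String>();
        for (String[] t : triples) {
            if (t[0].equals(subject)) {
                seen.add(t[1]);
            }
        }
        return new ArrayList<String>(seen);
    }

    public static void main(String[] args) {
        // Made-up sample data; prefixed names stand in for full URIs.
        List<String[]> triples = Arrays.asList(
            new String[] {"file:/tmp/a.txt", "dc:title", "Notes"},
            new String[] {"file:/tmp/a.txt", "dc:subject", "lucene"},
            new String[] {"file:/tmp/a.txt", "dc:subject", "rdf"},
            new String[] {"file:/tmp/a.txt", "aperture:fullText", "some text"},
            new String[] {"file:/tmp/b.txt", "dc:title", "Other"});
        // prints [dc:title, dc:subject, aperture:fullText]
        System.out.println(distinctProperties(triples, "file:/tmp/a.txt"));
    }
}
```

The sketch keeps the order in which properties were first seen; a real SPARQL engine makes no such promise unless the query has an ORDER BY.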

Aperture components (in the current trunk) use the Aperture Data
Ontology, augmented by elements of Dublin Core. Some components also
use the VCARD, ICAL and ICALTZD ontologies (OutlookCrawler crawls
calendar events in ICAL, whereas the IcalCrawler crawls elements in
ICALTZD). All of them can be viewed here:


Each Aperture Data Object is represented as an instance of a
class. Such an instance can have any number of properties from the
above-mentioned vocabularies. At the moment there is no detailed
documentation about what kind of RDF structures each Aperture
component produces. The important properties that are often present are
dc:title, dc:subject and aperture:fullText, but you'd need to take a
look at the ontologies, the source code, or analyze the output of some
programs to get a more detailed picture.
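
As a rough idea of how those three properties could be pulled out for indexing, here is a plain-Java sketch. It is not Aperture or Lucene code: the triples are String arrays of {subject, property, object}, the prefixed names stand in for the full property URIs, and the class and method names are invented for this example. Each entry of the resulting map could then be handed to whatever indexing library is in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: collects the commonly present properties of one
// data object into a field map. All names here are hypothetical.
class IndexFieldsSketch {

    // The properties named in the text; prefixed names stand in for full URIs.
    static final List<String> WANTED =
        Arrays.asList("dc:title", "dc:subject", "aperture:fullText");

    // Each triple is a String[] of {subject, property, object}.
    // A property can occur several times, so every field holds a list of values.
    static Map<String, List<String>> fieldsFor(List<String[]> triples, String subject) {
        Map<String, List<String>> fields = new LinkedHashMap<String, List<String>>();
        for (String[] t : triples) {
            if (!t[0].equals(subject) || !WANTED.contains(t[1])) {
                continue; // other subject, or a property we do not index
            }
            List<String> values = fields.get(t[1]);
            if (values == null) {
                values = new ArrayList<String>();
                fields.put(t[1], values);
            }
            values.add(t[2]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample data.
        List<String[]> triples = Arrays.asList(
            new String[] {"file:/tmp/a.txt", "dc:title", "Notes"},
            new String[] {"file:/tmp/a.txt", "dc:subject", "lucene"},
            new String[] {"file:/tmp/a.txt", "dc:subject", "rdf"},
            new String[] {"file:/tmp/a.txt", "aperture:fullText", "some text"});
        System.out.println(fieldsFor(triples, "file:/tmp/a.txt"));
    }
}
```

Properties outside the short list are simply skipped here; since not every component emits all three, a missing key in the map is normal and should not be treated as an error.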

I'm currently working on a major overhaul of the ontology.
In the Nepomuk Social Semantic Desktop project, a unified ontology
framework for desktop resources has been developed. You can take a
look at the draft at


This migration is being done in the NIEIntegrationBranch. When it is
complete (hopefully this week), the RDF structures generated by
Aperture will be quite well documented in the NIE specification and in
the Javadoc of appropriate components.

Antoni Mylka

This email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>
Aperture-devel mailing list
Grant Ingersoll


DI Leo Sauermann 

Deutsches Forschungszentrum fuer 
Kuenstliche Intelligenz DFKI GmbH
Trippstadter Strasse 122
P.O. Box 2080           Fon:   +49 631 20575-116
D-67663 Kaiserslautern  Fax:   +49 631 20575-102
Germany                 Mail:

Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Chairman)
Dr. Walter Olthoff
Chairman of the Supervisory Board:
Prof. Dr. h.c. Hans A. Aukes
District Court (Amtsgericht) Kaiserslautern, HRB 2313


Grant Ingersoll