Hi Grant,

Before you continue:
have you looked at our experimental Aperture LuceneHandler?
https://aperture.svn.sourceforge.net/svnroot/aperture/trunk/aperture-addons/src/java/org/semanticdesktop/aperture/lucenehandler/LuceneHandler.java
https://aperture.svn.sourceforge.net/svnroot/aperture/trunk/aperture-addons/src/java/org/semanticdesktop/aperture/lucenehandler/LuceneCrawlerExample.java
(By the way, we switched to SVN.)
Perhaps it's exactly what you need, and if not, perhaps you could improve or extend it?

We use RDF2Go, which is briefly documented here:
http://wiki.ontoworld.org/wiki/RDF2Go
http://wiki.ontoworld.org/wiki/Semweb4j/models/step2

best
Leo


It was Grant Ingersoll who said at the right time 23.07.2007 20:15 the following words:
This doesn't return any results for me.  When I do
RDFContainer metadata = ...

...//Add stuff to the model

metadata.getModel().writeTo(new PrintWriter(System.out));

it prints out the properties, but when I call the method below it
doesn't return any results.  So I believe there are entries in my
model; I'm just not sure how to get at them.
Is there a good primer somewhere on how all this RDF stuff works?
All I really want to do is get the metadata and the content out for
indexing in Lucene and for use in other applications.

Is there any way to interrogate the Extractor to see what properties
it could produce?

Thanks so much for all your help,
Grant


On Jul 23, 2007, at 10:20 AM, Antoni Mylka wrote:

  
Grant Ingersoll wrote:
    
http://aperture.sourceforge.net/tutorial/rdf.html covers how to put
things into the RDF container, can someone point me to a tutorial or
docs on how to get stuff out of the Container?  Where can I get a
collection of all the URIs that are in the RDFContainer after
extraction has taken place?

      
You could do it like this:

     private static List<URI> getPropertyURIs(RDFContainer container) {
         List<URI> result = new LinkedList<URI>();
         Model model = container.getModel();
         String uriString = container.getDescribedUri().toSPARQL();
         String query =
             "SELECT DISTINCT ?property WHERE { " +
              uriString + " ?property ?o }";
         QueryResultTable queryResult = null;
         ClosableIterator<QueryRow> iterator = null;
         try {
             queryResult = model.sparqlSelect(query);
             iterator = queryResult.iterator();
             while (iterator.hasNext()) {
                 QueryRow row = iterator.next();
                 URI uri = (URI)row.getValue("property");
                 result.add(uri);
             }
         } catch (Exception e) {
             e.printStackTrace();
         } finally {
             if (iterator != null) {iterator.close();}
         }
         return result;
     }
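
To show what the SELECT DISTINCT query above is asking for without
needing RDF2Go on the classpath, here is a minimal plain-Java sketch.
The PropertyLister class and its Triple type are hypothetical stand-ins
invented for this illustration; they are not part of Aperture or RDF2Go,
and subjects and predicates are plain strings rather than URI objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not an Aperture or RDF2Go class.
final class PropertyLister {

    /** Minimal stand-in for one RDF statement (subject, predicate, object). */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /**
     * Returns every predicate that occurs with the given subject, once each,
     * in first-seen order. This mirrors the intent of
     * SELECT DISTINCT ?property WHERE { <subject> ?property ?o }.
     */
    static List<String> distinctProperties(List<Triple> triples, String subject) {
        Set<String> seen = new LinkedHashSet<String>();
        for (Triple t : triples) {
            if (t.subject.equals(subject)) {
                seen.add(t.predicate); // the set drops repeated predicates
            }
        }
        return new ArrayList<String>(seen);
    }
}
```

Note that the result is empty when the subject string does not match
any statement exactly, which is also the first thing worth checking if
the real query returns nothing.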

Aperture components (in the current trunk) use the Aperture Data
Ontology, augmented by elements of Dublin Core. Some components also use
the VCARD, ICAL and ICALTZD ontologies (OutlookCrawler crawls calendar
events in ICAL, whereas IcalCrawler crawls elements in ICALTZD). All of
them can be viewed here:

<https://aperture.svn.sourceforge.net/svnroot/aperture/trunk/aperture/doc/ontology/>

Each Aperture Data Object is represented as an instance of a DataObject
class. Such an instance can have any number of properties from the
above-mentioned vocabularies. At the moment there is no detailed
documentation of what kind of RDF structures each Aperture component
produces. The important properties that are often present are dc:title,
dc:subject and aperture:fullText, but you'd need to take a look at the
ontologies, the source code, or the output of some example programs to
get a more detailed picture.
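
Since the goal in this thread is to get metadata and full text into an
index, one possible way to handle such properties is to flatten
property/value pairs into named fields. The sketch below is plain Java
and makes several assumptions: the IndexFieldSketch class is
hypothetical, property URIs and values are passed as plain strings, and
the field name is simply the local name of the property URI. No Lucene
or Aperture call is shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not an Aperture, RDF2Go or Lucene class.
final class IndexFieldSketch {

    /** Returns the part of a property URI after the last '#' or '/'. */
    static String localName(String propertyUri) {
        int cut = Math.max(propertyUri.lastIndexOf('#'), propertyUri.lastIndexOf('/'));
        return propertyUri.substring(cut + 1);
    }

    /**
     * Groups values by field name, keeping insertion order.
     * Each input element is a two-element array: {propertyUri, value}.
     */
    static Map<String, List<String>> toFields(List<String[]> propertyValuePairs) {
        Map<String, List<String>> fields = new LinkedHashMap<String, List<String>>();
        for (String[] pair : propertyValuePairs) {
            String name = localName(pair[0]);
            List<String> values = fields.get(name);
            if (values == null) {
                values = new ArrayList<String>();
                fields.put(name, values);
            }
            values.add(pair[1]); // multi-valued properties keep all values
        }
        return fields;
    }
}
```

Using only the local name keeps field names short, but two vocabularies
that share a local name would end up in the same field; using the full
property URI as the field name avoids that at the cost of longer names.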

I'm currently working on a major overhaul of the ontology architecture.
In the Nepomuk Social Semantic Desktop project
(http://nepomuk.semanticdesktop.org) a unified ontology framework for
desktop resources has been developed. You can take a look at the current
draft at

<http://dev.nepomuk.semanticdesktop.org/repos/trunk/ontologies/nie/htmldocs/nie.html>

This migration is being done in the NIEIntegrationBranch. When it is
complete (hopefully this week), the RDF structures generated by Aperture
will be quite well documented in the NIE specification and in the
Javadoc of the appropriate components.

Antoni Mylka
antoni.mylka@dfki.de

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Aperture-devel mailing list
Aperture-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aperture-devel
    

------------------------------------------------------
Grant Ingersoll
http://www.grantingersoll.com/
http://lucene.grantingersoll.com
http://www.paperoftheweek.com/



  


-- 
____________________________________________________
DI Leo Sauermann       http://www.dfki.de/~sauermann 

Deutsches Forschungszentrum fuer 
Kuenstliche Intelligenz DFKI GmbH
Trippstadter Strasse 122
P.O. Box 2080           Fon:   +49 631 20575-116
D-67663 Kaiserslautern  Fax:   +49 631 20575-102
Germany                 Mail:  leo.sauermann@dfki.de

Geschaeftsfuehrung:
Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
____________________________________________________