From: Grant I. <gra...@gm...> - 2007-07-25 17:27:39
Thanks, Leo, I will take a look at these resources. I am doing more than just Lucene, so I will most likely want a persistent store that I can access to feed to various libraries, Lucene being one of them. RDF can do that, right?

I guess I was expecting the RDF stuff to be more behind the scenes in Aperture. I know you guys are coming from the Semantic Web world, so it makes sense for what you are doing, but for someone like me who just needs a single interface to the various extraction libraries, I don't really want to worry about the underlying implementation and setup. I think, however, that I can handle this by wrapping the appropriate pieces for my application and making it transparent to my users.

Cheers,
Grant

On Jul 24, 2007, at 10:43 AM, Leo Sauermann wrote:

> Hi Grant,
>
> hey, before you continue:
> did you look at our experimental Aperture LuceneHandler?
> https://aperture.svn.sourceforge.net/svnroot/aperture/trunk/aperture-addons/src/java/org/semanticdesktop/aperture/lucenehandler/LuceneHandler.java
> https://aperture.svn.sourceforge.net/svnroot/aperture/trunk/aperture-addons/src/java/org/semanticdesktop/aperture/lucenehandler/LuceneCrawlerExample.java
> (btw: we switched to SVN)
> perhaps it's exactly what you need, and if not, perhaps you could
> improve/extend it?
>
> we use RDF2Go, which is slightly documented here:
> http://wiki.ontoworld.org/wiki/RDF2Go
> http://wiki.ontoworld.org/wiki/Semweb4j/models/step2
>
> best
> Leo
>
> It was Grant Ingersoll who said at the right time 23.07.2007 20:15
> the following words:
>> This doesn't return any results for me. When I do
>>
>>     RDFContainer metadata = ...
>>
>>     ...//Add stuff to the model
>>
>>     metadata.getModel().writeTo(new PrintWriter(System.out));
>>
>> it prints out the properties, but when I call the method below it
>> doesn't return any results. So, I believe there are entries in my
>> model, just not sure how to get at them.
>>
>> Is there a good primer somewhere on how all this RDF stuff works?
>> All I really want to be able to do is get out the metadata and the
>> content for indexing in Lucene and use in other applications.
>>
>> Is there any way to interrogate the Extractor to see what properties
>> it could produce?
>>
>> Thanks so much for all your help,
>> Grant
>>
>> On Jul 23, 2007, at 10:20 AM, Antoni Mylka wrote:
>>
>>> Grant Ingersoll wrote:
>>>
>>>> http://aperture.sourceforge.net/tutorial/rdf.html covers how to put
>>>> things into the RDF container, can someone point me to a tutorial or
>>>> docs on how to get stuff out of the Container? Where can I get a
>>>> collection of all the URIs that are in the RDFContainer after
>>>> extraction has taken place?
>>>
>>> You could do it like this
>>>
>>>     private static List<URI> getPropertyURIs(RDFContainer container) {
>>>         List<URI> result = new LinkedList<URI>();
>>>         Model model = container.getModel();
>>>         // the described resource, rendered in SPARQL syntax
>>>         String uriString = container.getDescribedUri().toSPARQL();
>>>         // every distinct property that has the described URI as subject
>>>         String query =
>>>             "SELECT DISTINCT ?property WHERE { " +
>>>             uriString + " ?property ?o }";
>>>         QueryResultTable queryResult = null;
>>>         ClosableIterator<QueryRow> iterator = null;
>>>         try {
>>>             queryResult = model.sparqlSelect(query);
>>>             iterator = queryResult.iterator();
>>>             while (iterator.hasNext()) {
>>>                 QueryRow row = iterator.next();
>>>                 URI uri = (URI)row.getValue("property");
>>>                 result.add(uri);
>>>             }
>>>         } catch (Exception e) {
>>>             e.printStackTrace();
>>>         } finally {
>>>             if (iterator != null) {iterator.close();}
>>>         }
>>>         return result;
>>>     }
>>>
>>> Aperture components (in the current trunk) use the Aperture Data
>>> Ontology, augmented by elements of Dublin Core. Some components also
>>> use the VCARD, ICAL and ICALTZD ontologies. (OutlookCrawler crawls
>>> calendar events in ICAL whereas the IcalCrawler crawls elements in
>>> ICALTZD.) All of them can be viewed here:
>>>
>>> <https://aperture.svn.sourceforge.net/svnroot/aperture/trunk/aperture/doc/ontology/>
>>>
>>> Each Aperture Data Object is represented as an instance of a
>>> DataObject class. Such an instance can have any number of properties
>>> from the above-mentioned vocabularies. At the moment there is no
>>> detailed documentation about what kind of RDF structures each
>>> Aperture component produces. The important properties that are often
>>> present are dc:title, dc:subject and aperture:fullText, but you'd
>>> need to take a look at the ontologies, the source code or analyze
>>> the output of some example programs to get a more detailed picture.
>>>
>>> I'm currently working on a major overhaul of the ontology
>>> architecture. In the Nepomuk Social Semantic Desktop project
>>> (http://nepomuk.semanticdesktop.org) a unified ontology framework
>>> for desktop resources has been developed. You can take a look at the
>>> current draft at
>>>
>>> <http://dev.nepomuk.semanticdesktop.org/repos/trunk/ontologies/nie/htmldocs/nie.html>
>>>
>>> This migration is being done in the NIEIntegrationBranch. When it is
>>> complete (hopefully this week), the RDF structures generated by
>>> Aperture will be quite well documented in the NIE specification and
>>> in the Javadoc of appropriate components.
>>>
>>> Antoni Mylka
>>> ant...@df...
>>>
>>> _______________________________________________
>>> Aperture-devel mailing list
>>> Ape...@li...
>>> https://lists.sourceforge.net/lists/listinfo/aperture-devel
>>
>> ------------------------------------------------------
>> Grant Ingersoll
>> http://www.grantingersoll.com/
>> http://lucene.grantingersoll.com
>> http://www.paperoftheweek.com/
>
> --
> ____________________________________________________
> DI Leo Sauermann       http://www.dfki.de/~sauermann
>
> Deutsches Forschungszentrum fuer
> Kuenstliche Intelligenz DFKI GmbH
> Trippstadter Strasse 122
> P.O. Box 2080          Fon:  +49 631 20575-116
> D-67663 Kaiserslautern Fax:  +49 631 20575-102
> Germany                Mail: leo...@df...
>
> Management:
> Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Chairman)
> Dr. Walter Olthoff
> Chairman of the Supervisory Board:
> Prof. Dr. h.c. Hans A. Aukes
> District Court Kaiserslautern, HRB 2313
> ____________________________________________________

------------------------------------------------------
Grant Ingersoll
http://www.grantingersoll.com/
http://lucene.grantingersoll.com
http://www.paperoftheweek.com/
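
Grant's plan of wrapping the RDF pieces so that users never see them could take roughly the following shape. This is only a sketch under stated assumptions: PlainMetadata and its methods are hypothetical names, not part of Aperture or RDF2Go, and it assumes the caller has already copied each property URI and value out of the RDFContainer as plain strings (for example from the rows of a SPARQL query like the one Antoni posted).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical RDF-free view of the metadata extracted for one document. */
final class PlainMetadata {
    // property URI -> values, kept in the order the properties were first seen
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Records one (property, value) pair, e.g. copied from a query row. */
    void add(String propertyUri, String value) {
        values.computeIfAbsent(propertyUri, k -> new ArrayList<>()).add(value);
    }

    /** Returns all property URIs recorded so far. */
    List<String> properties() {
        return new ArrayList<>(values.keySet());
    }

    /** Returns the values of one property, or an empty list if it is absent. */
    List<String> get(String propertyUri) {
        return values.getOrDefault(propertyUri, Collections.<String>emptyList());
    }
}
```

An indexer or any other consumer could then iterate over properties() and handle one value at a time without touching an RDF API.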