Hello Chris, Gunnar,

just a reminder: I am collecting the arguments from our various discussions on the gnowsis wiki here:
http://gnowsis.opendfki.de/wiki/ApertureDiscussion

And so it came to pass that Christiaan Fluit, at the right time on 11.08.2006 16:08, wrote the following:
Gunnar Aastrand Grimnes wrote:
  
Currently aperture uses Sesame2 for two things:

* the RDF Model classes: URI, Literal, Resource, etc.
* the RDFContainer, although we pretend this is a general class, more 
often than not I find myself casting it to the SesameRDFContainer.

Leo and I propose that we change Aperture to use RDF2Go [1] instead 
for these two tasks. RDF2Go is an abstraction over RDF stores, and 
provides bindings for Sesame1, Sesame2, Jena, etc.
    

Good to have this discussion now, we need to finalize this part of the 
API before we can reach a beta or final status.
  
Yes. That's why we want to discuss it here now, so that we reach a common view of the problem and the possible solutions.

The same applies by the way to the RDF namespaces that we use.
  
Yes, see also the wiki page, I wrote some notes there:
http://gnowsis.opendfki.de/wiki/ApertureDiscussion

Chris, you should have an account there to edit.

  
We think such a change would have the following benefits:

* It would make aperture genuinely RDF API agnostic.
    

Just for completeness: the Sesame guys have argued before that Sesame 
already contains such a storage-agnostic API, as the Repository class 
does not make any assumptions on where and how the information is 
stored. People have been able to put a Repository on top of a Jena 
model, for example.
  
We wrote a library to wrap Sesame inside Jena graphs. More below.

I on the other hand really liked the simplicity of RDFContainer during 
Aperture development, all the details regarding transactions, contexts 
etc. being neatly taken care of in the RDFContainer implementation.
  
Yes, RDFContainers should remain as the simple option, and I would not change their interface much.
Only the URIs passed in/out would be of type org.ontoware.rdf2go.URI (or whatever we can come up with for a typed URI there).

So I see the RDFContainer as an abstraction that says:
I wrap one resource and its properties. I also wrap a complete RDF model, but you don't see much of it.
You can read/write properties of this one resource very easily; to access the whole model you call
rdfcontainer.getModel() and get an rdf2go model.
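
To make that concrete, here is a rough sketch of such a container. All class and method names in it (Triple, SimpleModel, SketchContainer) are made up for illustration; they are neither the real RDF2Go types nor the current Aperture API, and plain Strings stand in for a typed URI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only, NOT the real RDF2Go or Aperture types.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Stand-in for an RDF2Go-style model: a plain list of triples.
final class SimpleModel {
    private final List<Triple> triples = new ArrayList<>();

    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    List<Triple> all() {
        return triples;
    }
}

// The container wraps one described resource plus the complete model behind it.
final class SketchContainer {
    private final String describedUri;
    private final SimpleModel model;

    SketchContainer(String describedUri, SimpleModel model) {
        this.describedUri = describedUri;
        this.model = model;
    }

    String getDescribedUri() {
        return describedUri;
    }

    // Simple access: the subject is always the described resource.
    void put(String property, String value) {
        model.add(describedUri, property, value);
    }

    // Returns the first value of the property on the described resource, or null.
    String getString(String property) {
        for (Triple t : model.all()) {
            if (t.subject.equals(describedUri) && t.predicate.equals(property)) {
                return t.object;
            }
        }
        return null;
    }

    // Escape hatch: everything beyond the described resource goes through the model.
    SimpleModel getModel() {
        return model;
    }
}
```

Extractors would only ever call put and getString, while an application that needs the whole graph calls getModel.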


This interface also allowed me to more easily build custom RDFContainers 
in AutoFocus, which for example do some special processing on certain 
properties, acting like a filter between Aperture and the persistent 
storage. Not impossible to do with Repositories and Sails but harder due 
to their larger complexity.

Below I will treat my thoughts about RDFContainer vs. RDF2GO's Model.

  
* Aperture would be easier to integrate into project that already have 
legacy dependencies on some RDF toolkit
    

True.

  
* This might also make it possible to add query methods to the generic 
RDFContainer, something I understood from Chris that was currently tricky?
    

It's not tricky at all! Just add a simple getStatements method. I've 
proposed this before but have met with some resistance at the DFKI side ;)
  
Yes, you are right that I, Leo, objected.
My point was and is that if we keep adding more and more methods to RDFContainer, we end up with RDF2GO.
So I decided to resist the temptation of abstracting too much here.
Making RDFContainer better would have meant writing a part of RDF2GO. I think no two open source projects should tackle exactly the same problem, and I wanted to avoid investing too much in the RDFContainer.

The problem with the current RDFContainer is that you can add arbitrary 
statements but you can only retrieve those statements that have the 
described URI as subject. The partOf property often uses this URI as 
object but you cannot retrieve it without directly accessing the 
underlying Repository.
  
Good point. Yes, I think rdf2go will allow a better way to access the underlying repository. Alternatively,
an improved version of RDFContainer, based on rdf2go, can have the "reverse_add_xy" functions that you need to manipulate these reverse properties.
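
A sketch of what I mean by such reverse functions, under the assumption that the model offers a pattern query with wildcards. The names Stmt, PatternModel, BacklinkContainer, addReverse and incoming are invented for this example and exist in neither Aperture nor RDF2Go.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration, not part of Aperture or RDF2Go.
final class Stmt {
    final String subject;
    final String predicate;
    final String object;

    Stmt(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

final class PatternModel {
    private final List<Stmt> stmts = new ArrayList<>();

    void add(String subject, String predicate, String object) {
        stmts.add(new Stmt(subject, predicate, object));
    }

    // Pattern query: a null argument matches anything in that position.
    List<Stmt> find(String subject, String predicate, String object) {
        List<Stmt> result = new ArrayList<>();
        for (Stmt s : stmts) {
            boolean matches = (subject == null || subject.equals(s.subject))
                    && (predicate == null || predicate.equals(s.predicate))
                    && (object == null || object.equals(s.object));
            if (matches) {
                result.add(s);
            }
        }
        return result;
    }
}

final class BacklinkContainer {
    private final String describedUri;
    private final PatternModel model;

    BacklinkContainer(String describedUri, PatternModel model) {
        this.describedUri = describedUri;
        this.model = model;
    }

    // Normal direction: the described URI is the subject.
    void add(String property, String value) {
        model.add(describedUri, property, value);
    }

    // Reverse direction: the described URI is the object, e.g. for partOf.
    void addReverse(String property, String subject) {
        model.add(subject, property, describedUri);
    }

    // Statements about the described resource.
    List<Stmt> outgoing() {
        return model.find(describedUri, null, null);
    }

    // Statements pointing at the described resource, the ones that are hard to reach today.
    List<Stmt> incoming() {
        return model.find(null, null, describedUri);
    }
}
```

With this, a folder container could register its children via addReverse("partOf", childUri) and read them back via incoming(), without touching the repository directly.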
  
* It would be possible to use Aperture with Sesame1, which currently 
wouldn't work since sesame1 and 2 cannot exist in the same jvm.
    

Can you explain how RDF2GO solves this and why this cannot be done with 
RDFContainer? I see this mainly as a ClassLoader-related problem, which 
can be solved by using frameworks like OSGi, it has nothing specifically 
to do with the RDFContainer API as far as I can see.
  
You cannot pass a sesame1 object into an OSGi bundle that uses sesame2 internally.
Each OSGi bundle has to declare its dependencies, using
Require-Bundle: sesame1,
 sesame2
I would assume that the bundles "sesame1" and "sesame2" would be incompatible and throw an error here.

Example: the OSGi bundle "aperture" offers this method:
public RDFModel getRDF(uri)

The bundle requires sesame2 so that the openrdf.model package is available to it, and the receiving service (let's say "gui") needs to require sesame2 as well, to read the result returned by the Java method.

  
* We were also partly motivated by the fact that Nepomuk has decided to 
base its RDF API on RDF2Go
    

Sounds like a reason to switch (although I wonder how this decision 
making has taken place ;) ).

  
Inside Nepomuk we have an urge to be open and to have a general, vendor-independent standard.
The interfaces to our stores are primarily defined by HTTP bindings.

But the Java RDF APIs involved should be simple and stable. RDF2GO doesn't change much, as it doesn't implement many features itself.

So a GUI written using the RDF2GO library can be reused when switching stores. We had this problem when switching from Jena to sesame2, and we will have it again in the future.

As a by-product of this "how should clients contact nepomuk" question, it was decided to use Rdf2Go as the basis for abstraction, together with a new project (a merge of RDFReactor and rdf2java) to generate schema-based code.
Schema-based code will then support ontology-based Java classes, like
Person p = new Person(uri, model);
p.getName();
p.setEMail("aasdf@sdfasf");

Through this door, RDF2Go sneaked in.
I, Leo, see it as a pragmatic way to avoid some problems in the future, at the price of possibly buying new ones. We will see.
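
To illustrate what such a schema-based class could look like on the inside, here is a hand-written sketch. It is not the output of RDFReactor or rdf2java; PropertyModel, the example.org property URIs and the method bodies are assumptions made up for this mail.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-valued property store standing in for an RDF model.
final class PropertyModel {
    private final Map<String, String> values = new HashMap<>();

    void set(String subject, String property, String value) {
        values.put(subject + " " + property, value);
    }

    String get(String subject, String property) {
        return values.get(subject + " " + property);
    }
}

// Sketch of a schema-based class: one Java class per ontology class,
// one getter/setter pair per property. The URIs below are invented.
final class Person {
    static final String NAME = "http://example.org/schema#name";
    static final String EMAIL = "http://example.org/schema#email";

    private final String uri;
    private final PropertyModel model;

    Person(String uri, PropertyModel model) {
        this.uri = uri;
        this.model = model;
    }

    String getName() {
        return model.get(uri, NAME);
    }

    void setName(String name) {
        model.set(uri, NAME, name);
    }

    String getEMail() {
        return model.get(uri, EMAIL);
    }

    void setEMail(String email) {
        model.set(uri, EMAIL, email);
    }
}
```

The object holds no data itself; every call reads from or writes to the model, so the same Person class works on whatever store sits behind the model.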


  
The two most obvious problems are:

1. A few days coding effort to make the changes to aperture
2. unknown effort to change any applications currently using aperture, I 
assume this would mainly affect gnowsis and autofocus?
    

True, but I wouldn't worry about that. The potentially positive effects 
of a switch outweigh the cost IMO.

Now about RDFContainer vs. RDF2GO.

Things I like about RDF2GO:

* Availability for a number of storage frameworks + the fact that we 
don't have to create and support these frameworks ourselves.

* Apparently chosen by Nepomuk, a potential large "Aperture customer", 
so let's try to keep their work as simple as possible to encourage 
cooperation.

* Full RDF querying (as in: graph walkthrough).

Things I dislike about RDF2GO:

* The use of java.net.URI. We've used this in the past with Sesame and 
ran into performance problems due to the parsing that takes place in its 
constructor. Perhaps one of the Sesame developers can add some more 
details here, I'm only repeating what I've been told.
  
We already had this discussion last year, and I have now had it repeatedly with Max Völkel.
The untyped objects in the Statement class and many other interfaces are not acceptable for good quality coding.

We want to do it as you said, probably also using the URI checkers that were recently suggested on the sesame2 mailing list.
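
One possible shape for such a typed URI, as a sketch only: the constructor does no parsing, and checking is a separate, optional step. PlainUri and isValid are hypothetical names, and using java.net.URI as the checker is just one choice; I make no claim here about how much time this actually saves.

```java
import java.net.URISyntaxException;

// Hypothetical typed URI: a thin wrapper around a String, no parsing on construction.
final class PlainUri {
    private final String value;

    PlainUri(String value) {
        if (value == null) {
            throw new NullPointerException("value");
        }
        this.value = value;
    }

    // Optional check: only callers that want validation pay for the parse.
    boolean isValid() {
        try {
            new java.net.URI(value);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof PlainUri && value.equals(((PlainUri) other).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return value;
    }
}
```

A crawler creating thousands of URIs would then skip the check, while code accepting URIs from user input could call isValid first.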

Instead of using java.net.URI, we've created the org.openrdf.model 
package (deliberately *not* called org.openrdf.sesame.model!) that has a 
very small jar and that offers its own URI, has interfaces for all model 
data structures and organizes them in a nice class hierarchy.

Is there any chance you can convince Max to use this package instead? Or 
do you actually like his approach better?
  
I like his approach better, because it is not bound to any project other than his own.
If we use the org.openrdf.model package, we are restricted to using Sesame1, aren't we?

Also, if the Sesame developers change something again, we are incompatible again and cannot use the latest CVS version of Sesame2.

As Nepomuk, we don't want dependencies on outside projects at such a core level.
In the end, it's about 7 classes with five properties each, so it's not a big redundancy anyway.
They are roughly: uri, resource, blanknode, rdfnode, literal, statement, model.

It was a shock to us to look at the CVS changes from sesame2 alpha3 to the current version,
which change the most important interface, "repository", thoroughly. (We asked Jeen and Arjohn for more details, and the reasons they have for doing this are good. We do not question the refactoring, we just want to be immune to problems here.)

I hope we can make Rdf2go very stable and "boringly unchanging".


Things I like about RDFContainer:

* It has overloaded methods, offering support for various Java types 
such as ints and Dates. Consequently:
- Extractor and Crawler implementors don't need to worry about 
converting this from/to an RDF representation,
- it puts the encoding and decoding logic in a single place and
- into the hands of the application developer that chooses a particular 
RDFContainer implementation.
How would you propose to handle this when using RDF2GO? A Decorator 
class wrapping a Model perhaps that adds overloaded methods?
  
No, I would keep the separate RDFContainer class introduced for Aperture.
But I would have liked to remove the RDFContainerFactories completely,
because they are just overhead.

But we will probably keep the RDFContainerFactories, because we need them to tell which RDF framework to use. (RDF2GO has to be configured to run on top of either sesame, jena, or yars, so the factories can decide.)

* The notion of a described URI.

* last but not least: we control it, so we can be sure it's best suited 
for what Aperture needs.
  
+1 on all of your arguments here.
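
For reference, the overloaded methods Chris describes could look roughly like this if we keep them in the container. This is a simplified sketch with invented names (TypedLiteral, OverloadedContainer), storing one value per property in a map; the choice of xsd:int and xsd:dateTime as encodings is an assumption, not what any existing RDFContainer implementation necessarily does.

```java
import java.time.Instant;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Hypothetical literal: lexical form plus datatype URI.
final class TypedLiteral {
    final String lexical;
    final String datatype;

    TypedLiteral(String lexical, String datatype) {
        this.lexical = lexical;
        this.datatype = datatype;
    }
}

// Sketch of a container whose overloads hide the Java <-> RDF conversion.
final class OverloadedContainer {
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    private final Map<String, TypedLiteral> props = new HashMap<>();

    void put(String property, String value) {
        props.put(property, new TypedLiteral(value, XSD + "string"));
    }

    void put(String property, int value) {
        props.put(property, new TypedLiteral(Integer.toString(value), XSD + "int"));
    }

    // Dates are encoded as ISO 8601 timestamps in UTC.
    void put(String property, Date value) {
        props.put(property, new TypedLiteral(value.toInstant().toString(), XSD + "dateTime"));
    }

    TypedLiteral getLiteral(String property) {
        return props.get(property);
    }

    int getInt(String property) {
        return Integer.parseInt(props.get(property).lexical);
    }

    Date getDate(String property) {
        return Date.from(Instant.parse(props.get(property).lexical));
    }
}
```

An extractor then simply calls put("pageCount", 12) or put("created", someDate), and the encoding stays in one place, with whoever picks the container implementation.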
Things I dislike about it:

* No full RDF querying (in the sense of being able to walk through the 
RDF graph).

The issues I mention can probably all be worked out one way or another 
when switching to RDF2GO, but let's first see if we really want/need to 
go that way.
  
For me the question is: either only RDF2GO or only Sesame2.

The RDFContainer methods with rdf2go would look like this:
public org.rdf2go.Model getModel();
public void setProperty(org.rdf2go.URI property, String value);

or like this using only sesame2:
public org.openrdf.model.Model getModel();
public void setProperty(org.openrdf.model.URI property, String value);

Our "theoretical" case of supporting unusual RDF stores through custom RDFContainers was never realized anyway. But if we want to realize it, using RDF2GO would be smarter, because then the tedious task of implementing a yars RDFContainer would be solved by using rdf2go-yars (yars = yet another rdf server).

Also because of the weird in-memory transaction performance issue I would be in favor of rdf2go, but I do not know yet exactly what we want.
On the other hand, we need a way to fix some of the deficiencies of RDFContainer, like setting backlinks (parents) or convenient access to the store behind it. So I see a need, but the decision has to be discussed at more length among us.

Thanks for the answers so far,
Leo



Chris
--


_______________________________________________
Aperture-devel mailing list
Aperture-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aperture-devel
  


-- 
____________________________________________________
DI Leo Sauermann       http://www.dfki.de/~sauermann 
DFKI GmbH
P.O. Box 2080          Fon:   +49 631 205-3503
67608 Kaiserslautern   Fax:   +49 631 205-3472
Germany                Mail:  leo.sauermann@dfki.de
____________________________________________________