On 31/10/11 19:17, Samuel Lampa wrote:
> On 10/31/2011 07:55 PM, Markus Krötzsch wrote:
>> On 31/10/11 18:13, Samuel Lampa wrote:
>>> On 10/31/2011 06:52 PM, Samuel Lampa wrote:
>>>> === Q2: Status of SMWData/SMWDataItem as API? ===
>>>> Also I wondered what status the SMWData/SMWDataItem classes are
>>>> to have, as a general API? ... Are they the supposed API, or is SMW
>>>> going towards preferring to talk SPARQL with all extensions ... or even
>>>> I ask this since it does not seem clear that I will really *need* to use
>>>> the SMWData/SMWDataItem combo as a representation, if I do the wiki
>>>> updates either with the Wiki Object Model extension or an own writer
>>>> I would still prefer to use it, if it is pushed as a preferred API for
>>>> these kinds of things, but I wondered whether that is so for the
>>>> foreseeable future?
>>> What makes me wonder is that we're basically talking about
>>> two slightly different (though very much overlapping) representations:
>>> RDF (as represented by the SMWExpElement-related classes), and Semantic
>>> MediaWiki facts (as represented by SMWData/SMWDataItem).
>>> My problem, in the context of RDFIO, is that it seems I actually need
>>> both of these to capture the information from both worlds ... since:
>>> a. I need to store the URIs, which only the SMWExpElement classes do
>>> b. I need to store the wiki page titles that I choose to use (as part of
>>> RDFIO's algorithm), which only the SMWData/SMWDataItem combo does.
>>> ... thus it seems there are at least two options:
>>> 1. RDFIO creates its own, more general data container, which wraps both
>>> the SMWData/SMWDataItem one and the RDF one (possibly both the
>>> SMWExpElement one and ARC2's data structures), with built-in converters
>>> between all of these,
>>> 2. SMWData/SMWDataItem classes are updated to contain the "Original
>>> URI", and then this format will be the only one needed, in addition to
>>> possibly the ARC2 format, just for making use of its parsers.
>>> Number one is the one I've been pondering so far ... I just wanted to
>>> point out this now and ask whether there would be any interest in
>>> storing also the original URI directly in the SMWData/SMWDataItem
>>> classes ... (which would not need to be required, for data that has no
>>> counterpart in the outside world, though ... or maybe can just be
>>> prefilled with the URIResolver URIs ... this maybe on-the-fly, in a
>>> getter method)?
>>> ... it seems that would make the SMWData/SMWDI combo more general, and
>>> of course would make RDFIO add a lot less overhead :")
>>> (I know we discussed this on SMWCon already, but these things weren't
>>> really that clear to me then, regarding the partial but not complete
>>> overlap between RDF and SMW data representations ... so wanted to point
>>> it out ... )
>> I suggest going for (1) if you need the full information in one object.
>> You should think of SMW data items as small and simple "values", similar
>> to an integer or a char in a programming language. They should be used
>> like constants of datatypes. They should only be used for storing data,
>> not for converting data or for augmenting it. They are pure data and
>> know nothing about HTML, wikitext or RDF. [Exception: the SMWDIContainer
>> type is a placeholder for compound data; it is not really considered as
>> an atomic value in SMW but just used for transporting compound data in
>> the API]
>> With this view in mind, making an object that holds a URI and a dataitem
>> does not seem a bad idea (like making an object that holds an integer
>> and a string).
>> Alternatively, you could of course represent URIs in an SMW data item as
>> well and relate them to a wiki page with a property, stored together in an
> Ok, many thanks for the feedback!
> The suggestions sound reasonable - keeping in line with the modelling
> approach already taken.
> The only small caveat I'd like to add is that the decision to keep
> data objects atomic makes them follow the Anemic Domain Model
> antipattern [1] a bit. But that is of course a question about the model
> design approach overall, and not only this specific case - that is,
> whether one wants to follow Domain-Driven Design patterns [2] or not.
Reading [1], I think there is a misunderstanding in the way you seem to
apply this text to SMW (probably due to my ill-chosen examples of
property and wiki page out of all dataitems). The text states that
domain specific behaviour of domain objects should be implemented in the
classes that represent the objects. This is what we do. Our domain
objects are strings, numbers, geographic coordinates. This is the very
data that we want to manage in SMW, it just happens to be rather atomic,
simple and (application) domain independent. Note that we do not
artificially try to abstract or simplify the objects to get this
representation -- these simple concepts are really the kinds of things
that SMW users deal with.
Yet we include all related code into the objects whenever such code is
needed. For example, you can have a look at SMWDITime to see a lot of
calendar/date specific code. We could also have similar methods for
strings (e.g., substring computation) and for numbers (e.g., for
rounding) but this was not necessary so far. Our data items do not
include parsing/rendering functions that are specific to syntactic
formats like HTML, wikitext, JSON, RDF, SQL, ... which I think is good
(and established) design (you don't mix all parsing/serialisation code
into one class).
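
To illustrate the split described above, here is a minimal sketch. It is
written in Python rather than SMW's PHP, and all names in it (TimeItem,
to_json, to_wikitext) are hypothetical and not part of the SMW API. The
calendar-specific behaviour lives on the data item, while format-specific
serialisation lives in separate functions that only use its public fields.

```python
import json
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TimeItem:
    """Hypothetical stand-in for a data item such as SMWDITime: pure data."""
    year: int
    month: int
    day: int

    def iso_weekday(self) -> int:
        # Domain-specific (calendar) behaviour belongs to the data item.
        # Returns 1 for Monday through 7 for Sunday.
        return date(self.year, self.month, self.day).isoweekday()


def to_json(item: TimeItem) -> str:
    # Format-specific code is kept outside the data item class.
    return json.dumps({"year": item.year, "month": item.month, "day": item.day})


def to_wikitext(item: TimeItem) -> str:
    # Another format, again without touching TimeItem itself.
    return f"{item.year:04d}-{item.month:02d}-{item.day:02d}"
```

With this layout, TimeItem knows about calendars but nothing about JSON or
wikitext, which is roughly the division of labour I have in mind.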
The big fallacy of [1] is to suggest that "object code" must always be
much larger than "application/service code". If taken too seriously, this
could lead to a design that tries to merge all functionality into a few
objects, thus contradicting the fundamental programming paradigm of
separation of concerns. For example, SMW used to have HTML rendering and
RDF serialisation methods for data in a single class, in spite of the
fact that these functions are not at all related but merely work on the
same input data.
This earlier design of SMW has also undermined another important idea of
OO design: the definition of clear interfaces with limited visibility.
The code for parsing, rendering, representation and serialisation used
to have full access to all internal fields of the objects. Before the
introduction of data items, it was quite unclear for some objects where
the data is actually stored (there were multiple redundant/overlapping
internal representations, sometimes optional, to reflect the internal
state of the object; all code would directly read/write to any of the
A third main reason for keeping single objects small is that SMW is
meant to be extensible. If each new storage backend or display format
relied on adding code to domain object classes, it would be very
hard to extend the system.
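
The extensibility point can be sketched as follows, again in Python and
with purely hypothetical names (NumberItem, register_format, serialize)
that do not correspond to actual SMW classes or hooks. A new display format
is added by registering a function, and the data item class stays unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class NumberItem:
    """Hypothetical data item: holds a value and nothing else."""
    value: float


Serializer = Callable[[NumberItem], str]

# Registry of output formats; extensions add entries here instead of
# adding methods to NumberItem.
_serializers: Dict[str, Serializer] = {}


def register_format(name: str, serializer: Serializer) -> None:
    _serializers[name] = serializer


def serialize(item: NumberItem, fmt: str) -> str:
    if fmt not in _serializers:
        raise ValueError(f"unknown format: {fmt}")
    return _serializers[fmt](item)


# Formats shipped with the (hypothetical) core.
register_format("wikitext", lambda item: str(item.value))
register_format("html", lambda item: f'<span class="number">{item.value}</span>')
```

An extension could then call register_format("csv", ...) on its own, without
any change to NumberItem.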
Overall, I still think that SMW follows most of the guidelines of
Domain-Driven Design but for a domain (data management) that is very
different from what the author of [1] had in mind. Another special
observation about SMW is that most of our "business logic" is related to
parsing and serialisation -- tasks that should normally be separated
from the data that they work on. But maybe one has to take a step back
and ask what the "domain layer" and "application layer" in SMW really
are to compare it to the DDD idea. :-)
> ... so for the moment I'm happy to follow the existing model design
> approach :)
> // Samuel
> [1] http://martinfowler.com/bliki/AnemicDomainModel.html
> [2] http://en.wikipedia.org/wiki/Domain-driven_design