For the duplicate data thing: you're right that duplicate data is never a
good idea; I was wrong about that. Still, in practical terms I really don't
think it matters whether the duplicate data gets stored or not. Duplicate
data just doesn't happen very much, if at all; and even when it happens,
that's an error in data entry: SMW doesn't have any obligation to fix such
(I think all of this is somewhat incidental, actually, because however the
index gets stored, there should always be a way to prevent duplicates using
As for whether storing the index, either as part of the subobject name
or in a "Has index" property, changes the data model - I don't think it
does. You can simply ignore the index value; I would just think of it as
additional data that can either be used or not. If you didn't like my
"Modification date" example, how about this: the subobject hash itself is
some additional, SMW-only data that gets added to each row, but that
doesn't affect the data model either.
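
To make that concrete, here is a rough sketch in Python (not SMW code). It
assumes, purely for illustration, that a subobject's name is an MD5 hash of
its property/value pairs; the real hashing in SMW may well differ. The
function names, the "Test scores" page and the "Has name"/"Has score"
properties are all made up for the example.

```python
import hashlib


def subobject_id(page, row):
    """Hypothetical naming scheme: page name plus an MD5 of the sorted pairs."""
    canonical = "|".join(f"{prop}={value}" for prop, value in sorted(row.items()))
    digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()
    return f"{page}#_{digest}"


def store(page, rows):
    """Key each row by its ID, so an identical row simply overwrites itself."""
    stored = {}
    for row in rows:
        stored[subobject_id(page, row)] = row
    return stored


def query(stored, *props):
    """Return only the requested properties; anything else is ignored."""
    return sorted(tuple(row[p] for p in props) for row in stored.values())


scores = [
    {"Has name": "John", "Has score": "2"},
    {"Has name": "John", "Has score": "2"},  # identical to the row above
    {"Has name": "Mary", "Has score": "5"},
]

# Without an index, the two identical rows hash to the same name.
without_index = store("Test scores", scores)

# With an index added to each row, every row gets its own name.
indexed_rows = [{**row, "Has index": str(i)} for i, row in enumerate(scores, start=1)]
with_index = store("Test scores", indexed_rows)

print(len(without_index), len(with_index))
print(query(with_index, "Has name", "Has score"))
```

In this sketch the index only decides whether identical rows are kept apart;
a query that doesn't ask for "Has index" never sees it.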
On Thu, Jun 20, 2013 at 2:04 PM, Stephan Gambke <s7eph4n@...> wrote:
> Hi Yaron,
> I did not propose a general 'has index' property. In fact, I would
> strongly advise against it. Your recipe example is a good one for a case
> where an index does not make sense and implying one would be wrong.
> For the students example: If your data model identifies students by their
> name alone, then again the data model is insufficient, not SMW. Basically
> your statement is 'John has a score of 2'. If you repeat that statement,
> then a natural person will tell you that you already said that. SMW will
> drop the second statement. If you actually want both of these statements
> stored you had better think of a way to disambiguate.
> On the point of the index number being or not being a part of the data
> That has nothing to do at all with whether it comes from a user input or
> not. You should first build your data model. I do not say that it should not
> contain index numbers. If you need them, by all means include them. But do
> so explicitly. Don't just include them in all data models just because they
> are useful in some cases. Then, when you have your data model, think about
> how to use it. E.g. how to assign those index numbers. They can for sure
> come from a user input. They might as well be assigned somehow
> automatically. I don't care.
> The order of the elements may be controllable by the user. But deriving
> the order of the elements from that in the model is wrong. When the user
> says he needs eggs, milk and flour then you should not translate that into
> 'first eggs, second milk and finally flour'. The correct translation would
> be 'eggs, milk and flour and by the way he ordered eggs first, milk second
> and flour third'. This means you will end up with six statements - three on
> the ingredients and three on the statements on the ingredients. Do not mix
> Regarding Modification date: There is quite a difference. The modification
> date is a statement on a subject (the wiki page) that is stored with the
> subject, but without modifying it. Storing the index number of a statement
> the way you propose, _would_ modify the statement. So, nothing broken with
> the Modification date, with that index, though...
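
The "six statements" reading of the eggs, milk and flour example can be
sketched roughly like this in Python. It is only an illustration of the idea,
not how SMW stores anything: the Statement type, the "Pancakes" page and the
"Was entered at position" property are invented for the example.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str
    prop: str
    value: str


def build_statements(page, ingredients_as_entered):
    """Return (statements about the page, statements about those statements)."""
    facts = [Statement(page, "Has ingredient", name) for name in ingredients_as_entered]
    # The entry order is recorded separately, with each fact as the subject.
    about_facts = [
        Statement(f"{s.subject} / {s.prop} / {s.value}", "Was entered at position", str(pos))
        for pos, s in enumerate(facts, start=1)
    ]
    return facts, about_facts


facts, about_facts = build_statements("Pancakes", ["eggs", "milk", "flour"])
reordered_facts, reordered_about = build_statements("Pancakes", ["flour", "eggs", "milk"])

print(len(facts) + len(about_facts))       # three plus three statements
print(set(facts) == set(reordered_facts))  # the ingredient statements do not change
```

Entering the ingredients in a different order changes only the second group of
statements; the three ingredient statements stay the same.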
> On Jun 20, 2013 7:02 PM, "Yaron Koren" <yaron@...> wrote:
> > Stephan - you make some good points. As far as displaying the
> > number/index in queries - that sounds interesting, though even a separate
> > "Has index" property might not necessarily be ideal for that. If you have
> > two or more different kinds of #subobject calls on a page, it might not
> > work out nicely. For instance, a recipe page might have subobjects for
> > ingredients, and then subobjects for instructions. In that case, the
> > ingredients might have "Has index" values of 1-10, and then the
> > instructions might have "Has index" values of 11-15. (That's how numbering
> > used to work in SIO.) So displaying this property might just look weird.
> > Your second point is that the hash system lets SMW only display unique
> > subobjects. Which is true, but (a) in my experience that's not a major
> > issue, and (b) actually, sometimes you really do want to store duplicate
> > data. What if you have a page of test scores, and you use #subobject to
> > store each student's name and their score, and two students happen to have
> > the same name and the same score? (Let's say that there aren't wiki pages
> > for each student, which would force you to disambiguate.) Duplicate data
> > might not be an error - it might be valid data.
> > You seem to also make the point that, because it hasn't been entered by
> > a user, the index/number isn't truly a part of the data model, and
> > shouldn't be stored at all. Assuming you are in fact making that point,
> > it's a reasonable opinion, but I disagree, for two reasons: (1) the order
> > of elements is something that users can control, and thus, it is actually
> > implicitly part of the data model, and (2) SMW already stores a bunch of
> > stuff that's even less a part of the data model: the "Modification date"
> > property, etc. You could say that two wrongs don't make a right (to use an
> > expression), but at the very least, this wouldn't be breaking anything
> > that's not already broken. Again, though, I'm not sure if that's what you
> > were getting at.
> > -Yaron
> > On Thu, Jun 20, 2013 at 12:37 PM, Stephan Gambke <s7eph4n@...>
> >> Hi Yaron,
> >> I do not think that your approach will work.
> >> At a first glance it seems to be an easy way out to provide sorting.
> >> But from a software engineering point of view it loads the identifier
> >> with information that just does not belong there. From a practical point of
> >> view it falls short if anybody wants to query that number. And finally from
> >> a semantic point of view it inseparably mixes two statements (the original
> >> one and the one about the sequence number) that the originator usually does
> >> not want to be mixed.
> >> This last problem btw is also the key to your question about the
> >> determination of the hash key. To state the same thing twice is just that:
> >> A duplicate statement. As opposed to two statements. To my best knowledge
> >> SMW will not store such a statement twice. Instead it will generate the
> >> hash key based on the property and value and if that hash already exists,
> >> then the statement it represents is considered already known and the second
> >> occurrence will be dropped and not appear in any query results. I am not
> >> sure if this is also true for subjects, but it really should be.
> >> So, long story short: If your data model for project management does
> >> not explicitly contain the sequence number for the activities, then your
> >> model is incomplete, not SMW. In fact, should two activities be exactly the
> >> same, you will probably lose one of them.
> >> Cheers,
> >> Stephan
> >> On Jun 20, 2013 5:30 PM, "Yaron Koren" <yaron@...> wrote:
> >>> Hi Alexey,
> >>> Yes, that's a good point - I actually thought about an approach like
> >>> that, but forgot to include it in the email. A property called "Sort" (a
> >>> name like "Has index" might be a little clearer) would solve this problem -
> >>> and it would be a more "semantic" solution. On the other hand, it would add
> >>> to the proliferation of special properties (for what that's worth), and it
> >>> would mean a little more work for administrators to get queries of
> >>> subobjects ordered correctly.
> >>> I still think my original proposed solution would work fine, though I
> >>> confess I don't quite understand how the subobject hashing works. Are
> >>> people supposed to be able to directly link to or reference a subobject,
> >>> using the hash? I don't see how that could work, given that everything
> >>> about a subobject could change from one page save to the next - its order,
> >>> its properties, etc. I don't see how the system could keep consistency of
> >>> subobject naming.
> >>> -Yaron
> >>> On Thu, Jun 20, 2013 at 10:54 AM, Alexey Klimovich <god.vedmaka@...> wrote:
> >>>> Hi, Yaron!
> >>>> I think sorting subobjects is a good task, but I suggest not using the
> >>>> name for this, because there is a big problem with that:
> >>>> Imagine we have 3 subobjects on a page:
> >>>> Page name#001_4bd1f1b74a76de5322dd74956a71f089
> >>>> Page name#002_03163dfd1d2502668b00c1f521688984
> >>>> Page name#003_02dwa3j349j8d3jds3843234jd8349490
> >>>> Now we edit the page and delete subobject 002. What should happen?
> >>>> Should the other subobjects be renamed to keep the sorting? What if they
> >>>> are linked from other pages/queries?
> >>>> I think a better way is to automatically attach some semantic property
> >>>> for example)
> >>>> to every subobject on page. This property should contain subobjects
> >>>> on page.
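
The renaming concern above can be sketched like this in Python. It is a toy
model, not SMW behaviour: the MD5-of-content hash, the two naming functions and
the "Has task" property are assumptions made up for the illustration.

```python
import hashlib


def content_hash(row):
    """Hypothetical hash over the sorted property/value pairs of one subobject."""
    canonical = "|".join(f"{prop}={value}" for prop, value in sorted(row.items()))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def names_with_position_prefix(page, rows):
    """Position is baked into the name, as in 'Page name#001_<hash>'."""
    return [f"{page}#{pos:03d}_{content_hash(row)}" for pos, row in enumerate(rows, start=1)]


def names_with_index_property(page, rows):
    """Name depends on content only; position is kept as a separate value."""
    return {f"{page}#_{content_hash(row)}": pos for pos, row in enumerate(rows, start=1)}


rows = [{"Has task": "Design"}, {"Has task": "Build"}, {"Has task": "Test"}]
rows_after_delete = [rows[0], rows[2]]  # the second subobject was removed

# Position in the name: the "Test" subobject is renamed from 003_ to 002_.
before = names_with_position_prefix("Page name", rows)
after = names_with_position_prefix("Page name", rows_after_delete)
print(before[2], "->", after[1])

# Position as a property: the name is unchanged, only the stored index moves.
test_name = f"Page name#_{content_hash(rows[2])}"
index_before = names_with_index_property("Page name", rows)[test_name]
index_after = names_with_index_property("Page name", rows_after_delete)[test_name]
print(test_name, index_before, "->", index_after)
```

With the position in the name, anything that linked to the old name of the
third subobject no longer finds it after the delete; with a separate property,
the name survives and only the index value changes.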
> >>>> --
> >>>> Sent from the Semantic Mediawiki - Development mailing list archive
> >>>> at Nabble.com.
> >>>> This SF.net email is sponsored by Windows:
> >>>> Build for Windows Store.
> >>>> http://p.sf.net/sfu/windows-dev2dev
> >>>> _______________________________________________
> >>>> Semediawiki-devel mailing list
> >>>> Semediawiki-devel@...
> >>>> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
> >>> --
> >>> WikiWorks · MediaWiki Consulting · http://wikiworks.com
WikiWorks · MediaWiki Consulting · http://wikiworks.com