Hi Stephan,

For the duplicate data thing: you're right that duplicate data is never a good idea; I was wrong about that. Still, in practical terms I really don't think it matters whether the duplicate data gets stored or not. Duplicate data just doesn't happen very much, if at all; and even when it happens, that's an error in data entry: SMW doesn't have any obligation to fix such errors.

(I think all of this is somewhat incidental, actually, because however the index gets stored, there should always be a way to prevent duplicates using the hash.)
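To make that concrete, here is a rough sketch in Python (not SMW's PHP, and not SMW's actual code). It assumes subobject names of the "NNN_hash" form used in Alexey's example further down; split_name and drop_duplicates are made-up helper names, purely for illustration.

```python
def split_name(name):
    """Split 'NNN_hash' into (index, hash); a name without a numeric prefix gets index None."""
    prefix, sep, rest = name.partition("_")
    if sep and prefix.isdigit():
        return int(prefix), rest
    return None, name


def drop_duplicates(names):
    """Keep the first subobject name seen for each hash, ignoring the index prefix."""
    seen = set()
    kept = []
    for name in names:
        _, content_hash = split_name(name)
        if content_hash not in seen:
            seen.add(content_hash)
            kept.append(name)
    return kept


names = [
    "001_4bd1f1b74a76de5322dd74956a71f089",
    "002_03163dfd1d2502668b00c1f521688984",
    "003_4bd1f1b74a76de5322dd74956a71f089",  # same hash as 001, so a duplicate
]
unique = drop_duplicates(names)
```

The same check would work if the index lived in a "Has index" property instead: the comparison only ever looks at the hash.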

As for whether storing the index, whether it's part of the subobject name or in a "Has index" property, changes the data model - I don't think it does. You can simply ignore the index value; I would just think of it as additional data that can either be used or not. If you didn't like my "Modification date" example, how about this: the subobject hash itself is some additional, SMW-only data that gets added to each row, but that doesn't affect the data model either.

-Yaron

On Thu, Jun 20, 2013 at 2:04 PM, Stephan Gambke <s7eph4n@gmail.com> wrote:

Hi Yaron,

I did not propose a general 'has index' property. In fact, I would strongly advise against it. Your recipe example is a good one for a case where an index does not make sense and implying one would be wrong.

For the students example: If your data model identifies students by their name alone, then again the data model is insufficient, not SMW. Basically your statement is 'John has a score of 2'. If you repeat that statement, then a natural person will tell you that you already said that. SMW will drop the second statement. If you actually want both of these statements stored, you had better think of a way to disambiguate them.
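A minimal sketch of what I mean, in Python for brevity. It is only an illustration under assumptions: I use an MD5 over the sorted property/value pairs as a stand-in for whatever SMW really hashes, and the "Student ID" property is a hypothetical disambiguator, not something from your example.

```python
import hashlib


def statement_hash(properties):
    """Hash a subobject's property/value pairs, independent of their order."""
    canonical = "|".join(f"{k}={v}" for k, v in sorted(properties.items()))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def store(subobjects):
    """Return a hash -> properties mapping; a repeated statement maps onto itself."""
    stored = {}
    for props in subobjects:
        stored[statement_hash(props)] = props
    return stored


# The same statement made twice: only one entry survives.
same = [
    {"Name": "John", "Score": "2"},
    {"Name": "John", "Score": "2"},
]

# Two different students who share a name and a score: both survive,
# because the model now says what distinguishes them.
disambiguated = [
    {"Name": "John", "Score": "2", "Student ID": "1001"},
    {"Name": "John", "Score": "2", "Student ID": "1002"},
]
```

With this, store(same) holds one entry and store(disambiguated) holds two.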

On the point of the index number being or not being a part of the data model:
That has nothing at all to do with whether it comes from user input or not. You should first build your data model. I do not say that it should not contain index numbers. If you need them, by all means include them. But do so explicitly. Don't just include them in all data models because they are useful in some cases. Then, when you have your data model, think about how to use it, e.g. how to assign those index numbers. They can for sure come from user input. They might as well be assigned automatically somehow. I don't care.

The order of the elements may be controllable by the user. But deriving the order of the elements in the model from that is wrong. When the user says he needs eggs, milk and flour, then you should not translate that into 'first eggs, second milk and finally flour'. The correct translation would be 'eggs, milk and flour, and by the way he entered eggs first, milk second and flour third'. This means you will end up with six statements: three on the ingredients and three about the statements on the ingredients. Do not mix them.
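Spelled out as a small Python sketch (the class and function names are invented for this mail and have nothing to do with SMW's internals), the six statements could look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ingredient:
    """Statement about the recipe: it needs this ingredient."""
    name: str


@dataclass(frozen=True)
class EntryOrder:
    """Statement about a statement: the position at which it was entered."""
    about: Ingredient
    position: int


def translate(user_input):
    """Turn the user's list into ingredient statements plus separate order statements."""
    ingredients = [Ingredient(name) for name in user_input]
    order = [EntryOrder(ing, pos) for pos, ing in enumerate(ingredients, start=1)]
    return ingredients, order


ingredients, order = translate(["eggs", "milk", "flour"])

# The model itself is order-free ...
model = set(ingredients)

# ... and the entry order is applied only by a consumer that asks for it.
as_entered = [o.about.name for o in sorted(order, key=lambda o: o.position)]
```

The three Ingredient objects stay the same whether or not anybody ever looks at the three EntryOrder objects, which is the whole point of keeping them apart.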

Regarding Modification date: There is quite a difference. The modification date is a statement on a subject (the wiki page) that is stored with the subject, but without modifying it. Storing the index number of a statement the way you propose _would_ modify the statement. So nothing is broken with the Modification date; with that index, though...

Cheers,
Stephan

On Jun 20, 2013 7:02 PM, "Yaron Koren" <yaron@wikiworks.com> wrote:

> Stephan - you make some good points. As far as displaying the number/index in queries - that sounds interesting, though even a separate "Has index" property might not necessarily be ideal for that. If you have two or more different kinds of #subobject calls on a page, it might not work out nicely. For instance, a recipe page might have subobjects for ingredients, and then subobjects for instructions. In that case, the ingredients might have "Has index" values of 1-10, and then the instructions might have "Has index" values of 11-15. (That's how numbering used to work in SIO.) So displaying this property might just look weird.
>
> Your second point is that the hash system lets SMW only display unique subobjects. Which is true, but (a) in my experience that's not a major issue, and (b) actually, sometimes you really do want to store duplicate data. What if you have a page of test scores, and you use #subobject to store each student's name and their score, and two students happen to have the same name and the same score? (Let's say that there aren't wiki pages for each student, which would force you to disambiguate.) Duplicate data might not be an error - it might be valid data.
>
> You seem to also make the point that, because it hasn't been entered by a user, the index/number isn't truly a part of the data model, and shouldn't be stored at all. Assuming you are in fact making that point, it's a reasonable opinion, but I disagree, for two reasons: (1) the order of elements is something that users can control, and thus, it is actually implicitly part of the data model, and (2) SMW already stores a bunch of stuff that's even less a part of the data model: the "Modification date" property, etc. You could say that two wrongs don't make a right (to use an expression), but at the very least, this wouldn't be breaking anything that's not already broken. Again, though, I'm not sure if that's what you were getting at.
>
> -Yaron
>
> On Thu, Jun 20, 2013 at 12:37 PM, Stephan Gambke <s7eph4n@gmail.com> wrote:
>>
>> Hi Yaron,
>>
>> I do not think that your approach will work.
>>
>> At first glance it seems to be an easy way out to provide sorting.
>>
>> But from a software engineering point of view it loads the identifier with information that just does not belong there. From a practical point of view it falls short if anybody wants to query that number. And finally from a semantic point of view it inseparably mixes two statements (the original one and the one about the sequence number) that the originator usually does not want to be mixed.
>>
>> This last problem, btw, is also the key to your question about the determination of the hash key. To state the same thing twice is just that: a duplicate statement, as opposed to two statements. To the best of my knowledge SMW will not store such a statement twice. Instead it will generate the hash key based on the property and value, and if that hash already exists, then the statement it represents is considered already known and the second occurrence will be dropped and not appear in any query results. I am not sure if this is also true for subjects, but it really should be.
>>
>> So, long story short: If your data model for project management does not explicitly contain the sequence number for the activities, then your model is incomplete, not SMW. In fact, should two activities be exactly the same, you will probably lose one of them.
>>
>> Cheers,
>> Stephan
>>
>> On Jun 20, 2013 5:30 PM, "Yaron Koren" <yaron@wikiworks.com> wrote:
>>>
>>> Hi Alexey,
>>>
>>> Yes, that's a good point - I actually thought about an approach like that, but forgot to include it in the email. A property called "Sort" (a name like "Has index" might be a little clearer) would solve this problem - and it would be a more "semantic" solution. On the other hand, it would add to the proliferation of special properties (for what that's worth), and it would mean a little more work for administrators to get queries of subobjects ordered correctly.
>>>
>>> I still think my original proposed solution would work fine, though I confess I don't quite understand how the subobject hashing works. Are people supposed to be able to directly link to or reference a subobject, using the hash? I don't see how that could work, given that everything about a subobject could change from one page save to the next - its order, its properties, etc. I don't see how the system could keep consistency of subobject naming.
>>>
>>> -Yaron
>>>
>>> On Thu, Jun 20, 2013 at 10:54 AM, Alexey Klimovich <god.vedmaka@gmail.com> wrote:
>>>>
>>>> Hi, Yaron!
>>>>
>>>> I think sorting subobjects is a good task, but I suggest not using the subobject
>>>> name for this, because there is a big problem with that:
>>>>
>>>> Imagine we have 3 subobjects on a page:
>>>>
>>>> Page name#001_4bd1f1b74a76de5322dd74956a71f089
>>>> Page name#002_03163dfd1d2502668b00c1f521688984
>>>> Page name#003_02dwa3j349j8d3jds3843234jd8349490
>>>>
>>>> Now we edit the page and delete subobject 002. What should happen?
>>>> Should the other subobjects be renamed to keep the sorting? What if they are
>>>> already linked from other pages/queries?
>>>>
>>>> I think a better way is to automatically attach some semantic property ("Sort",
>>>> for example)
>>>> to every subobject on the page. This property should contain the subobject's
>>>> number on the page.
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context: http://wikimedia.7.x6.nabble.com/Making-subobjects-correctly-ordered-tp5007553p5007558.html
>>>> Sent from the Semantic Mediawiki - Development mailing list archive at Nabble.com.
>>>>
>>>> ------------------------------------------------------------------------------
>>>> This SF.net email is sponsored by Windows:
>>>>
>>>> Build for Windows Store.
>>>>
>>>> http://p.sf.net/sfu/windows-dev2dev
>>>> _______________________________________________
>>>> Semediawiki-devel mailing list
>>>> Semediawiki-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
>>>
>>>
>>>
>>>
>>> --
>>> WikiWorks MediaWiki Consulting http://wikiworks.com
>>>