2008/7/28 Markus Krötzsch <markus@semantic-mediawiki.org>:
> After returning from Wikimania in Alexandria, I can contribute my five
> piastres to close this issue:
>
> As was rightly remarked, SMW does not define the sorting behaviour if a
> property has many values (or no value). This is documented. If you want to
> define one special value (the first or whatever) for this purpose, then you
> just need to make a property for that task and give it only one value per
> page. Using the new sortkeys can also help in some cases.
>
> The JavaScript live sorting of tables acts on the text as displayed, hence
> depends on the order of displayed values. One could make that alphabetical if
> considered useful. In general, we would like to have some more parameters for
> printouts (e.g. to set a limit on how many values of a multi-valued property
> should be given). We currently lack some smart syntax for doing this.
>
> In the early days of SMW, we have also had one property value per column and
> line (similar to what Mov GP 0 suggested). But this has funny effects if
> there are many such columns, since you get all combinations of values -- just
> as a proper DB is supposed to do it. This was not very useful and hence has
> been dropped.
>
> Another reason to keep sorting options low is performance. If you multiply
> columns for each value given in certain columns, then you get much larger
> result sets to handle (this was one reason early SMWs were slower on
> querying), and result caching as scheduled for next release would probably be
> less effective or more difficult.
>
> So, to make long things short, we do not intend to extend the sorting options
> anytime soon. I also feel that the multi-property sorting is already quite
> complex; think of all those poor, not so technically minded users that must
> grok all that!

Sorry to dig up this old thread, but I was searching back over the list to see if the behaviour that I ran into was documented.

I have several pages, each with several properties, some of which have multiple values. When I #ask for pages (default tabular output), I get one row of data per page, with multiple values of a property stacked on separate lines within one cell. However, when I sort on a property that can have more than one value, some pages turn up multiple times in the result: a page occurs once in the table for each distinct value of the property, at the position that value's sort order dictates.

Because that's a bit cryptic, here is an example:

Sorting on ID (or without sorting):
+----+-----------+-----------+
| ID | Property1 | Property2 |
+----+-----------+-----------+
|  A |         P |         X |
+----+-----------+-----------+
|  B |         Q |         W |
|    |           |         Y |
+----+-----------+-----------+
|  C |         R |         Z |
+----+-----------+-----------+
...

Sorting on Property2:
+----+-----------+-----------+
| ID | Property1 | Property2 |
+----+-----------+-----------+
|  B |         Q |         W |
|    |           |         Y |
+----+-----------+-----------+
|  A |         P |         X |
+----+-----------+-----------+
|  B |         Q |         W |
|    |           |         Y |
+----+-----------+-----------+
|  C |         R |         Z |
+----+-----------+-----------+
...


Is this now the agreed correct behaviour? It seems reasonable, but the above discussion was never resolved, so I thought I'd ask.

At first I found this behaviour confusing, but actually it fits what I need quite well. The only slightly annoying thing is that every copy of the duplicated row still lists all values of the property being sorted on (in an arbitrary order). If you go the whole hog and duplicate the row, I'd rather see something like this:

Sorting on Property2:
+----+-----------+-----------+
| ID | Property1 | Property2 |
+----+-----------+-----------+
|  B |         Q |         W |
+----+-----------+-----------+
|  A |         P |         X |
+----+-----------+-----------+
|  B |         Q |         Y |
+----+-----------+-----------+
|  C |         R |         Z |
+----+-----------+-----------+
...


(It's just a bit neater.)
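
To make the difference concrete, here is a rough Python sketch of the two behaviours. It is only an illustration: the `pages` dict and the functions `rows_observed` and `rows_preferred` are names I made up for this mail and have nothing to do with SMW's actual code. Pages without any value for the sort property are simply dropped in this sketch, since SMW does not define where they should go.

```python
# Hypothetical data mirroring the example tables above (not SMW's data model).
pages = {
    "A": {"Property1": ["P"], "Property2": ["X"]},
    "B": {"Property1": ["Q"], "Property2": ["W", "Y"]},
    "C": {"Property1": ["R"], "Property2": ["Z"]},
}


def _sorted_keys(pages, sort_prop):
    """Return (value, page_id) pairs, one per value of sort_prop, sorted."""
    keyed = [
        (value, page_id)
        for page_id, props in pages.items()
        for value in props.get(sort_prop, [])
    ]
    keyed.sort()
    return keyed


def rows_observed(pages, sort_prop):
    """What I see now: one row per sort value, each row showing ALL values."""
    return [
        (page_id, dict(pages[page_id]))
        for _, page_id in _sorted_keys(pages, sort_prop)
    ]


def rows_preferred(pages, sort_prop):
    """What I'd rather see: each duplicated row shows only its own sort value."""
    return [
        (page_id, {**pages[page_id], sort_prop: [value]})
        for value, page_id in _sorted_keys(pages, sort_prop)
    ]


if __name__ == "__main__":
    for label, fn in (("observed", rows_observed), ("preferred", rows_preferred)):
        print(label)
        for page_id, cells in fn(pages, "Property2"):
            print(" ", page_id, cells["Property1"], cells["Property2"])
```

Both functions produce the same row order (B, A, B, C when sorting on Property2); they differ only in what the Property2 cell of each duplicated row contains.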


Thanks,
Dan.


> Cheers,
>
> Markus
>
>
> On Tuesday, 1 July 2008, Jon Lang wrote:
>> Mov GP 0 wrote:
>> > Hello,
>> > I think the problem with sorting this is that the lines are not
>> > atomic. Instead of having a table of the form
>> >
>> > |-
>> > | Property1 || Property2.1, Property2.2, Property2.3, Property2.4
>> > |-
>> >
>> > the output should be rather
>> >
>> > |-
>> > | Property1 || Property2.1
>> > |-
>> > | Property1 || Property2.2
>> > |-
>> > | Property1 || Property2.3
>> > |-
>> > | Property1 || Property2.4
>> > |-
>> >
>> > this would allow proper sorting. To not break anything, I suggest a
>> > new parameter, e.g. called "group":
>> >
>> > {{#ask:
>> >  [[Author::+]]
>> >
>> >  |?Author
>> >  |sort=Author
>> >  |group= false
>> >
>> > }}
>> >
>> > or, more familiar from SQL, "groupby":
>> >
>> > {{#ask:
>> >  [[Author::+]]
>> >
>> >  |?Author
>> >  |sort=Author
>> >  |groupby= Author
>> >
>> > }}
>> >
>> > This syntax could resolve the sorting problem.
>>
>> This does not resolve the sorting problem, since you're still left
>> with the question of how to handle sorting when grouping multiple
>> values into a single entry.  As well, it opens a new can of worms in
>> bringing up the question of whether a given page should be reported
>> once, or once for every value that it has in a given property.  It's a
>> fascinating question that deserves discussion; but it has consequences
>> well beyond the issue of sorting.
>>
>> For instance, take the following query:
>>
>>   {{#ask:
>>   [[Category:Book]]
>>
>>   |  sort=Author
>>
>>   }}
>>
>> Note that this query does not display the author for each book found;
>> indeed, it doesn't even guarantee that a given book will name the
>> author(s).  Note also that I did not include any sort of grouping or
>> degrouping parameter.  What sort of result should this query produce?
>>
>> As written, I believe that it should list each book on the wiki
>> exactly once.  The question at hand is the order in which the books
>> should be presented.  There are actually two issues at hand here: what
>> to do with multiple values of a property on a page, and what to do
>> with the absence of values for a property on a page.  I've already
>> stated my proposal for resolving the first issue; for the second
>> issue, pages without the sorted property should probably be listed
>> after pages with it, unless you explicitly state otherwise.
>>
>> --
>>
>> Now, let's look at "grouping":
>>
>>   {{#ask:
>>   [[Category:Book]]
>>
>>   | ?Author
>>   | duplicate=Author
>>
>>   }}
>>
>> My proposal here is that "duplicate" causes the page to show up in the
>> results once if it has zero or one Author, and once per Author if it
>> has more than one, treating each entry as if it only had one Author.
>> Note that I'm not sorting by Author in this query; you don't have to
>> sort by a property in order to duplicate on it.  Conversely, while I
>> _am_ having it list the Author for each result, you don't have to do
>> that, either.  This is why I say that the subject should be addressed
>> separately: it's largely orthogonal to the sorting problem, with the
>> sole exception that for the case where you're willing to duplicate on
>> the property that you're sorting on, the multiple values issue goes
>> away.
>
>
>
> --
> Markus Krötzsch
> Semantic MediaWiki    http://semantic-mediawiki.org
> http://korrekt.org    markus@semantic-mediawiki.org
>