Order By German vowels and special characters

  • ailli

    Today I found out that Saxon seems to treat special characters such as German
    umlauts (ä, ö, ü) or French accented characters such as é in a rather unfortunate
    way when it comes to ordering. Consider the following statement:

    let $d :=
    for $i in $d//liquid
    order by $i ascending
    return $i

    This returns the following result:

    (By the way, those are the German words for milk, water and oil.)

    One (at least a German-speaking person) would expect the following result:

    The reason is that Ö may be replaced by Oe in this case (ä=ae, ö=oe, ü=ue).
    As far as I'm concerned, the French would also expect to find é somewhere
    around e in an ordered set.

    My question now is whether Saxon strictly sticks to some standard here, i.e.
    ordering by UTF-8 hex value or something like that?
    Is there a way to achieve language-based ordering by setting a parameter
    or something similar?

    Writing a filter would be rather inefficient I guess.

    If there is nothing Saxon can do about this, it might be a useful extension
    for upcoming releases.
    For example, one could configure Saxon's sorting algorithm to follow the rules
    of a particular language.

    Best Regards!

  • Michael Kay

    To get a language-sensitive collation for German in XSLT, use <xsl:sort lang="de"/>.

    To get a language-sensitive collation for German in XQuery, use

    order by $i collation "http://saxon.sf.net/collation?lang=de"

    Unfortunately the XQuery version of this is not portable across XQuery
    implementations.

    There are other parameters you can set to control sorting with more precision,
    for example whether case is significant. See


    for details.
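
    To illustrate the difference between codepoint ordering and a
    language-sensitive collation, here is a minimal sketch in plain Java. It does
    not use Saxon's API; it assumes the JDK's java.text.Collator as a stand-in for
    a German collation, and the class and method names are invented for this
    example. The words are the German ones for milk, water and oil from the first
    post.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of Saxon.
class GermanSortDemo {

    // Natural String order compares UTF-16 code units, which matches
    // codepoint order for these words: 'Ö' (U+00D6) sorts after 'W' (U+0057).
    static List<String> sortByCodepoint(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);
        return copy;
    }

    // The JDK's German collation rules place 'Ö' next to 'O'.
    static List<String> sortGerman(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(Locale.GERMAN));
        return copy;
    }

    // At PRIMARY strength the collator ignores case differences,
    // comparable in spirit to a "case is not significant" setting.
    static boolean sameIgnoringCase(String a, String b) {
        Collator german = Collator.getInstance(Locale.GERMAN);
        german.setStrength(Collator.PRIMARY);
        return german.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // "\u00D6l" is "Öl", written as an escape to avoid source-encoding issues.
        List<String> liquids = Arrays.asList("Wasser", "\u00D6l", "Milch");
        System.out.println(sortByCodepoint(liquids)); // [Milch, Wasser, Öl]
        System.out.println(sortGerman(liquids));      // [Milch, Öl, Wasser]
        System.out.println(sameIgnoringCase("\u00F6l", "\u00D6l")); // true
    }
}
```

    The first list is the kind of result reported in the question; the second is
    what a German reader would expect.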

  • ailli

    I read your post and followed the links.
    Unfortunately I can't get my collation set as described.

    What I do is the following:
    I have a web application built on an MVC architecture. Based on the parameters
    provided, the controller fetches the data with XQuery in the requested
    order. Then the result is pushed into the view, where another XQuery
    statement renders the data as HTML.

    The only explicit ordering happens at the controller level. I plugged in your
    statement as described, but it didn't have any effect. Then I tried
    implementing my own collation in Java, but still no impact.

    I'm a bit stuck here. Is it possible that the XQuery statements for the view
    implicitly override the collation again?
    Is there a way to force Saxon to use a specific collation for all statements?

    Best Regards!

  • Michael Kay

    I'm sorry, but I can't tell why your code isn't working without seeing your
    code. General information about the design of your code isn't enough. It could
    be a very simple error like misspelling the collation name. Please try to
    drill down until you can find a free-standing piece of code that produces
    different results from those you expect it to produce, preferably in an
    environment where anyone can replicate the problem.

  • ailli


    I found the mistake today:
    What I do in Java is parse the GET parameters and build the order by
    constraints from what I receive.
    As the last element I always append an empty sequence, because of the final comma.

    I added collation 'http://saxon.sf.net/collation?lang=de;' only once, at the end, so it was applied to () only
    and thus had no impact on the result.

    So, as advice to anyone who ever has the same problem:
    Make sure you add the collation to EVERY field that has to consider it, i.e.:

    order by $item/name ascending collation '...', $item/age descending collation '...'

    Setting it only at the end of the order by does NOT mean that it is set
    globally, as I had assumed.
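
    For anyone generating the clause in Java as described above, a minimal sketch
    of how the collation could be attached to every sort key. The class
    OrderByBuilder and the method buildOrderBy are hypothetical names, and the
    sort keys are assumed to be trusted path expressions (for example taken from a
    fixed whitelist rather than copied straight from the GET parameters).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Saxon.
class OrderByBuilder {

    // Saxon collation URI for German, as given earlier in this thread.
    static final String GERMAN = "http://saxon.sf.net/collation?lang=de";

    // Each key is a pair: { path expression, "ascending" or "descending" }.
    // The collation is appended to every key, and String.join avoids the
    // trailing comma that would otherwise need a dummy last element.
    static String buildOrderBy(List<String[]> keys, String collationUri) {
        List<String> parts = new ArrayList<>();
        for (String[] key : keys) {
            parts.add(key[0] + " " + key[1] + " collation \"" + collationUri + "\"");
        }
        return parts.isEmpty() ? "" : "order by " + String.join(", ", parts);
    }

    public static void main(String[] args) {
        List<String[]> keys = Arrays.asList(
                new String[] {"$item/name", "ascending"},
                new String[] {"$item/age", "descending"});
        System.out.println(buildOrderBy(keys, GERMAN));
    }
}
```

    Joining the keys this way means no empty sequence has to be appended at the
    end, so there is no last element that could absorb the collation by accident.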

    Thanks and Regards!