From: Krzysztof D. <krz...@gm...> - 2014-06-06 16:46:22
good afternoon;

On 6 Jun 2014, at 18:03, Krzysztof Drewniak <krz...@gm...> wrote:

> On 06/06/2014 10:54 AM, james anderson wrote:
>> good afternoon,
>> […]
>>
>> while a drop-in would offer advantages for general internationalization,
>> an alternative predicate which required a locale object or a language tag
>> as a designator would likely be sufficient in our case, as the intended
>> locale tends to be specific to the request.
>>
> Could you provide an example of what you're looking for? Are you trying
> to use the ICU collation data, which provides locale-specific overrides,

yes

> or are you trying to make other custom adjustments to the default
> collation table in order to meet some external protocol's requirements?

the specific case is to provide a more reasonable result for string literal comparisons in a sparql processor than the “undefined” result which the specification sets down (http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#modOrderBy):

  SPARQL does not define a total ordering of all possible RDF terms. Here are a few examples of pairs of terms for which the relative order is undefined:

  • "a" and "a"@en_gb (a simple literal and a literal with a language tag)
  • "a"@en_gb and "b"@en_gb (two literals with language tags)
  • "a" and "1"^^xsd:integer (a simple literal and a literal with a supported datatype)
  • "1"^^my:integer and "2"^^my:integer (two unsupported datatypes)
  • "1"^^xsd:integer and "2"^^my:integer (a supported datatype and an unsupported datatype)

note entry two, in which two strings from the same language are specified to be incommensurable. it would much improve the utility of the results if we had an operator such as icu:string<( string1 string2 language-tag) which would apply the respective collating sequence.

we had planned to use a direct icu-based implementation, as we already employ that for data which reside off-heap, but you asked and there is a general case for the facility, so …

best regards, from berlin,
---
james anderson | ja...@dy... | http://dydra.com
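To make the request concrete, here is a minimal TypeScript sketch of a comparison predicate that takes a language tag, in the spirit of the icu:string< operator described above. It is an illustration under stated assumptions, not the proposed sb-unicode or ICU binding: the names `localeStringLess` and `codeUnitLess` are hypothetical, the sketch relies on the standard `Intl.Collator` API (which JavaScript engines typically back with ICU collation data), and the underscore-to-hyphen normalization is an assumption made so that tags written like en_gb are accepted.

```typescript
// Hypothetical stand-in for the proposed icu:string< operator.
// Returns true when `a` sorts strictly before `b` under the collation
// rules of the language named by `languageTag`.
function localeStringLess(a: string, b: string, languageTag: string): boolean {
  // Assumption: tags may arrive as "en_gb"; Intl expects BCP 47 ("en-gb").
  const tag = languageTag.replace(/_/g, "-");
  const collator = new Intl.Collator(tag);
  // compare() is negative when a sorts before b in that locale.
  return collator.compare(a, b) < 0;
}

// Baseline for contrast: the built-in < on strings compares UTF-16
// code units and knows nothing about language-specific ordering.
function codeUnitLess(a: string, b: string): boolean {
  return a < b;
}

// The same pair of strings orders differently depending on the tag.
console.log(localeStringLess("ä", "z", "de")); // true: German sorts ä with a
console.log(localeStringLess("ä", "z", "sv")); // false: Swedish sorts ä after z
console.log(codeUnitLess("ä", "z"));           // false: U+00E4 > U+007A
```

The results noted in the comments assume a runtime that ships full locale data (the default in current Node.js builds); a runtime without Swedish or German collation data would fall back to its default locale. Passing the tag per call matches the observation in the message that the intended locale tends to be specific to the request.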
From: Krzysztof D. <krz...@gm...> - 2014-06-06 17:16:23
On 06/06/2014 11:46 AM, Krzysztof Drewniak wrote:
> good afternoon;
>
> On 6 Jun 2014, at 18:03, Krzysztof Drewniak <krz...@gm...> wrote:
>
>> On 06/06/2014 10:54 AM, james anderson wrote:
>>> good afternoon,
>>> […]
>>>
>>> while a drop-in would offer advantages for general internationalization,
>>> an alternative predicate which required a locale object or a language tag
>>> as a designator would likely be sufficient in our case, as the intended
>>> locale tends to be specific to the request.
>>>
>> Could you provide an example of what you're looking for? Are you trying
>> to use the ICU collation data, which provides locale-specific overrides,
>
> yes
>
>> or are you trying to make other custom adjustments to the default
>> collation table in order to meet some external protocol's requirements?
>
> the specific case is to provide a more reasonable result for string
> literal comparisons in a sparql processor than the “undefined” result
> which the specification sets down
> (http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#modOrderBy):
>
> SPARQL does not define a total ordering of all possible RDF terms. Here
> are a few examples of pairs of terms for which the relative order is
> undefined:
>
> • "a" and "a"@en_gb (a simple literal and a literal with a language tag)
> • "a"@en_gb and "b"@en_gb (two literals with language tags)
> • "a" and "1"^^xsd:integer (a simple literal and a literal with a
>   supported datatype)
> • "1"^^my:integer and "2"^^my:integer (two unsupported datatypes)
> • "1"^^xsd:integer and "2"^^my:integer (a supported datatype and an
>   unsupported datatype)
>
> note entry two, in which two strings from the same language are
> specified to be incommensurable. it would much improve the utility of
> the results if we had an operator such as icu:string<( string1 string2
> language-tag) which would apply the respective collating sequence.
>
> we had planned to use a direct icu-based implementation, as we already
> employ that for data which reside off-heap, but you asked and there is a
> general case for the facility, so …
>

I'll consider implementing the ICU collations. The main drawback is the significant increase in character database size needed to implement them. It does seem like generally useful functionality, though. It might end up in a contrib.

> best regards, from berlin,
> ---
> james anderson | ja...@dy... | http://dydra.com
>

[I think this might not have gone through to the list the first time. Apologies for duplicates. -Krzysztof]
From: Faré <fa...@gm...> - 2014-06-06 19:14:27
I'm not sure why any of this should be in sb-unicode vs cl-unicode or babel. Indeed, this is all general-purpose Unicode support that isn't implementation-specific, unless you want to plug spell-checking or some such into the undefined-variable mechanism.

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
I only ever buy but local: inhabitants of other galactic clusters are not even human and I refuse to trade with them despite mutual benefits