good afternoon,

On 6 Jun 2014, at 17:31, Krzysztof Drewniak <krzysdrewniak@gmail.com> wrote:

On 06/06/2014 10:07 AM, james anderson wrote:
good afternoon,

On 6 Jun 2014, at 16:23, Krzysztof Drewniak <krzysdrewniak@gmail.com> wrote:

I'm working on improving SBCL's Unicode support through the Google
Summer of Code. To make my project more useful, I'd like to know what
Unicode (or possibly internationalization)-related features you'd like
to see already implemented in SBCL, so you wouldn't have to roll your own.

So far, I have implemented (on an experimental branch)
...
- The standard Unicode sorting algorithm ...
What other similar improvements would make things easier for you as an
SBCL user? Please let me know.

support for collating sequences in string comparison functions.

I've implemented the default collation algorithm and the functions
SB-UNICODE:UNICODE< (and <=, >, >=, ...), which use the DUCET to collate
strings. I have not implemented locale-specific collations; please let
me know if those would be useful for you.

the general use case is any application which must tailor results to a request-specific collation sequence.
this is the situation for a service which works with annotated data, such as xml and rdf.

I think that the standard
allows me to drop SB-UNICODE:UNICODE< in as STRING<, but that would
break a lot of things (like (string< "A" "a") returning true), so it
should probably be discussed more.
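
A minimal sketch of the difference in question, assuming an SBCL build that includes the experimental branch. SB-UNICODE:UNICODE< is the function named above; calling it with two positional string arguments is an assumption, and no return values are claimed for it here.

```lisp
;; Code-point comparison. SBCL orders characters by code point, so
;; "A" (U+0041) precedes "a" (U+0061) and the call returns the mismatch
;; index 0, which counts as true.
(string< "A" "a")

;; DUCET-based comparison. In the default collation table, lowercase
;; letters sort before their uppercase counterparts at the tertiary level,
;; so these two calls should disagree with the code-point result above.
(sb-unicode:unicode< "a" "A")
(sb-unicode:unicode< "A" "a")
```

Code that sorts mixed-case keys with STRING< and depends on uppercase coming first is the kind of thing a silent swap would change.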

while a drop-in would offer advantages for general internationalization, an alternative predicate which required a locale object or a language tag as a designator would likely be sufficient in our case, as the intended locale tends to be specific to the request.
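
a minimal sketch of how such a predicate might be used on a call-by-call basis. every name here (*collation-predicates*, collation-predicate, sort-for-request) is hypothetical and exists in neither sbcl nor any library mentioned in this thread; string-lessp only stands in for a real locale-tailored collation.

```lisp
;; Hypothetical registry: maps a language tag such as "de" to a
;; two-argument ordering predicate.
(defvar *collation-predicates* (make-hash-table :test #'equal))

(defun collation-predicate (language-tag)
  "Return the predicate registered for LANGUAGE-TAG, or STRING< as a fallback."
  (gethash language-tag *collation-predicates* #'string<))

(defun sort-for-request (strings language-tag)
  "Return a fresh list of STRINGS ordered for the request's LANGUAGE-TAG."
  ;; SORT is destructive, so work on a copy of the caller's list.
  (sort (copy-list strings) (collation-predicate language-tag)))

;; Placeholder entry: a real one would wrap a locale-tailored collation.
(setf (gethash "de" *collation-predicates*) #'string-lessp)

;; Each request passes its own tag; no global locale state is consulted.
(sort-for-request '("b" "A" "a") "de")
(sort-for-request '("b" "A" "a") "und")
```

the point of the designator argument is that two concurrent requests can ask for different orderings without touching shared state.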


I'm also considering implementing a "Unicode string" type that would
allow Common Lisp functions to work in a Unicode-conforming fashion
transparently. Would that work?

i am not sure how that is not a contradiction, but would be curious to hear.

best regards, from berlin,

i find numerous reflections on what one might do, but nothing concrete,
which would make it possible to use collation sequences effectively,
particularly on a call-by-call basis:

- http://franz.com/support/documentation/8.1/doc/iacl.htm#collation-1 :
franz, by locale only
- https://groups.google.com/forum/#!topic/comp.lang.lisp/l_8QOu52raI :
thoughts re ecl
- http://common-lisp.net/project/babel/ : codecs, but no collation
- https://github.com/edicl/cl-unicode : properties, but no collation
- http://common-lisp.net/project/bese/cl-icu.html : cl-icu abandoned

Thanks for the pointers.

best regards, from berlin,
---
james anderson | james@dydra.com | http://dydra.com

---
james anderson | james@dydra.com | http://dydra.com