good afternoon,

On 6 Jun 2014, at 16:23, Krzysztof Drewniak <krzysdrewniak@gmail.com> wrote:

> I'm working on improving SBCL's Unicode support through the Google
> Summer of Code. To make my project more useful, I'd like to know what
> Unicode (or possibly internationalization)-related features you'd like
> to see already implemented in SBCL, so you wouldn't have to roll your own.
>
> So far, I have implemented (on an experimental branch)
>
> - Accessors for many of a character's Unicode properties, such as its
> script or general category
> - Functions to break a string into graphemes (what users would think of
> as "characters"), words, sentences, and lines according to the Unicode
> standard
> - Unicode standards for case conversion, with optional locale detection
> so that certain locale-specific casing rules (such as i uppercasing as
> dotted-I (İ) in Turkish) can be applied
> - The standard Unicode sorting algorithm
>
> I've also added an option to the reader to normalize unescaped symbols,
> so that, for example :ë and :ë (LATIN SMALL LETTER E WITH DIAERESIS and
> LATIN SMALL LETTER E + COMBINING DIAERESIS, respectively) are EQ with
> normalization enabled.
>
> What other similar improvements would make things easier for you as an
> SBCL user? Please let me know.

support for collating sequences in string comparison functions.

i find numerous reflections on what one might do, but nothing concrete that would make it possible to use collation sequences effectively, particularly on a call-by-call basis:

- http://franz.com/support/documentation/8.1/doc/iacl.htm#collation-1 : franz, by locale only
- https://groups.google.com/forum/#!topic/comp.lang.lisp/l_8QOu52raI : thoughts re ecl
- http://common-lisp.net/project/babel/ : codecs, but no collation
- https://github.com/edicl/cl-unicode : properties, but no collation
- http://common-lisp.net/project/bese/cl-icu.html : cl-icu abandoned
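
to make "call-by-call" concrete, here is a rough sketch of the kind of interface i have in mind. everything in it is hypothetical: collated-string<, collation-rank, the :collation keyword and the two toy alphabets are invented for illustration and exist in neither sbcl nor any of the libraries above. a real implementation would presumably compare unicode collation weights with locale tailoring rather than positions in a string.

```lisp
;; hypothetical sketch: a "collation" is just a string that lists the
;; letters in the desired order. none of these names exist in sbcl.

(defparameter *swedish* "abcdefghijklmnopqrstuvwxyzåäö") ; toy: å ä ö after z
(defparameter *german*  "aäbcdefghijklmnoöpqrstuüvwxyz") ; toy: umlauts next to base letter

(defun collation-rank (char collation)
  "position of CHAR in COLLATION. unlisted characters sort after all
listed ones, ordered by char-code."
  (or (position char collation)
      (+ (length collation) (char-code char))))

(defun collated-string< (a b &key (collation *german*))
  "true if string A sorts strictly before string B under COLLATION."
  (let ((pos (mismatch a b)))
    (cond ((null pos) nil)             ; the strings are equal
          ((= pos (length a)) t)       ; a is a proper prefix of b
          ((= pos (length b)) nil)     ; b is a proper prefix of a
          (t (< (collation-rank (char a pos) collation)
                (collation-rank (char b pos) collation))))))

(defun sort-with-collation (strings collation)
  "return a fresh sorted list; the collation is chosen by the caller,
not taken from a global locale."
  (sort (copy-list strings)
        (lambda (a b) (collated-string< a b :collation collation))))
```

with that, the same data can be ordered differently in two adjacent calls, without touching any global locale state:

```lisp
(sort-with-collation '("zon" "äpple" "apa") *swedish*) ; => ("apa" "zon" "äpple")
(sort-with-collation '("zon" "äpple" "apa") *german*)  ; => ("apa" "äpple" "zon")
```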


best regards, from berlin,
---
james anderson | james@dydra.com | http://dydra.com