From: David Goodman <David.Goodman@li...> - 2004-04-06 13:03:32
Unfortunately, the correct sort order differs from language to language, so it is impossible to write a general algorithm without knowing the language of the author's name. Libraries have handled this in different ways. The usual current method is to ignore the accent mark for sorting and file o-umlaut as if it were plain o; the various library systems have sort algorithms for doing this. (In more primitive computer systems the accented letters were sometimes converted to the plain letter and the diacritic dropped altogether, but Unicode has made that unnecessary.)
(The previous general practice was to file the letter the way it would sound in the native language; this is obviously no help.)
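The accent-ignoring filing rule can be sketched roughly as follows. This is only an illustration in Python, not the algorithm of any particular library system: the function name filing_key and the sample names are made up for the example, and it assumes the accented letters have a canonical Unicode decomposition into base letter plus combining mark.

```python
import unicodedata


def filing_key(name):
    """Hypothetical sort key that files an accented letter as its plain letter."""
    # NFD splits e.g. O-umlaut into "O" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", name)
    # Drop the combining marks so only the base letters take part in sorting.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case so capitalisation does not affect the order.
    return base.casefold()


# Sample names (illustrative only); escapes are used for the accented letters.
names = ["Ozturk", "\u00d6zler", "Oakes", "\u015eoricut", "Smith", "Sorensen"]

# The stored names keep their diacritics; only the sort key ignores them.
filed = sorted(names, key=filing_key)
print(filed)
```

Letters that Unicode treats as distinct base letters (for example o with a stroke) have no such decomposition and would need an explicit mapping, and a language that files o-umlaut elsewhere in the alphabet would need a language-specific collation instead.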
Assoc. Prof. of Library and Information Science
Long Island University
From: dspace-general-bounces@... on behalf of Frank
Sent: Mon 4/5/2004 10:29 PM
Subject: [Dspace-general] Sort order of author names with diacritics
We just started using DSpace to build a repository of research reports and
have been very impressed so far.
But we have a problem: author names involving diacritical marks are sorted
very strangely. For example, the name Ozler, where the O has an umlaut
(entered as Ö), sorts between Azam and Azzoni. It seems that the O-umlaut
sorts as an A. Similarly, Soricut, where the S has a cedilla (Ş),
sorts between Aoki and Appleton: again the diacritic sorts as an A.
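One possible explanation for this symptom, offered only as a hypothesis and not confirmed anywhere in this thread, is that the UTF-8 bytes of the names are being read back as Latin-1 somewhere before sorting. The Python sketch below reproduces the reported order under that assumption. The helper names misdecode and rough_collation_key are invented for the example, and the key is only a crude stand-in for an accent-insensitive collation, not the ordering DSpace or its database actually uses.

```python
import unicodedata


def misdecode(name):
    """Simulate the suspected fault: UTF-8 bytes wrongly decoded as Latin-1."""
    return name.encode("utf-8").decode("latin-1")


def rough_collation_key(text):
    """Crude stand-in for a collation that ignores accents and control characters."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch for ch in decomposed
        # Mn = combining marks, Cc = control characters (including the C1 range)
        if unicodedata.category(ch) not in ("Mn", "Cc")
    ]
    return "".join(kept).casefold()


# The names from the report; escapes are used for O-umlaut and S-cedilla.
names = ["Azam", "Azzoni", "Aoki", "Appleton", "\u00d6zler", "\u015eoricut"]

# What would be stored if the encoding were mixed up.
stored = [misdecode(n) for n in names]

# Sort keys in order, as the browse list would effectively see them.
keys_in_order = [rough_collation_key(s) for s in sorted(stored, key=rough_collation_key)]
print(keys_in_order)
```

Under this assumption the O-umlaut turns into an A with a tilde plus a control character, and the S-cedilla into an A with a ring plus a control character, so both names collate as if they began with A.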
These names were input using the bulk input method, using the standard
If such names are entered in forms using this encoding, the resulting names
sort before any unaccented characters: it looks like they are sorted as
starting with &.
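If the encoding in question is HTML character entities (an assumption; the message does not name it in the surviving text), the behaviour would follow from the entity being stored literally, since "&" sorts before every letter. A minimal Python sketch of the effect, and of decoding the entities before the names are stored, with made-up sample data:

```python
import html

# Names as they might arrive from a form, with the entity left undecoded.
entered = ["Azam", "&Ouml;zler", "Azzoni"]

# Sorted as raw strings, the entity-encoded name comes first because "&"
# precedes all letters in code-point order.
raw_sorted = sorted(entered)
print(raw_sorted)

# Decoding the entities first yields real Unicode characters, which can then
# be sorted with whatever diacritic handling the system provides.
decoded = [html.unescape(n) for n in entered]
print(decoded)
```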
The database was created with Unicode enabled.
We would be grateful for any help.
Dspace-general mailing list