From: Tim L. <guy...@gm...> - 2013-02-01 11:07:21
A challenge for searching for an answer!!

Benny Malengier wrote
> 2009/4/24 Peter Landgren <peter.talken@>
>
> That indicates that the procedure that adds these first letters must be
> made more clever:
>
> 1/ If symbols are different but equal in the sort of the locale, consider
> them as one group. I guess doing sort of va and wb then vb and wa would
> indicate that v and w are one group, so the logic for a small function is
> not difficult.

You expand on this algorithm in http://www.gramps-project.org/bugs/view.php?id=2933#c9317. See http://www.unicode.org/charts/uca/ for the main collation chart with primary differences marked.

I entirely agree, and plan to implement this in NarWeb. However, I have one problem. I can't find out how to determine the letter that has a primary difference from the current letter (sorry, that's not quite the right wording, but I am not sure how to express it).

For example, I do a sort, and the first few names are "Ándre" and "Arnot". The algorithm shows that these should be grouped together. But which letter should be used for the index header? In this case, it should be "a" (or "A" if I upper-case everything), as this is the letter from which "Á" and "A" have only secondary or tertiary differences.

In another language, "Á" might have a primary difference from "Z", and then the sort order would be "Andrew, Arnot, Zulu, Ándre". In this case the index header should be "Á". So I can't just normalise the character to remove accents etc.

I have studied Unicode, CLDR and ICU and Googled extensively, but I can't find out how to determine the preceding primary character! Can anyone help?

--
View this message in context: http://gramps.1791082.n4.nabble.com/Sort-mystery-tp1804087p4658441.html
Sent from the GRAMPS - Dev mailing list archive at Nabble.com.
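
To make the question concrete, here is a minimal sketch of the grouping rule in Python. It is not NarWeb code and not a real collator: `primary_key` is a hypothetical stand-in for a primary-strength collation key (the kind of key a library such as ICU would supply), and `own_letters` is a made-up parameter that models a locale in which a letter has a primary difference from its unaccented form. The point it illustrates is that the index header can be read off the first element of the primary key rather than the first character of the name.

```python
import unicodedata
from itertools import groupby


def primary_key(text, own_letters=frozenset()):
    """Approximate a primary-strength sort key (hypothetical stand-in).

    Accents and case are dropped, unless the letter is listed in
    own_letters, which models a locale that treats it as a separate letter.
    """
    key = []
    for ch in unicodedata.normalize("NFC", text).casefold():
        if ch in own_letters:
            key.append(ch)  # primary difference: keep the letter itself
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            key.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(key)


def index_groups(names, own_letters=frozenset()):
    """Sort names and group them under index headers.

    Names whose first letters have equal primary keys share one header;
    the header is that shared key element, upper-cased.
    """
    def key(name):
        return primary_key(name, own_letters)

    # The name itself breaks ties between names with equal primary keys.
    ordered = sorted(names, key=lambda n: (key(n), n))
    return [(head.upper(), list(members))
            for head, members in groupby(ordered, key=lambda n: key(n)[:1])]


names = ["Zulu", "Arnot", "\u00c1ndre", "Andrew"]  # "\u00c1" is "Á"
default_index = index_groups(names)
tailored_index = index_groups(names, own_letters={"\u00e1"})  # "á"
```

With no tailoring, "Ándre", "Andrew" and "Arnot" all land under the header "A". With "á" listed in `own_letters`, the order becomes Andrew, Arnot, Zulu, Ándre, and "Ándre" gets its own header "Á". A real implementation would have to replace `primary_key` with the locale's actual collation rules; code point order on the stand-in key is only good enough for this example.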