#4960 string tolower \u01c5 is wrong

closed-fixed
5
2011-12-07
2011-11-29
No

\u01c5 is the title case variant: Dž

The lower case variant should be \u01c6 (dž). This works in 8.5.8, but 8.5.11 and 8.6b2 return \u01c5 (i.e. unchanged).

Here is the relevant entry from http://unicode.org/Public/UNIDATA/UnicodeData.txt

01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;01C4;01C6;01C5
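
The last three fields of that entry are the simple uppercase, lowercase and titlecase mappings. A minimal Python sketch of reading them follows; Python is used only for illustration, and `parse_case_fields` is a hypothetical helper, not part of Tcl's tooling:

```python
# Entry for U+01C5, copied from the UnicodeData.txt line quoted in this report.
ENTRY = ("01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;"
         "<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;"
         "01C4;01C6;01C5")


def parse_case_fields(line):
    """Return (code point, upper, lower, title) from one UnicodeData.txt line.

    Hypothetical helper for illustration. Empty mapping fields become None.
    """
    def to_int(text):
        return int(text, 16) if text else None

    fields = line.split(";")
    # UnicodeData.txt lines have 15 semicolon-separated fields; indexes
    # 12, 13 and 14 hold the simple upper/lower/title case mappings.
    return (to_int(fields[0]), to_int(fields[12]),
            to_int(fields[13]), to_int(fields[14]))


code, upper, lower, title = parse_case_fields(ENTRY)
print(f"U+{code:04X}: upper=U+{upper:04X} lower=U+{lower:04X} title=U+{title:04X}")
# prints: U+01C5: upper=U+01C4 lower=U+01C6 title=U+01C5
```

So according to the data file, lowercasing U+01C5 should yield U+01C6, which is the result expected from string tolower.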

Discussion

  • Steve Bennett

    Steve Bennett - 2011-11-29

    Ditto, \u01cb and \u01f2

     
  • Donal K. Fellows

    Probably related to the fix for 3393714.

     
  • Donal K. Fellows

    • milestone: --> 2361739
     
  • Jan Nijtmans

    Jan Nijtmans - 2011-11-29
    • milestone: 2361739 -->
     
  • Jan Nijtmans

    Jan Nijtmans - 2011-11-29

    Confirmed. Will have a look.

     
  • Jan Nijtmans

    Jan Nijtmans - 2011-12-04

    This bug was introduced earlier, on 2010-10-23,
    with the upgrade to Unicode 6.0 (Bug 3085863).
    It has no relation to 3393714.

     
  • Jan Nijtmans

    Jan Nijtmans - 2011-12-05

    Compare the UnicodeData.txt line quoted in the report with the earlier entry from Unicode 2.x:

    01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;01C4;01C6;

    So the bug was introduced by a change in the
    UnicodeData.txt file, not by any change on the Tcl
    side. uniParse.tcl handles the line differently
    when the 'totitle' entry is filled.

    Other characters that changed in the same way include
    \u01cb and \u01f2 (as mentioned by Steve), and
    many more.

    OK, now I have all information needed to fix this....
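
The difference between the two entries can be checked field by field. The Python sketch below only compares the two quoted lines; `diff_fields` is a hypothetical helper, and how uniParse.tcl actually treats the field is not shown in this ticket:

```python
# Older entry for U+01C5 (Unicode 2.x), with an empty final field.
OLD = ("01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;"
       "<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;"
       "01C4;01C6;")
# Current entry for U+01C5, with the final field filled in.
NEW = ("01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;"
       "<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;"
       "01C4;01C6;01C5")


def diff_fields(old_line, new_line):
    """List (index, old value, new value) for every field that differs."""
    pairs = zip(old_line.split(";"), new_line.split(";"))
    return [(i, old, new) for i, (old, new) in enumerate(pairs) if old != new]


print(diff_fields(OLD, NEW))
# prints: [(14, '', '01C5')]
```

Only field 14, the titlecase mapping, differs: it was empty in the older data and now holds the character's own code point.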

     
  • Jan Nijtmans

    Jan Nijtmans - 2011-12-06

    proposed fix

     
  • Jan Nijtmans

    Jan Nijtmans - 2011-12-06

    Here is the fix (see attached patch): a single
    number, 32931, should have been
    32963 (line 754 of tclUniData.c).

    I will check that in soon, together with the
    updated uniParse.tcl, which generates
    this correctly.
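
For reference, the two values can be compared bit by bit. The ticket does not describe how tclUniData.c encodes these numbers, so this Python sketch shows only the raw arithmetic; `differing_bits` is a hypothetical helper:

```python
WRONG = 32931  # value in tclUniData.c before the fix
FIXED = 32963  # value after the fix


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    x = a ^ b
    return [i for i in range(x.bit_length()) if (x >> i) & 1]


print(hex(WRONG), hex(FIXED))        # prints: 0x80a3 0x80c3
print(FIXED - WRONG)                 # prints: 32
print(differing_bits(WRONG, FIXED))  # prints: [5, 6]
```

The two numbers differ only in bits 5 and 6 of the low byte (0xa3 versus 0xc3).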

     
  • Jan Nijtmans

    Jan Nijtmans - 2011-12-07

    Fix committed to all open branches, so it will appear in Tcl 8.5.12 and 8.6b3

     
  • Jan Nijtmans

    Jan Nijtmans - 2011-12-07
    • status: open --> closed-fixed