From: Rolf A. <tcl...@po...> - 2023-02-04 01:34:14
Donald G Porter via Tcl-Core writes:
> On 1/27/23 10:36, apnmbx-public--- via Tcl-Core wrote:
>>
>> I’ve written up my view of “state of Unicode in Tcl 9” at
>> https://www.magicsplat.com/tcl9/tcl9unicode.html
> [...] Revise [glob] [...]

I agree with that. But no matter what or how ...

> For 1) if the alphabet for Tcl strings is larger than unicode scalar
> values, that provides a clear use and meaning for [string is unicode]
> which has puzzled some people. Maybe a change to [string is usv]
> would be clearer to the reader that the test is whether symbols
> outside the set of unicode scalar values are present. These are
> symbols that cannot be properly encoded in the Unicode encodings
> utf-8, utf-16, utf-32.

In its current form on trunk (and during its whole short lifetime,
AFAIK) [string is unicode] returns 0 on surrogate _and_ "noncharacter"
code points. But there is no doubt that "noncharacter" code points
_can_ be properly encoded in utf-8, utf-16 and utf-32.

As I have mentioned a few times, and as Ashok discusses in his paper,
there's no way around altering or removing [string is unicode]; no
justification handed in after the fact helps.

But, agreed, that's the smallest portion of the things to do here.

rolf
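
To make the surrogate vs. noncharacter distinction concrete, here is a minimal sketch. It uses Python's built-in codecs as a stand-in for the Unicode encoding forms; it is not Tcl code and says nothing about how [string is unicode] is implemented. The helper name `encodable` and the sample code points are chosen for illustration only.

```python
# Illustration only: Python's codecs stand in for the Unicode encoding forms.
# This is not Tcl's implementation of [string is unicode].

def encodable(code_point: int) -> bool:
    """Return True if the code point round-trips through UTF-8, UTF-16 and UTF-32."""
    ch = chr(code_point)
    for codec in ("utf-8", "utf-16-le", "utf-32-le"):
        try:
            if ch.encode(codec).decode(codec) != ch:
                return False
        except UnicodeError:
            # Raised for lone surrogates, which are not Unicode scalar values.
            return False
    return True

# A few noncharacter code points (sample, not the full set of 66).
NONCHARACTERS = [0xFDD0, 0xFFFE, 0xFFFF, 0x10FFFF]
# Boundary values of the surrogate range U+D800..U+DFFF.
SURROGATES = [0xD800, 0xDBFF, 0xDC00, 0xDFFF]

for cp in NONCHARACTERS + SURROGATES:
    print(f"U+{cp:04X} encodable: {encodable(cp)}")
```

Under these assumptions the noncharacters encode and decode cleanly in all three encodings, while the surrogates are rejected, which is the distinction a test for "unicode scalar value" would draw.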