I've written up my view of the "state of Unicode in Tcl 9" at
https://www.magicsplat.com/tcl9/tcl9unicode.html
My hope is that this will (a) serve as a tutorial for those not familiar
with the issues around Unicode (one-eyed leading the blind and all that) and
(b) prompt a broader discussion around the issues raised on the mailing list
and in tickets.
A summary TOC is below. I hope this prods more folks in the TCT (and
outside) to weigh in with their opinions one way or the other.
Apologies for the length of the document but it's not easy to summarise.
/Ashok
* 1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#about-this-document> About this document
* 2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#background> Background
* 3 <https://www.magicsplat.com/tcl9/tcl9unicode.html#tcl-strings> Tcl strings
* 3.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#ascii-escape-sequences-for-non-ascii-code-points> ASCII escape sequences for non-ASCII code points
* 3.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#binary-strings> Binary strings
* 3.3 <https://www.magicsplat.com/tcl9/tcl9unicode.html#issues-in-string-definition> Issues in string definition
* 3.3.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#no-definition-of-what-constitutes-a-tcl-string> No definition of what constitutes a Tcl string
* 3.3.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#inconsistent-handling-for-out-of-range-code-points> Inconsistent handling for out of range code points
* 3.3.3 <https://www.magicsplat.com/tcl9/tcl9unicode.html#surrogates-as-literals> Surrogates as literals
* 3.3.4 <https://www.magicsplat.com/tcl9/tcl9unicode.html#variable-length-escape-sequences> Variable length escape sequences
* 4 <https://www.magicsplat.com/tcl9/tcl9unicode.html#string-commands> String commands
* 4.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#string-classification> String classification
* 4.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#issues-in-string-commands> Issues in string commands
* 4.2.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#string-is-unicode> string is unicode
* 4.2.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#nonconformant-interpretation-of-string-values> Nonconformant interpretation of string values
* 5 <https://www.magicsplat.com/tcl9/tcl9unicode.html#encoding-transforms> Encoding transforms
* 5.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#transforming-encoded-byte-sequences-to-tcl-strings> Transforming encoded byte sequences to Tcl strings
* 5.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#transforming-tcl-strings-to-encoded-byte-sequences> Transforming Tcl strings to encoded byte sequences
* 5.3 <https://www.magicsplat.com/tcl9/tcl9unicode.html#issues-in-encoding-transforms> Issues in encoding transforms
* 5.3.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#only-partial-support-for-conforming-error-handling-behavior> Only partial support for conforming error handling behavior
* 5.3.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#error-handling-options-are-incomplete-and-inconsistent> Error handling options are incomplete and inconsistent
* 5.3.3 <https://www.magicsplat.com/tcl9/tcl9unicode.html#default-handling-of-invalid-bytes-is-neither-conformant-nor-consistent> Default handling of invalid bytes is neither conformant nor consistent
* 5.3.4 <https://www.magicsplat.com/tcl9/tcl9unicode.html#no-support-for-lossless-operation> No support for lossless operation
* 5.3.5 <https://www.magicsplat.com/tcl9/tcl9unicode.html#default-encoder-handling-should-be-strict-conformance> Default encoder handling should be strict conformance
* 5.3.6 <https://www.magicsplat.com/tcl9/tcl9unicode.html#failindex-does-not-distinguish-errors-from-incomplete-sequences> -failindex does not distinguish errors from incomplete sequences
* 5.3.7 <https://www.magicsplat.com/tcl9/tcl9unicode.html#inconsistency-in-default-handling-of-surrogates> Inconsistency in default handling of surrogates
* 5.3.8 <https://www.magicsplat.com/tcl9/tcl9unicode.html#inconsistency-between-error-handling-for-different-encodings> Inconsistency between error handling for different encodings
* 5.3.9 <https://www.magicsplat.com/tcl9/tcl9unicode.html#manpages-for-encoding-have-errors> Manpages for encoding have errors
* 6 <https://www.magicsplat.com/tcl9/tcl9unicode.html#input-and-output> Input and Output
* 6.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#input-from-channels> Input from channels
* 6.1.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#blocking-read> Blocking read
* 6.1.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#non-blocking-read> Non-blocking read
* 6.1.3 <https://www.magicsplat.com/tcl9/tcl9unicode.html#blocking-gets> Blocking gets
* 6.1.4 <https://www.magicsplat.com/tcl9/tcl9unicode.html#non-blocking-gets> Non-blocking gets
* 6.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#output-on-channels> Output on channels
* 6.3 <https://www.magicsplat.com/tcl9/tcl9unicode.html#binary-channels> Binary channels
* 6.4 <https://www.magicsplat.com/tcl9/tcl9unicode.html#file-paths-and-system-interfaces> File paths and system interfaces
* 6.5 <https://www.magicsplat.com/tcl9/tcl9unicode.html#issues-in-io-and-system-interfaces> Issues in I/O and system interfaces
* 6.5.1 <https://www.magicsplat.com/tcl9/tcl9unicode.html#behavior-of-read-violates-defined-semantics> Behavior of read violates defined semantics
* 6.5.2 <https://www.magicsplat.com/tcl9/tcl9unicode.html#channel-read-state-after-errors> Channel read state after errors
* 6.5.3 <https://www.magicsplat.com/tcl9/tcl9unicode.html#channel-write-state-after-errors> Channel write state after errors
* 6.5.4 <https://www.magicsplat.com/tcl9/tcl9unicode.html#file-and-system-apis-are-not-lossless> File and system APIs are not lossless
* 6.5.5 <https://www.magicsplat.com/tcl9/tcl9unicode.html#no-error-raised-for-conflicting-options> No error raised for conflicting options