From: Poor Y. <org...@po...> - 2024-06-25 20:05:53
On 2024-06-25 18:12, Brian Griffin wrote:
>> On Jun 25, 2024, at 06:30, Jan Nijtmans <jan...@gm...>
>> wrote:
>>
>> This is a CFV warning for TIP #699:
>> Eliminate encoding alias "binary"; provide introspection for
>> binary channels
>> <https://core.tcl-lang.org/tips/doc/trunk/tip/699.md>
>>
>> It's meant for Tcl 9.0+ only (except for the "chan isbinary"
>> command).
>>
>> If you think this is a bad idea, speak up now. If not,
>> I'll start the vote in a few days.
>
> I think it's a bad idea making "binary" conceptually equivalent to
> "iso-8859-1". At the script level, this implementation detail is
> irrelevant, and potentially misleading. The term "binary" means the
> bytes have NO meaning, they are just 8-bit numeric values. However,
> iso-8859-1 specifies an abstract meaning to each byte value. For
> example, 0x0a is NOT a linefeed in raw binary. It is only a linefeed
> as defined by iso-8859-1. This distinction is important, and should be
> reflected in the configuration of the channel, even if this has no
> material impact on the underlying implementation that moves these
> various bytes around internally.
>

Tcl is not a typed language. A value means whatever a command
successfully interprets it as. The Unicode ASCII and Latin-1 code charts
and iso8859-1 encode the same characters at the same positions. That
effectively makes iso8859-1 a Unicode encoding scheme, and one that is
tailor-made for round-tripping bytes through a Unicode processor
unchanged. Tcl is basically a scriptable Unicode processor, and the
status of Unicode as a superset of iso8859-1 is part of its public
interface. Therefore, iso8859-1 has always been the "contractual"
encoding for reading bytes into Tcl, and that's not an implementation
detail of Tcl, but a feature of Unicode.

The "binary" alias for iso8859-1 has always been problematic. People are
constantly popping up on Tcl chat channels asking why their bytes are
getting munged even though they've set the encoding to "binary". They
are of course told that what they want is "-translation binary", not
"-encoding binary", and their reply is usually something along the lines
of, "well that's stupid, but thanks!" It's high time to get rid of this
wart.

-- 
Yorick
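
For readers who have not hit the "-encoding" versus "-translation" confusion described above, it can be sketched in a few lines of Tcl. This is only an illustration under stated assumptions and not part of TIP #699: the scratch file name is hypothetical, and iso8859-1 is spelled out in place of the "binary" alias so the script does not depend on the alias the TIP proposes to remove. It also assumes the channel's input translation is still at its default of "auto" when only the encoding is set.

```tcl
# Hypothetical scratch file in the current directory.
set path [file join [pwd] tip699-demo.bin]

# Write four raw bytes (A, CR, LF, B) with no translation at all.
set out [open $path w]
chan configure $out -translation binary
puts -nonewline $out "A\r\nB"
close $out

# Setting only the encoding leaves end-of-line translation enabled,
# so the CR LF pair is collapsed into a single newline on input.
set in [open $path r]
chan configure $in -encoding iso8859-1
set viaEncoding [read $in]
close $in

# "-translation binary" also disables end-of-line translation,
# so all four bytes come back unchanged.
set in [open $path r]
chan configure $in -translation binary
set viaTranslation [read $in]
close $in

file delete $path

puts [string length $viaEncoding]      ;# 3
puts [string length $viaTranslation]   ;# 4
```

The byte that goes missing in the first read is the kind of "munging" people report: the encoding never touched the data, the end-of-line translation did.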