Re: [TCLCORE] Unicode in Tcl 9 - a commentary and critique

The idea has some merit but I have a couple of concerns with the approach
below.

At first glance it tackles a different problem than what is being discussed.
It addresses configuration of what is to be considered an invalid byte
sequence. It does not address how a sequence considered invalid is to be
handled (map to U+FFFD, map to lossless, map to numeric equivalent etc.).
Now one could add those as additional dictionary options/keys but that
increases complexity from a user perspective (what does "strict 1 surrogates
0 invalid 1" etc. mean?). And in the vast majority of cases the user /
application does not care where the error stems from (the exception being
the needmoredata case, which is a separate category discussed elsewhere). It
feels like over-generalization to me.

Second, and possibly more important, I foresee considerable implementation
complexity in the encoders to handle this fine-grained, "tunable"
configuration. Particularly so since there is currently no mechanism to pass
this configuration down into the encoder "call chains", so supporting it
would entail API changes. Of course, I might be wrong and a prototype
implementation could immediately refute this "implementability" concern.

/Ashok

From: Peter Da Silva <pet...@fl...> 
Sent: Thursday, February 2, 2023 10:49 PM
To: Poor Yorick <org...@po...>; Tcl Core List
<tcl...@li...>
Subject: Re: [TCLCORE] Unicode in Tcl 9 - a commentary and critique

I really like this idea. It also adds the option of turning flags off (e.g.
{strict 0}).

The value of "-encoding" could be a dictionary:
        chan configure $chan -encoding {name utf-8 strict 1 surrogates 0 ...}
If the number of items in the list is odd, "name" could be implied:
        chan configure $chan -encoding {utf-8 strict 1 surrogates 0 ...}

        chan configure $chan -encoding utf-8
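
The implied-name rule above could be normalized roughly as follows. This is
only a minimal sketch: normalizeEncodingSpec is a hypothetical helper, not
an existing Tcl command, and the keys "strict" and "surrogates" are just the
example flags from this message, not anything the encoders understand today.

```tcl
# Hypothetical helper (not part of Tcl): turn an -encoding value into a
# dict that always carries an explicit "name" key.
proc normalizeEncodingSpec {spec} {
    set n [llength $spec]
    if {$n == 0} {
        error "empty encoding specification"
    }
    if {$n % 2 == 1} {
        # Odd number of items: the first one is the encoding name.
        set spec [linsert $spec 0 name]
    }
    if {![dict exists $spec name]} {
        error "encoding specification has no name: $spec"
    }
    return $spec
}

# All three spellings from the proposal end up with an explicit name.
puts [normalizeEncodingSpec utf-8]
puts [normalizeEncodingSpec {utf-8 strict 1 surrogates 0}]
puts [normalizeEncodingSpec {name utf-8 strict 1 surrogates 0}]
```

A plain name such as utf-8 is a one-item list, so it falls under the odd
case and existing scripts would keep working unchanged.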
