From: Christian G. <aur...@gm...> - 2023-02-01 20:47:09
On 01.02.23 at 12:45, apnmbx-public--- via Tcl-Core wrote:
> A comment on Christian's -erroron mask suggestion.
>
> -erroron would define what constitutes an error. But it does not say what
> should be done in case of that error which I think is the more important
> issue to address.
>
> So for example, if \xC0 is encountered in [encoding convertfrom utf-8],
> should that be mapped to U+00C0, mapped to U+FFFD, raise an exception etc. I
> think that is more important than distinguishing between error cases like
> surrogate in utf-8 vs \xC0 in utf-8.
>
> So while it may have some use, it doesn't really address the current
> discussion.
OK thanks, then how about an expanded variant:
-handle {SURROGATE error INVALID replace INCOMPLETE ignore ...}
Basically, what I would suggest is a way to configure the behaviour
during en/decoding and then, of course, set some sensible default -
e.g. the same behaviour that Python uses, or Tcl8 - but leave it open
for the future programmer to set the error handling to their liking. It
may be application dependent, which is why "strict" and "nocomplain" etc.
exist - I just do not think one should hardcode those, especially
with "weird" names that do not explain what is going on.
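To make the idea concrete, here is a rough sketch in Python (only because Python already allows pluggable error handlers; no such -handle option exists in Tcl). The category names are taken from the example above; the function names, the handler name "handle_sketch", and the classification rules are assumptions made up for illustration, and the classification is deliberately simplified (it does not treat overlong forms specially, for instance).

```python
import codecs

def _classify(data, i):
    """Return (category, span) for the undecodable bytes starting at data[i].

    Simplified, assumed rules: an encoded surrogate is ED A0..BF 80..BF,
    a truncated sequence at end of input is INCOMPLETE, the rest is INVALID.
    """
    b0 = data[i]
    if (b0 == 0xED and len(data) >= i + 3
            and 0xA0 <= data[i + 1] <= 0xBF
            and 0x80 <= data[i + 2] <= 0xBF):
        return "SURROGATE", 3
    # Number of bytes the lead byte announces (0 if it is not a valid lead).
    if 0xC2 <= b0 <= 0xDF:
        need = 2
    elif 0xE0 <= b0 <= 0xEF:
        need = 3
    elif 0xF0 <= b0 <= 0xF4:
        need = 4
    else:
        need = 0
    tail = data[i + 1:i + need]
    if need and len(tail) < need - 1 and all(0x80 <= b <= 0xBF for b in tail):
        return "INCOMPLETE", 1 + len(tail)
    return "INVALID", 1

def decode_utf8(data, handle):
    """Decode UTF-8 bytes with a per-category policy such as
    {"SURROGATE": "error", "INVALID": "replace", "INCOMPLETE": "ignore"}.
    Categories missing from the mapping default to "error".
    """
    def handler(exc):
        if not isinstance(exc, UnicodeDecodeError):
            raise exc
        category, span = _classify(exc.object, exc.start)
        action = handle.get(category, "error")
        if action == "replace":
            return "\ufffd", exc.start + span
        if action == "ignore":
            return "", exc.start + span
        raise exc

    # Re-registering the same name replaces the previous handler; this is
    # fine for a sketch but not safe across threads.
    codecs.register_error("handle_sketch", handler)
    return data.decode("utf-8", errors="handle_sketch")
```

The point is that the policy is plain data supplied by the caller, so "strict" or "nocomplain" would just be two predefined mappings rather than hardcoded modes.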
Christian