#641 Ability to perform stricter encoding conversion checks


Currently, Tcl never throws an error when:

1) Encountering an invalid byte sequence in the
source encoding when performing an "encoding
convertfrom" or while reading from a channel.

2) Encountering a Unicode character that cannot
be represented in the target encoding when
performing an "encoding convertto" or while
writing to a channel.

For example, these commands succeed
despite the data being incorrect:

$ tclsh
% encoding convertto iso8859-1 "\u4E24"
% encoding convertfrom utf-8 "\xC3\x28"

This behaviour is often convenient if one wants
one's app to carry on "working" whatever happens,
but it's less desirable if one wants security or

I feel Tcl would benefit from allowing a user to
request stricter encoding conversions.

Perhaps one way to do this would be to add a -strict
option to the encoding convert* commands:

encoding convertfrom ?-strict? ?encoding? data
encoding convertto ?-strict? ?encoding? string

And add a "-strictEncoding boolean" option to
fconfigure and chan configure.

I guess there might be other Tcl commands that
perform encoding conversions that might need to
be considered ("source" is one that comes to mind).

I'm not familiar enough with the C-level API to
propose changes there.

Alternatively, if this were targeted for Tcl 9 then
strictness could be made the default ...
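
In the meantime, a script can approximate a strict
"encoding convertto" by decoding the result again and
comparing it with the input. This is only a rough sketch:
strictConvertTo is a made-up helper name, not an existing
Tcl command, and the round-trip check assumes the encoding
maps each character to one byte sequence and back again,
which may not hold for every encoding.

```tcl
# Hypothetical helper (not part of Tcl): emulate a strict
# "encoding convertto" using a round-trip comparison.
proc strictConvertTo {encoding string} {
    # If the interpreter itself raises an error here, it simply propagates.
    set bytes [encoding convertto $encoding $string]

    # A lossy conversion substitutes a fallback character, so decoding
    # the bytes again will not reproduce the original string.
    if {[encoding convertfrom $encoding $bytes] ne $string} {
        return -code error \
            "string cannot be represented in encoding \"$encoding\""
    }
    return $bytes
}

# Usage: the first call is rejected, the second succeeds.
if {[catch {strictConvertTo iso8859-1 "\u4E24"} msg]} {
    puts "conversion rejected: $msg"
}
set bytes [strictConvertTo iso8859-1 "caf\u00E9"]
puts "converted [string length $bytes] bytes"
```

Doing this in script costs a second conversion for every
string and does nothing for channels, which is part of why
I would rather see the check built into the core.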

