From: Alexandre F. <ale...@gm...> - 2008-12-13 18:59:19
|
Hi,

While looking at
http://sourceforge.net/tracker/index.php?func=detail&aid=2380293&group_id=10894&atid=110894
I discovered that [binary decode], for all three supported encodings (hex, base64 and uuencode), had a "-strict" option, meaning the default was non-strict: characters outside the expected range are just ignored.

Question: what is the rationale for such "robustness"? Is there a known perturbation process that is modelled this way, which would insert exclusively out-of-range chars?

Otherwise, it seems _very_ dangerous to ignore such errors (an insertion/deletion in any of the three schemes, especially the two 6-bit-based ones, has catastrophic long-range effects). Moreover, in situations where such a "controlled perturbation" is expected, it is trivial for the programmer to apply a [regsub] first, to filter out the offending characters.

What about removing the non-strict pathway entirely?

-Alex
|
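The long-range effect described above can be sketched in a few lines. This is a Python analogy rather than Tcl's [binary decode] itself: `lenient_decode` is a hypothetical stand-in for a decoder that silently drops out-of-range characters, and the truncation to whole 4-character groups is an assumption made only so that the shortened input still decodes.

```python
import base64
import binascii
import re

# 30 bytes, so the base64 text is exactly 40 characters with no padding.
payload = bytes(range(65, 95))
encoded = base64.b64encode(payload).decode("ascii")

# Simulated corruption: one in-alphabet character is overwritten by '#'.
corrupted = encoded[:8] + "#" + encoded[9:]

def lenient_decode(text):
    """Hypothetical lenient decoder: drop out-of-range characters, then decode."""
    kept = re.sub(r"[^A-Za-z0-9+/=]", "", text)
    kept = kept[: len(kept) - len(kept) % 4]  # assumption: keep whole groups only
    return base64.b64decode(kept)

def strict_decode(text):
    """Reject any character outside the base64 alphabet."""
    return base64.b64decode(text, validate=True)

lenient = lenient_decode(corrupted)
# Bytes before the damaged group survive; every later 6-bit group is shifted.
intact_prefix = lenient[:6] == payload[:6]
shifted_tail = lenient[6:] != payload[6:len(lenient)]
```

Dropping the damaged character turns a one-character substitution into a deletion, so every later 6-bit group is misaligned. The strict path raises `binascii.Error` on the same input instead of returning garbage.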
From: Alexandre F. <ale...@gm...> - 2008-12-15 15:28:37
|
In the absence of any response to this, I am preparing to commit the removal of non-strict decoding. Unless somebody objects and argues ;-)

-Alex
|
From: Alexandre F. <ale...@gm...> - 2008-12-15 17:26:18
|
After a chat with Kevin I committed a middle ground which only allows for whitespace in non-strict mode. And fixes the bug by the way.

-Alex
|
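A minimal sketch of that middle ground, again as a Python analogy and not the committed C code: whitespace is removed before decoding, and anything else outside the alphabet is still an error. The function name and the exact set of characters treated as whitespace are assumptions.

```python
import base64
import binascii

def decode_allowing_whitespace(text: str) -> bytes:
    """Ignore whitespace only; any other out-of-range character is an error."""
    # str.split() with no argument splits on any run of whitespace.
    stripped = "".join(text.split())
    return base64.b64decode(stripped, validate=True)

# Line-wrapped or space-separated input decodes normally.
wrapped = "aGVs\nbG8g d29y\r\nbGQ="
decoded = decode_allowing_whitespace(wrapped)
```

This keeps the common case working (base64 text wrapped across lines) without hiding genuine corruption: an input such as `"aGVs#bG8="` raises `binascii.Error`.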
From: Pat T. <pat...@us...> - 2008-12-15 21:19:35
|
Firstly - I do this stuff in my spare time. That would be time that I'm not involved with family or work - so if it takes a couple of days to get a response, that doesn't mean I never answer your mails.

I did the uuencoding purely to facilitate the tcllib implementation, which handles the outer wrapping "begin ... end" and the line length stuff separately from the encoding of the data. So it's been done as a straight C replacement for the inner part of tcllib's implementation. If everyone wants to complain about it we can just throw it away entirely - the only important encodings are base64 and hex, and maybe ascii85, which isn't done but should be soon as it's becoming used in more protocols now.

The fix for just ignoring whitespace looks ok to me. I don't remember if there was a particular reason for ignoring any invalid characters, and in the absence of specific tests or comments about the issue we may assume it is an accident. Ignoring whitespace is definitely required.

Pat Thoyts
|
From: Andreas K. <and...@ac...> - 2008-12-15 22:46:16
|
> ... ascii85 which
> isn't done but should be soon as it's becoming used in more protocols now.

Interesting. Can you give us references to protocols doing so?

In that context I should also note that the encoding used by tclcompiler/tbcload is a (slightly) modified ascii85. IIRC a \0 in the input is replaced by a 'z' instead of the regular coding, and some output characters are remapped to avoid Tcl's special characters ( ", {, }, [, ], $, \ ): (string map {" v $ w [ x ] | \ y} (I have not made sure of the proper Tcl quoting here)).

--
Andreas Kupries <and...@Ac...>
Developer @ http://www.ActiveState.com
Tel: +1 778-786-1122
|
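The remapping idea can be sketched as follows. This is not tbcload's actual format: the table is copied from the [string map] list in the message, which its author flagged as unverified, and the function names are hypothetical. Python's `base64.a85encode` emits 'z' for a group of four zero bytes, which may differ from what tbcload does.

```python
import base64

# Tcl-special characters that can appear in plain ascii85 output, and their
# replacements, all of which lie outside the '!'..'u' ascii85 alphabet.
TCL_SPECIAL = b'"$[]\\'
REPLACEMENT = b"vwx|y"
TO_SAFE = bytes.maketrans(TCL_SPECIAL, REPLACEMENT)
FROM_SAFE = bytes.maketrans(REPLACEMENT, TCL_SPECIAL)

def encode_tcl_safe(data: bytes) -> bytes:
    """ascii85-encode, then swap out characters that Tcl treats specially."""
    return base64.a85encode(data).translate(TO_SAFE)

def decode_tcl_safe(text: bytes) -> bytes:
    """Undo the character swap, then ascii85-decode."""
    return base64.a85decode(text.translate(FROM_SAFE))

sample = bytes(range(256)) + b"\x00" * 4
encoded = encode_tcl_safe(sample)
```

The swap is reversible because plain ascii85 output uses only '!' through 'u' plus 'z', so the replacement characters never occur in unmapped output. The braces from the message's list need no entry for the same reason: '{' and '}' lie outside that range.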