#101 make-encoding :input-action on bad input

Milestone: lisp error
Status: closed-fixed
Labels: clisp (524)
Priority: 5
Updated: 2003-03-11
Created: 2002-04-12
Private: No

(ext:convert-string-from-bytes
 '#(255 254 65 0 13 0)
 (ext:make-encoding :charset 'charset:utf-16))
"A^M" as expected

(ext:convert-string-from-bytes
 '#(255 254 65 0 13) ; missing last 0
 (ext:make-encoding :charset 'charset:utf-16
                    :input-error-action :error))
""
(ext:convert-string-from-bytes
 '#(255 254 65 0 13) ; missing last 0
 (ext:make-encoding :charset 'charset:utf-16
                    :input-error-action #\Z))
""

Granted, the input is truncated and therefore invalid. But I
don't expect CLISP to silently return an empty string
on such input. It should signal an error or otherwise
honor :input-error-action.

Tested with CLISP-2.28 (but not with later 2002-03-12
patch to encoding :output-action), MS-VC6.

Regards,
Jörg Höhle

Discussion

  • Sam Steingold

    Sam Steingold - 2002-04-16
    • labels: --> clisp
     
  • Sam Steingold

    Sam Steingold - 2002-04-16

    this patch should fix the bug

    Index: src/stream.d

    RCS file: /cvsroot/clisp/clisp/src/stream.d,v
    retrieving revision 1.266
    diff -u -w -b -r1.266 stream.d
    --- src/stream.d 13 Apr 2002 18:50:48 -0000 1.266
    +++ src/stream.d 16 Apr 2002 18:46:54 -0000
    @@ -3624,9 +3624,7 @@
           var size_t outsize = tmpbufsize*sizeof(chart);
           var size_t res = iconv(cd,&inptr,&insize,&outptr,&outsize);
           if (res == (size_t)(-1) && errno != E2BIG) {
    -        if (errno == EINVAL) # incomplete input?
    -          break;
    -        else if (errno == EILSEQ) {
    +        if (errno == EILSEQ || errno == EINVAL) {
               ASSERT(insize > 0);
               var object action = TheEncoding(encoding)->enc_towcs_error;
               if (eq(action,S(Kignore))) {
    @@ -3675,9 +3673,7 @@
           while (insize > 0 && outsize > 0) {
             var size_t res = iconv(cd,&inptr,&insize,&outptr,&outsize);
             if (res == (size_t)(-1)) {
    -          if (errno == EINVAL) { # incomplete input?
    -            inptr += insize; break;
    -          } else if (errno == EILSEQ) {
    +          if (errno == EILSEQ || errno == EINVAL) {
                 ASSERT(insize > 0);
                 var object action = TheEncoding(encoding)->enc_towcs_error;
                 if (eq(action,S(Kignore))) {
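
The effect of the patch can be illustrated outside CLISP with a minimal C sketch. It is not the stream.d code: the function decode_utf16le, the enum input_error_action and its values are hypothetical names chosen for illustration, and the sketch assumes a POSIX iconv that accepts the names "UTF-16LE" and "UTF-8". Like the patched code, it treats EINVAL (input ends in the middle of a character) the same as EILSEQ, so truncated input reaches the error action instead of being dropped silently.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical policy values, loosely modelled on :input-error-action. */
enum input_error_action { ACTION_ERROR, ACTION_IGNORE, ACTION_REPLACE };

/* Decode UTF-16LE bytes into UTF-8 (illustrative helper, not CLISP code).
   Returns the number of bytes written to `out`, or
     -1 if the input is invalid or truncated and action is ACTION_ERROR,
     -2 if the conversion descriptor cannot be opened,
     -3 if `out` is too small. */
long decode_utf16le(const unsigned char *in, size_t inlen,
                    char *out, size_t outcap,
                    enum input_error_action action, char replacement)
{
    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");
    if (cd == (iconv_t)-1)
        return -2;

    char *inptr = (char *)in;
    size_t insize = inlen;
    char *outptr = out;
    size_t outsize = outcap;
    long status = 0;

    while (insize > 0) {
        size_t res = iconv(cd, &inptr, &insize, &outptr, &outsize);
        if (res != (size_t)-1)
            continue;                     /* all input consumed, loop ends */
        if (errno == EILSEQ || errno == EINVAL) {
            /* Invalid sequence, or input truncated in mid-character. */
            assert(insize > 0);
            if (action == ACTION_ERROR) { status = -1; break; }
            if (action == ACTION_REPLACE) {
                if (outsize == 0) { status = -3; break; }
                *outptr++ = replacement;
                outsize--;
            }
            /* Skip the offending 16-bit unit, or the lone trailing byte. */
            size_t skip = insize < 2 ? insize : 2;
            inptr += skip;
            insize -= skip;
        } else {                          /* E2BIG: output buffer full */
            status = -3;
            break;
        }
    }
    iconv_close(cd);
    return status < 0 ? status : (long)(outptr - out);
}
```

With the truncated bytes 65 0 13 (the example from the report without the byte-order mark), this sketch reports an error under ACTION_ERROR and emits a substitute character under ACTION_REPLACE, rather than returning nothing.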

     
  • Sam Steingold

    Sam Steingold - 2002-04-16
    • milestone: --> lisp error
    • assigned_to: nobody --> sds
     
  • Sam Steingold

    Sam Steingold - 2002-04-21

    Thank you for your bug report.
    The bug has been fixed in the CVS tree.
    You can either wait for the next release (recommended)
    or check out the current CVS tree (see http://clisp.cons.org)
    and build CLISP from the sources (be advised that between
    releases the CVS tree is very unstable and may not even build
    on your platform).

     
  • Sam Steingold

    Sam Steingold - 2002-04-21
    • status: open --> closed-fixed
     
  • Jörg Höhle

    Jörg Höhle - 2002-04-30
    • status: closed-fixed --> open-fixed
     
  • Jörg Höhle

    Jörg Höhle - 2002-04-30

    Sorry to reopen this. It appears not everything is
    consistent:
    3. Break [7]> (ext:convert-string-from-bytes #(0 65 0) charset:utf-16)
    [stream.d:3645]
    *** - Win32 error 6 (ERROR_INVALID_HANDLE): The handle is invalid.
    -- misleading error, see also #550528
    4. Break [8]> (ext:convert-string-from-bytes #(0 65 0) charset:ucs-2)
    "A"
    -- no error, even though the input is incorrect - not what I
    wish for!
    So this is two bug reports in one. First, the error reported
    for UTF-16 is misleading. Second,
    (ext:convert-string-from-bytes #(0 65 0) (ext:make-encoding
     :charset 'charset:ucs-2 :input-error-action :error))
    returns "A", which is not correct, while the same with
    :UTF-16 correctly signals an error.

    Before the patch, (0 65 0) :UTF-16 -> "" empty string.

    I believe part of the culprit is
    errno = saved_errno; OS_error()
    which only works on UNIX.
    SetLastError(status); OS_error();
    is needed for MS-Windows.

    Wasn't there a wrapper for set_errno some time in history?
    There's #define saving_sock_errno(statement) in socket.d

    This pattern would well fit the >10 places in stream.d which
    have int saved_errno = errno; ... (at least the ~6 places
    not specific to UNIX pipes, esp. iconv).

    Regards,
    Jörg Höhle
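
The wrapper idea raised in this comment can be sketched in portable C. This is only an illustration under stated assumptions: SAVING_ERRNO, noisy_cleanup and errno_after_cleanup are hypothetical names, not identifiers from stream.d or socket.d, and the sketch covers only the errno side. On MS-Windows the analogous wrapper would presumably save the code with GetLastError() and restore it with SetLastError() before the error is reported.

```c
#include <errno.h>

/* Hypothetical helper in the spirit of saving_sock_errno:
   run `statement` (typically cleanup code) without letting it
   clobber the errno value of the failure being reported. */
#define SAVING_ERRNO(statement)        \
    do {                               \
        int saved_errno_ = errno;      \
        statement;                     \
        errno = saved_errno_;          \
    } while (0)

/* Hypothetical cleanup routine that overwrites errno as a side effect. */
static void noisy_cleanup(void)
{
    errno = 0;
}

/* Returns the errno value a caller sees after a failed conversion
   followed by cleanup wrapped in SAVING_ERRNO. */
int errno_after_cleanup(void)
{
    errno = EINVAL;                  /* pretend iconv() just failed */
    SAVING_ERRNO(noisy_cleanup());   /* cleanup must not hide the failure */
    return errno;                    /* still EINVAL */
}
```

Keeping the save and restore in one macro means each call site states only the cleanup statement, which is what makes it attractive for the many places that currently repeat the saved_errno pattern by hand.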

     
  • Sam Steingold

    Sam Steingold - 2002-04-30

    What's the difference between utf-16 and ucs-2?
    Are you _really_ sure that the behavior you observe is broken?
    Could you please verify that it is?
    E.g., look at the GNU libiconv testsuite.
    Can you produce an example where CLISP and libiconv differ?
    Thanks!

     
  • Sam Steingold

    Sam Steingold - 2003-03-11
    • status: open-fixed --> closed-fixed
     
  • Sam Steingold

    Sam Steingold - 2003-03-11

    I checked in a fix (and added a test to the testsuite).
    please try again.

     
