#107 misleading error converting UTF-7

lisp error
closed-fixed
Sam Steingold
libiconv (7)
5
2002-04-30
2002-04-30
Jörg Höhle
No

I was trying out
(ext:convert-string-from-bytes #(0) charset:utf-7)
I guess it's an illegal sequence. However, the error
being reported is misleading.

CLISP-2.28 (including the EILSEQ||EINVAL bug-fix [ 543072 ]) reports:
[stream.d:3645]
*** - Win32 error 6 (ERROR_INVALID_HANDLE): The handle is invalid.

CLISP-2.27 reports:
[%s:3673]
*** - Win32 error 18 (ERROR_NO_MORE_FILES): There are no more files.

Line 3645 is the one terminating the with_string_0 of
iconv_mblen().

Maybe there's an auxiliary problem: errwin32.c
doesn't seem to mention EINVAL.
MS-VC/include/errno.h has it as error 42.
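
As a hedged illustration of what a more specific report could look like: the sketch below assumes the failure surfaces as a C errno value set by iconv(), and maps the values POSIX documents for iconv() to descriptive text. The helper name describe_iconv_errno is hypothetical and is not part of CLISP or errwin32.c.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not part of CLISP): turn the errno value left by a
   failed iconv() call into a specific message, rather than looking the
   failure up in the Win32 error table. */
static const char *describe_iconv_errno(int err)
{
    switch (err) {
    case EILSEQ: return "invalid byte sequence in input";
    case EINVAL: return "incomplete byte sequence at end of input";
    case E2BIG:  return "output buffer too small";
    default:     return strerror(err); /* fall back to the C library text */
    }
}
```

The point of the sketch is only that errno codes and Win32 error codes are separate numbering schemes, so each needs its own lookup.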

(ext:convert-string-from-bytes #(0 0) charset:utf-7)
or #(0 0 0) doesn't work either.

I was looking for a way to portably zero-terminate a
string (the number of 0 bytes depends on the encoding).

Regards,
Jörg Höhle.

Discussion

  • Sam Steingold
    2002-04-30

    • assigned_to: haible --> sds
    • status: open --> closed-fixed
     
  • Sam Steingold
    2002-04-30

    Thank you for your bug report.
    The bug has been fixed in the CVS tree.
    You can either wait for the next release (recommended)
    or check out the current CVS tree (see http://clisp.cons.org)
    and build CLISP from the sources (be advised that between
    releases the CVS tree is very unstable and may not even build
    on your platform).

     
  • Bruno Haible
    2002-05-06

    > CLISP-2.28 (including the EILSEQ||EINVAL bug-fix [ 543072 ]) reports:
    > [stream.d:3645]
    > *** - Win32 error 6 (ERROR_INVALID_HANDLE): The handle is invalid.

    Yes, it would make sense to signal an error with a more specific
    message.

    > I was looking for a way to portably zero-terminate a
    > string (the number of 0 bytes depends on the encoding).

    There's no API for this. You have to special-case on the encoding
    name. Among the well-known encodings,

    UCS-2, UCS-2BE, UCS-2LE
    UTF-16, UTF-16BE, UTF-16LE

    need 2 0x00 bytes,

    UCS-4, UCS-4BE, UCS-4LE
    UTF-32, UTF-32BE, UTF-32LE

    need 4 0x00 bytes,

    and all other encodings need only one. Btw, adding a NUL character is
    a technique that is usually only applied to ASCII-compatible strings, so
    that strlen etc. can be used, but it is not often done when dealing
    with UTF-16 or UTF-32 strings, because for these you have to write the
    library functions yourself anyway.
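
The special-casing described above can be sketched as a small lookup. This is a minimal sketch, not an existing API: the function name nul_terminator_size is hypothetical, and it assumes encoding names are spelled exactly as in the lists above (no case folding or alias handling).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of 0x00 bytes needed to terminate a string
   in the named encoding. Names are compared exactly (case-sensitive);
   any name not listed is assumed to need a single 0x00 byte. */
static size_t nul_terminator_size(const char *encoding)
{
    static const char *const two_byte[] = {
        "UCS-2", "UCS-2BE", "UCS-2LE", "UTF-16", "UTF-16BE", "UTF-16LE"
    };
    static const char *const four_byte[] = {
        "UCS-4", "UCS-4BE", "UCS-4LE", "UTF-32", "UTF-32BE", "UTF-32LE"
    };
    size_t i;

    for (i = 0; i < sizeof two_byte / sizeof two_byte[0]; i++)
        if (strcmp(encoding, two_byte[i]) == 0)
            return 2;
    for (i = 0; i < sizeof four_byte / sizeof four_byte[0]; i++)
        if (strcmp(encoding, four_byte[i]) == 0)
            return 4;
    return 1;
}
```

For example, nul_terminator_size("UTF-16LE") yields 2, while nul_terminator_size("UTF-7") falls through to the default of 1.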