(ext:convert-string-from-bytes
'#(255 254 65 0 13 0)
(ext:make-encoding :charset 'charset:utf-16))
"A^M" as expected
(ext:convert-string-from-bytes
'#(255 254 65 0 13) ; missing last 0
(ext:make-encoding :charset 'charset:utf-16
:input-error-action :error))
""
(ext:convert-string-from-bytes
'#(255 254 65 0 13) ; missing last 0
(ext:make-encoding :charset 'charset:utf-16
:input-error-action #\Z))
""
OK, the input is truncated and thus erroneous. But I
don't expect CLISP to silently return an empty string
on such input. An error should be signaled, or the
:input-error-action should otherwise be honored.
Tested with CLISP-2.28 built with MS-VC6 (but without the
later 2002-03-12 patch to encoding :output-action).
Regards,
Jörg Höhle
Logged In: YES
user_id=5735
this patch should fix the bug
Index: src/stream.d
RCS file: /cvsroot/clisp/clisp/src/stream.d,v
retrieving revision 1.266
diff -u -w -b -r1.266 stream.d
--- src/stream.d 13 Apr 2002 18:50:48 -0000 1.266
+++ src/stream.d 16 Apr 2002 18:46:54 -0000
@@ -3624,9 +3624,7 @@
var size_t outsize = tmpbufsize*sizeof(chart);
var size_t res =
iconv(cd,&inptr,&insize,&outptr,&outsize);
if (res == (size_t)(-1) && errno != E2BIG) {
- if (errno == EINVAL) # incomplete input?
- break;
- else if (errno == EILSEQ) {
+ if (errno == EILSEQ || errno == EINVAL) {
ASSERT(insize > 0);
var object action =
TheEncoding(encoding)->enc_towcs_error;
if (eq(action,S(Kignore))) {
@@ -3675,9 +3673,7 @@
while (insize > 0 && outsize > 0) {
var size_t res =
iconv(cd,&inptr,&insize,&outptr,&outsize);
if (res == (size_t)(-1)) {
- if (errno == EINVAL) { # incomplete input?
- inptr += insize; break;
- } else if (errno == EILSEQ) {
+ if (errno == EILSEQ || errno == EINVAL) {
ASSERT(insize > 0);
var object action =
TheEncoding(encoding)->enc_towcs_error;
if (eq(action,S(Kignore))) {
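
For readers unfamiliar with the errno values in the patch, here is a minimal sketch, not taken from stream.d, of how iconv reports truncated input. The helper name utf16_decode_errno is hypothetical, and the sketch assumes an iconv implementation (such as glibc's) that accepts the names "UTF-16" and "UTF-8" and honors a leading byte-order mark.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of CLISP): decode UTF-16 bytes with a
   leading BOM into a NUL-terminated UTF-8 string in `out`.
   Returns 0 on success, otherwise the errno value set by iconv.
   `outcap` must be at least 1. */
int utf16_decode_errno(const unsigned char *bytes, size_t len,
                       char *out, size_t outcap)
{
    iconv_t cd = iconv_open("UTF-8", "UTF-16");
    if (cd == (iconv_t)(-1))
        return errno;

    char *inptr = (char *)bytes;
    size_t insize = len;
    char *outptr = out;
    size_t outsize = outcap - 1;   /* reserve room for the terminator */
    int err = 0;

    /* An incomplete multibyte sequence at the end of the input is
       reported as EINVAL, an invalid sequence as EILSEQ. */
    if (iconv(cd, &inptr, &insize, &outptr, &outsize) == (size_t)(-1))
        err = errno;

    *outptr = '\0';
    iconv_close(cd);
    return err;
}
```

Called on the six bytes 255 254 65 0 13 0 this should return 0, and on the five-byte truncated input it should return EINVAL. That is the case the old code handled with a bare break and the patch now routes through the encoding's error action.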
Logged In: YES
user_id=5735
thank you for your bug report.
the bug has been fixed in the CVS tree.
you can either wait for the next release (recommended)
or check out the current CVS tree (see http://clisp.cons.org)
and build CLISP from the sources (be advised that between
releases the CVS tree is very unstable and may not even build
on your platform).
Logged In: YES
user_id=377168
Sorry to reopen this. It appears not everything is
consistent:
3. Break [7]> (ext:convert-string-from-bytes #(0 65 0)
charset:utf-16)
[stream.d:3645]
*** - Win32 error 6 (ERROR_INVALID_HANDLE): The handle is
invalid.
-- misleading error, see also #550528
4. Break [8]> (ext:convert-string-from-bytes #(0 65 0)
charset:ucs-2)
"A"
-- no error, even though it's incorrect input - not what I
wish for!
So this is two bug reports in one:
(ext:convert-string-from-bytes #(0 65 0) (ext:make-encoding
:charset 'charset:ucs-2 :input-error-action :error))
"A" is not correct, while the same with :UTF-16 correctly
signals an error.
Before the patch, (0 65 0) :UTF-16 -> "" empty string.
I believe part of the culprit is
errno = saved_errno; OS_error()
which only works on UNIX.
SetLastError(status); OS_error();
is needed for MS-Windows.
Wasn't there a wrapper for set_errno some time in history?
There's #define saving_sock_errno(statement) in socket.d
This pattern would well fit the >10 places in stream.d which
have int saved_errno = errno; ... (at least the ~6 places
not specific to UNIX pipes, esp. iconv).
Regards,
Jörg Höhle
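
The save-and-restore pattern suggested above could be sketched roughly as follows. This is an illustration only: SAVING_ERRNO, noisy_cleanup and errno_after_cleanup are hypothetical names, not the actual macro in socket.d, and only the errno half is shown. A Win32 counterpart would presumably pair GetLastError() with SetLastError() in the same shape.

```c
#include <errno.h>

/* Hypothetical macro, modeled on the idea behind saving_sock_errno:
   run a statement without letting it clobber the pending error code. */
#define SAVING_ERRNO(statement)          \
    do {                                 \
        int saved_errno_ = errno;        \
        statement;                       \
        errno = saved_errno_;            \
    } while (0)

/* Stand-in for cleanup code (closing a descriptor, freeing a buffer)
   that may overwrite errno as a side effect. */
static void noisy_cleanup(void)
{
    errno = 0;
}

/* Demonstration: the original error code survives the cleanup. */
int errno_after_cleanup(int original)
{
    errno = original;
    SAVING_ERRNO(noisy_cleanup());
    return errno;   /* still `original` */
}
```

Wrapping the cleanup in one macro keeps the platform-specific restore step in a single place instead of repeating it at every call site in stream.d.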
Logged In: YES
user_id=5735
what's the difference between utf-16 and ucs-2?
are you _really_ sure that the behavior you observe is broken?
could you please verify that it is?
E.g., look at the GNU libiconv testsuite.
Can you produce an example when CLISP and libiconv differ?
Thanks!
Logged In: YES
user_id=5735
I checked in a fix (and added a test to the testsuite).
please try again.