From: SourceForge.net <no...@so...> - 2006-11-17 17:24:20
Bugs item #1592343, was opened at 2006-11-07 19:24
Message generated for change (Comment added) made by sds
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=101355&aid=1592343&group_id=1355

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: clisp
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Pascal J. Bourguignon (informatimago)
Assigned to: Jörg Höhle (hoehle)
Summary: thru TCP without pseudo-tty => UNIX error 14 bad address

Initial Comment:
When clisp is invoked through bare ssh, it fails. We need to pass the
-t option to ssh to force pseudo-tty allocation for clisp. I see no
good reason why clisp couldn't detect that it isn't connected to a pty
and avoid using an inapplicable API; after all, it gets this right with
files and pipes:

# (not localhost)
$ ssh janus-1 clisp --version
stty: standard input: Invalid argument
GNU CLISP 2.39 (2006-07-16) (built 3365477833) (memory 3365480464)
Software: GNU C 3.3 20030226 (prerelease) (SuSE Linux)
gcc -g -O2 -W -Wswitch -Wcomment -Wpointer-arith -Wimplicit -Wreturn-type -Wmissing-declarations -Wno-sign-compare -O2 -fexpensive-optimizations -falign-functions=4 -DUNICODE -DDYNAMIC_FFI -I.
-x none libcharset.a libavcall.a libcallback.a -lreadline -lncurses -ldl -lsigsegv -L/usr/X11R6/lib
SAFETY=0 HEAPCODES LINUX_NOEXEC_HEAPCODES GENERATIONAL_GC SPVW_BLOCKS SPVW_MIXED TRIVIALMAP_MEMORY
libsigsegv 2.2
libreadline 4.3
Features:
(READLINE REGEXP SYSCALLS I18N LOOP COMPILER CLOS MOP CLISP ANSI-CL COMMON-LISP LISP=CL INTERPRETER SOCKETS GENERIC-STREAMS LOGICAL-PATHNAMES SCREEN FFI GETTEXT UNICODE BASE-CHAR=CHARACTER PC386 UNIX)
C Modules: (clisp i18n syscalls regexp readline)
Installation directory: /usr/local/languages/clisp-2.39-pjb1-regexp/lib/clisp/
User language: ENGLISH
Machine: I686 (I686) janus-1.janus.afaa.asso.fr [195.114.85.145]
[1]>
*** - UNIX error 14 (EFAULT): Bad address
(ext:quit)

[pjb@thalassa httpd]$ ssh -t janus-1 clisp --version
GNU CLISP 2.30 (released 2002-09-15) (built on bragg.suse.de [127.0.0.2])
Features: (CLOS LOOP COMPILER CLISP ANSI-CL COMMON-LISP LISP=CL INTERPRETER SOCKETS GENERIC-STREAMS LOGICAL-PATHNAMES SCREEN FFI UNICODE BASE-CHAR=CHARACTER SYSCALLS PC386 UNIX)
Connection to janus-1 closed.

[pjb@thalassa httpd]$ ls -l | clisp --version 2> /tmp/errs | cat
GNU CLISP 2.39 (2006-07-16) (built 3364813332) (memory 3364813914)
Software: GNU C 3.3 20030226 (prerelease) (SuSE Linux)
gcc -g -O2 -W -Wswitch -Wcomment -Wpointer-arith -Wimplicit -Wreturn-type -Wmissing-declarations -Wno-sign-compare -O2 -fexpensive-optimizations -falign-functions=4 -DUNICODE -DDYNAMIC_FFI -I.
-x none libcharset.a libavcall.a libcallback.a -lreadline -lncurses -ldl -lsigsegv -L/usr/X11R6/lib
SAFETY=0 HEAPCODES LINUX_NOEXEC_HEAPCODES GENERATIONAL_GC SPVW_BLOCKS SPVW_MIXED TRIVIALMAP_MEMORY
libsigsegv 2.4
libreadline 4.3
Features:
(READLINE REGEXP SYSCALLS I18N LOOP COMPILER CLOS MOP CLISP ANSI-CL COMMON-LISP LISP=CL INTERPRETER SOCKETS GENERIC-STREAMS LOGICAL-PATHNAMES SCREEN FFI GETTEXT UNICODE BASE-CHAR=CHARACTER PC386 UNIX)
C Modules: (clisp i18n syscalls regexp readline)
Installation directory: /usr/local/languages/clisp-2.39-pjb1-regexp/lib/clisp/
User language: ENGLISH
Machine: I686 (I686) thalassa.informatimago.com [62.93.174.79]

----------------------------------------------------------------------

>Comment By: Sam Steingold (sds)
Date: 2006-11-17 12:24

Message:
Logged In: YES
user_id=5735
Originator: NO

>I think I should now report a kernel bug

please do.
this way we will learn from the true source of knowledge about the
subject.
BTW, SF CF also gives you solaris, *bsd, and macos, did you try testing
there?

----------------------------------------------------------------------

Comment By: Jörg Höhle (hoehle)
Date: 2006-11-17 10:48

Message:
Logged In: YES
user_id=377168
Originator: NO

For the record, when trying and changing force-output to finish-output
in FRESH-LINE, clisp randomly(!) outputs one of two forms given the
following command:

./lisp1.run -M lispinit.mem -x '(progn (format *standard-output* "~&line1~.") (format *error-output* "~&line2~.") *error-output*)' | cat

Either:

line2 ;<<<----!!!
i i i i i i i ooooo o ooooooo ooooo ooooo
[...banner...]
Copyright (c) Sam Steingold, Bruno Haible 2001-2006
line1
#<OUTPUT UNBUFFERED FILE-STREAM CHARACTER #P"/dev/fd/2">

or:

i i i i i i i ooooo o ooooooo ooooo ooooo
[...banner...]
Copyright (c) Sam Steingold, Bruno Haible 2001-2006
line1
line2#<OUTPUT UNBUFFERED FILE-STREAM CHARACTER #P"/dev/fd/2">

These results were incredible at first. They tell a lot about
scheduling in Linux.
One might raise the question about whether it's CLISP's job to
automatically FINISH-OUTPUT in such a case when the programmer didn't
bother. I.e. is the first form of output really unacceptable?

$ date --foo | date --version

produces exactly similar symptoms on my Linux box: randomly, the
complaint about foo appears before or after the version text.

In summary, the change is not in CVS and I'm not really satisfied with
the current handling of stdin/stderr or *terminal-io* handling in
clisp.

In any case, I think I should now report a kernel bug and ask for
harmonization of errno codes or the origin of EFAULT. My findings are:

$ ~/Bugs/tcdrain
tcdrain(0)=(0,0)
tcdrain(1)=(0,0)
tcdrain(2)=(0,0)
$ ~/Bugs/tcdrain | cat
tcdrain(0)=(0,0)
tcdrain(1)=(-1,22) # EINVAL = Invalid argument
tcdrain(2)=(0,0)
$ ~/Bugs/tcdrain 2>/dev/null
tcdrain(0)=(0,0=)
tcdrain(1)=(0,0=)
tcdrain(2)=(-1,25) # ENOTTY = Inappropriate ioctl for device
$ echo foo | ~/Bugs/tcdrain
tcdrain(0)=(-1,22)
tcdrain(1)=(0,0)
tcdrain(2)=(0,0)
# Now I'd expect an error similar to the previous ones, or possibly EOPNOTSUPP, but not EFAULT:
$ ssh localhost ~/Bugs/tcdrain
tcdrain(0)=(-1,14) # EFAULT = Bad address
tcdrain(1)=(-1,14) # EFAULT
tcdrain(2)=(-1,14) # EFAULT
$ ssh -t localhost ~/Bugs/tcdrain
tcdrain(0)=(0,0)
tcdrain(1)=(0,0)
tcdrain(2)=(0,0)
Connection to localhost closed.

Does anybody know how to redirect I/O to/from a socket on the
command-line, without intervening ssh or installing inetd?

----------------------------------------------------------------------

Comment By: Jörg Höhle (hoehle)
Date: 2006-11-10 04:39

Message:
Logged In: YES
user_id=377168

I don't like Sam's suggestion of adding even more cases to the terminal
IO detection. It makes for less reliable SW. Why? Because programmers
try out their software in few situations using "clisp" or "ssh
localhost clisp" or "clisp | cat".
The more distinctions there are, the more likely somebody other than
the programmer will run into a case where something fails.

The more understandable the distinctions, the better. We should be able
to explain the behaviour to the user, and s/he should understand it.

I observed the following surprising situation:

clisp | tee foo
*error-output* == *standard-output* = #<terminal-io>
(error output goes to the pipe)

which may be fine in itself, but is not what a UNIX guy expects.

About stream.d:fresh_line
I'd like to query whether it's needed for FRESH-LINE to call
FINISH-OUTPUT instead of FORCE-OUTPUT. The reasoning is that
"fresh-line is similar to TERPRI" (21.2). It's output only. For output,
all that counts to me is that sometimes I want to empty internal
buffers. But there's no reason to wait until output is reported
complete to continue. Waiting for completion is a potentially expensive
operation. It's opposed to the idea of streaming.
(This is not off topic, because fresh-line is what causes the many
EFAULT errors via finish-output)

I agree with Bruno that fsync() and tcdrain() must not depend on each
other (unless one succeeds, of course, which the current patch
implements). One could argue that if tcdrain() returns ENOTTY, there's
no need to call ioctl() to try the same in other words. But that's a
minor thing now.

I'd like to know whether finish_tty_output() is specified to handle all
kinds of streams, or whether it should be restricted to (pseudo) ttys
only. In the latter case, that would mean there is a bug in CLISP which
causes this function to be used even on sockets and pipes.

For now, it looks like I have to report a kernel (or glibc?) bug about
tcdrain() returning EFAULT on a socket (and also ssh, probably for the
same reason). Yet we could work around the bug in some way (e.g. via
fresh-line -> force-output instead of finish-output, even though direct
calls to finish-output would fail).
If nobody complains, I'll have FRESH-LINE use FORCE-OUTPUT. That will
make the --version symptom disappear (and possibly provide some
speed-up).

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-11-09 13:58

Message:
Logged In: YES
user_id=5735

Bruno writes:

> it makes sense to ignore all errors in finish_tty_output() et al.
> the risk is that we may miss some bugs this way.

The risk is high: EFAULT in particular means that you/we/the tcdrain
function has passed to the kernel an invalid memory address. EFAULT is
a friendly warning. On other kernels / in other situations, it could
overwrite arbitrary regions of memory and make the program crash later
on.

=> NEVER ignore EFAULT, unless you have investigated it in depth and
are 200% convinced that it's a kernel bug.

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-11-09 10:14

Message:
Logged In: YES
user_id=5735

I started to think along the lines of this patch:

--- stream.d 09 Nov 2006 06:01:30 -0500 1.570
+++ stream.d 09 Nov 2006 10:13:17 -0500
@@ -14911,22 +14911,32 @@
   return O(standard_error_file_stream);
 }
 
-# UP: Returns the default value for *terminal-io*.
-# can trigger GC
-local maygc object make_terminal_io (void) {
-  # If stdin or stdout is a file, use a buffered stream instead of an
-  # unbuffered terminal stream. For the ud2cd program used as filter,
-  # this reduces the runtime on Solaris from 165 sec to 47 sec.
-  var bool stdin_file = regular_handle_p(stdin_handle);
-  var bool stdout_file = regular_handle_p(stdout_handle);
+/* UP: Returns the default value for *terminal-io*.
+ > batch_p : is this an interactive session?
+ can trigger GC */
+local maygc object make_terminal_io (bool batch_p) {
+  /* If stdin or stdout is a file, use a buffered stream instead of an
+     unbuffered terminal stream. For the ud2cd program used as filter,
+     this reduces the runtime on Solaris from 165 sec to 47 sec. */
+  var bool stdin_file;
+  var bool stdout_file;
+  if (batch_p) {
+    begin_system_call();
+    stdin_file = !isatty(stdin_handle);
+    stdout_file = !isatty(stdout_handle);
+    end_system_call();
+  } else {
+    stdin_file = regular_handle_p(stdin_handle);
+    stdout_file = regular_handle_p(stdout_handle);
+  }
   if (stdin_file || stdout_file) {
     /* Input side: */
-    var object istream =
-      (stdin_file ? get_standard_input_file_stream() : make_terminal_stream());
+    var object istream = (stdin_file ? get_standard_input_file_stream()
+                          : make_terminal_stream());
     pushSTACK(istream);
     /* Output side: */
-    var object ostream =
-      (stdout_file ? get_standard_output_file_stream() : make_terminal_stream());
+    var object ostream = (stdout_file ? get_standard_output_file_stream()
+                          : make_terminal_stream());
     /* Build a two-way-stream: */
     return make_twoway_stream(popSTACK(),ostream);
   }
@@ -15034,7 +15044,7 @@
   end_call();
  #endif
   {
-    var object stream = make_terminal_io();
+    var object stream = make_terminal_io(batch_p);
     define_variable(S(terminal_io),stream); # *TERMINAL-IO*
   }
   {

but it does not help - I get a stream of

[stream.d:3479] [stream.d:3479] *** - UNIX error 14 (EFAULT): Bad address
[stream.d:3479] [stream.d:3479] *** - UNIX error 14 (EFAULT): Bad address
[stream.d:3479] [stream.d:3479] *** - UNIX error 14 (EFAULT): Bad address
[stream.d:3479] [stream.d:3479] *** - UNIX error 14 (EFAULT): Bad address
[stream.d:3479] [stream.d:3479] *** - UNIX error 14 (EFAULT): Bad address
[stream.d:3479] [stream.d:3479] *** - UNIX error 14 (EFAULT): Bad address

instead of a single error.
looks like you are right -- we need to fix tty flushing.

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-11-09 09:39

Message:
Logged In: YES
user_id=5735

... especially since this way readline is indeed available under ssh
which is a good thing.
but this does not solve the "clisp --version" issue because it is kind
of hard to sell that --version requires a tty.
here the "interactive" vs "non-interactive" distinction should help.
alas, init_streamvars batch_p argument does not prevent the creation of
a terminal tty stream because a batch (shell script) job may still want
to interact with the user.
maybe we should make batch_p take 3 values:
0: normal, 1: shell script (like now); 2: non-interactive (no tty), for
--version.

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-11-09 09:31

Message:
Logged In: YES
user_id=5735

another option is to add a note to FAQ that clisp under ssh requires
"-t" and forget the whole issue.

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-11-09 09:29

Message:
Logged In: YES
user_id=5735

if you follow my suggestion in "1", are the i/o streams created
unbuffered?
if yes, this is a valid avenue.
you will need to send the patch to our cygwin maintainer Reini Urban
and ask him to check if this change makes readline unavailable on
cygwin console, xterm, rxvt &c.
actually, it would be nice to check that for all "exotic" consoles, but
we have access to only these:
linux xterm, console, rxvt; woe32 console; cygwin console, xterm, rxvt
and cygwin is the only one affected.
(there is also an issue of clisp under gdb on rxvt &c :-)

----------------------------------------------------------------------

Comment By: Jörg Höhle (hoehle)
Date: 2006-11-09 05:31

Message:
Logged In: YES
user_id=377168

Some facts:
;ssh: fsync: (-1,22) EINVAL no error
;ssh: tcdrain: (-1,14) EFAULT error
;pipe: tcdrain: (-1,22) EINVAL error ; raises no error in CLISP
;socket: tcdrain (-1,14) EFAULT

However, (finish-output #<socket stream>) does not raise an error,
because that goes to
local void low_finish_output_unbuffered_pipe (object stream) {} # do nothing
instead of finish_tty_output.
I'd say there's no bug in ssh, it just returns what a socket yields.

Open issues:
1. is it normal that finish_tty_output gets called in that situation?
2a. if it is, consider adding EFAULT to the list of acceptable returns
2b. if it is, should it try all three methods in turn, or should it
rather implement
#if HAVE_FSYNC
 # try only that
#elif HAVE_TERMIOS
 # or only that
#elif HAVE_IOCTL

----------------------------------------------------------------------

Comment By: Jörg Höhle (hoehle)
Date: 2006-11-09 05:14

Message:
Logged In: YES
user_id=377168

1. Indeed, I was wondering why finish_tty_output would be concerned
when obviously there's no tty. Actually, this may be the reason:
local void low_finish_output_unbuffered_handle (object stream) {
  finish_tty_output(TheHandle(TheStream(stream)->strm_ochannel));

2. I've learnt not to deduce anything from missing exceptional
situations in the CLHS. E.g. WRITE does not mention exceptional I/O
situations. Is that an argument to ignore them?

My first patch was broken, I've a fix pending -- but that does not
solve the ssh issue.

Another solution path is to explore why only --version is affected,
i.e. why that invokes (finish-output *terminal-io*) while a normal
interactive session does not.

Note that even after applying my patch, within ssh:
[1]> (finish-output *terminal-io*)
*** - UNIX error 14 (EFAULT): Bad address
The following restarts are available:
ABORT :R1 ABORT
Break 1 [2]> (finish-output *terminal-io*)
*** - UNIX error 14 (EFAULT): Bad address
*** - UNIX error 14 (EFAULT): Bad address
*** - UNIX error 14 (EFAULT): Bad address
*** - UNIX error 14 (EFAULT): Bad address
Weird, isn't it?

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-11-08 13:09

Message:
Logged In: YES
user_id=5735

sorry Jorg, my comment crossed yours.
also, please avoid '# ' comments in new code.
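
The open issues above (which methods finish_tty_output should try, and
which errno values it may swallow) can be illustrated outside CLISP with
a few lines of plain C. This is only a sketch of one possible policy,
not CLISP code: the names drain_status, classify_drain_errno and
drain_fd are hypothetical, only fsync() and tcdrain() are tried, and the
choice to treat ENOTTY and EINVAL as "not applicable" while surfacing
everything else (EFAULT included) is an assumption taken from the
discussion in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical names for illustration; none of these exist in CLISP. */
enum drain_status { DRAIN_OK, DRAIN_NOT_APPLICABLE, DRAIN_FAILED };

/* Map an errno from fsync()/tcdrain() to a policy decision.
   ENOTTY and EINVAL are read as "this descriptor does not support the
   call"; anything else, EFAULT included, counts as a real failure. */
static enum drain_status classify_drain_errno(int err) {
  if (err == 0) return DRAIN_OK;
  if (err == ENOTTY || err == EINVAL) return DRAIN_NOT_APPLICABLE;
  return DRAIN_FAILED;
}

/* Try fsync(), then tcdrain(), and stop at the first success.
   If neither succeeds and at least one errno was a real failure,
   return DRAIN_FAILED and store the first such errno in *failed_errno. */
static enum drain_status drain_fd(int fd, int *failed_errno) {
  enum drain_status worst = DRAIN_NOT_APPLICABLE;
  assert(failed_errno != NULL);
  *failed_errno = 0;

  if (fsync(fd) == 0) return DRAIN_OK;
  if (classify_drain_errno(errno) == DRAIN_FAILED) {
    worst = DRAIN_FAILED;
    *failed_errno = errno;
  }

  if (tcdrain(fd) == 0) return DRAIN_OK;
  if (worst != DRAIN_FAILED && classify_drain_errno(errno) == DRAIN_FAILED) {
    worst = DRAIN_FAILED;
    *failed_errno = errno;
  }
  return worst;
}
```

A caller would report an error only for DRAIN_FAILED. Under this policy
the EFAULT seen under ssh would still be reported, which matches the
"never ignore EFAULT" position but does not make the --version symptom
go away by itself.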

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-11-08 12:58

Message:
Logged In: YES
user_id=5735

there are two separate issues here:

1. under SSH, make_terminal_io() should call make_twoway_stream()
instead of make_terminal_stream().
this requires replacing regular_handle_p() calls with isatty() there.
the risk is that in some obscure cases (cygwin xterm &c) this would
lose us readline.

2. FINISH-OUTPUT does not list exceptional situations due to OS
interaction, so it makes sense to ignore all errors in
finish_tty_output() et al.
the risk is that we may miss some bugs this way.

----------------------------------------------------------------------

Comment By: Jörg Höhle (hoehle)
Date: 2006-11-08 12:47

Message:
Logged In: YES
user_id=377168

thank you for your bug report.
the bug has been fixed in the CVS tree.
you can either wait for the next release (recommended)
or check out the current CVS tree (see http://clisp.cons.org)
and build CLISP from the sources (be advised that between
releases the CVS tree is very unstable and may not even build
on your platform).

----------------------------------------------------------------------

Comment By: Jörg Höhle (hoehle)
Date: 2006-11-08 12:30

Message:
Logged In: YES
user_id=377168

Confirmed.
Two things are remarkable about the gdb backtrace

Third, stream.d:finish_tty_output() looks strange. It does both fsync()
and ioctl(). I'd have expected either one or the other. Maybe the
#ifdef are broken? Or return; on success is missing, before trying the
next method in sequence?

1. a link to [ Bug #1220548 ] -- See how reset(count=<broken value>) is
found on the stack.
#11 0x08065b70 in interpret_bytecode_ (closure=0x20385cfe, codeptr=0x202aa5dc, byteptr_in=0x42 <Address 0x42 out of bounds>) at eval.d:7077
#12 0x080667e8 in funcall_closure (closure=0xb7b24084, args_on_stack=<value optimized out>) at eval.d:5771
#13 0x080dab73 in driver () at debug.d:477
#14 0x0805b3ec in reset (count=3081912564) at eval.d:517
#15 0x080665b7 in interpret_bytecode_ (closure=0x20387996, codeptr=0x20343fc4, byteptr_in=0x20344004 "\022\002\031\005") at eval.d:7608

2. the problem itself:
#25 0x0805d111 in funcall_subr (fun=0x81a0e06, args_on_stack=0) at eval.d:5325
#26 0x080dd8e6 in signal_and_debug (condition=0xbfbb3e10) at error.d:204
#27 0x080ddb4b in end_error (stackptr=<value optimized out>, start_driver_p=true) at error.d:317
#28 0x080dfa7b in OS_error () at errunix.d:688
#29 0x080851f7 in low_finish_output_unbuffered_handle (stream=0xbfbb3e10) at stream.d:3493
#30 0x08085c7d in finish_output_unbuffered (stream=0x2038709e) at stream.d:5678
#31 0x08096954 in fresh_line (stream_=0xb7b24008) at stream.d:16607
#32 0x080a1b0d in C_fresh_line () at io.d:10610
#33 0x0805d18c in funcall_subr (fun=0x81a19e6, args_on_stack=1) at eval.d:5330
#34 0x08052884 in quit () at spvw.d:3423
#35 0x08068c61 in C_exit () at control.d:15

stream.d:3493 contains
local void finish_tty_output (Handle handle)
...
#if defined(UNIX_TERM_TERMIOS) && defined(TCGETS) && defined(TCSETSW)
  {
    var struct termios term_parameters;
    if (!( ( ioctl(handle,TCGETS,&term_parameters) ==0)
           && ( ioctl(handle,TCSETSW,&term_parameters) ==0))) {
      if (!((errno==ENOTTY)||(errno==EINVAL))) { OS_error(); } # no TTY: OK, report other Error
    }
  }
Actually, tcdrain() is unsupported (gdb line number must be out of
sync) and yields (-1,14).

4. It's curious that this only appears with --version.
(finish-output *terminal-io*) also raises the error within ssh.
Obviously, finish-output is almost never called.

5. Maybe a bug should be reported to ssh, that tcdrain() yields (-1,14)

Could you please investigate other situations (inside socket etc.)
using this code?
(use-package "FFI")
(def-call-out tcdrain (:arguments (fd int)) (:return-type int)
  (:library :default))
(locally (declare (compile))
  (setf errno 0)
  (values (tcdrain 1) errno))

----------------------------------------------------------------------
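
The same probe can be run without a Lisp image. The source of the
~/Bugs/tcdrain helper quoted earlier in the thread is not shown, so the
following C is only a guess at an equivalent; probe_tcdrain,
report_tcdrain and probe_tcdrain_on_socket are hypothetical names. The
socket variant uses an AF_UNIX socketpair() so that neither ssh nor
inetd is needed; that is an assumption of this sketch, and a TCP socket
(as used by ssh) may report a different errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <termios.h>
#include <unistd.h>

/* Return value of one tcdrain() call plus the errno it left (0 on success). */
struct drain_result { int ret; int err; };

/* Call tcdrain() on fd and capture errno immediately, before any other
   library call can overwrite it. */
static struct drain_result probe_tcdrain(int fd) {
  struct drain_result r;
  errno = 0;
  r.ret = tcdrain(fd);
  r.err = (r.ret == 0) ? 0 : errno;
  return r;
}

/* Print one result in the "tcdrain(fd)=(ret,errno)" form used in the thread. */
static void report_tcdrain(FILE *out, int fd) {
  struct drain_result r;
  assert(out != NULL);
  r = probe_tcdrain(fd);
  fprintf(out, "tcdrain(%d)=(%d,%d)\n", fd, r.ret, r.err);
}

/* Probe one end of a local socket pair. Returns ret = -2 (with errno in
   err) if the socket pair itself could not be created. */
static struct drain_result probe_tcdrain_on_socket(void) {
  int sv[2];
  struct drain_result r = { -2, 0 };
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
    r.err = errno;
    return r;
  }
  r = probe_tcdrain(sv[0]);
  close(sv[0]);
  close(sv[1]);
  return r;
}
```

Calling report_tcdrain(stdout, fd) for fd 0, 1 and 2 from a small main()
gives one line per descriptor, and the program can then be run plainly,
through a pipe, or under "ssh localhost" to compare the errno values on
a given kernel.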