#547 mt: clisp hangs repeating sched_yield()

lisp error
closed-fixed
5
2010-04-15
2010-04-15
Sam Steingold
No

<http://article.gmane.org/gmane.lisp.clisp.devel/21551>
<http://article.gmane.org/gmane.lisp.clisp.devel/21558>

every now and then I observe the following weird behavior from clisp (cvs head)
mt amd64 builds: a seemingly innocent form (e.g., (describe 'i18n:gettext)) can make clisp
unresponsive; it then ignores C-c and pegs the cpu at 100%.
"killall lisp.run" prints the message "Exiting on signal 15" but changes nothing
(clisp does not exit and the load stays at 100%).
only kill -9 terminates clisp.
"strace -f" prints a stream of
[pid 27228] sched_yield() = 0

under gdb: C-c does nothing, while C-z gives:

Program received signal SIGTSTP, Stopped (user).
[Switching to Thread 0x2abe3aa4e160 (LWP 19001)]
0x00002abe3a4e82e4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) where
#0 0x00002abe3a4e82e4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00002abe3a4e3c3a in _L_lock_1034 () from /lib64/libpthread.so.0
#2 0x00002abe3a4e3afc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x000000000041c942 in gc_suspend_all_threads (lock_heap=false)
at ../src/spvw_global.d:577
#4 0x00000000004453c6 in signal_handler_thread (arg=0x0) at ../src/spvw.d:4647
#5 0x000000000043ce13 in main (argc=11, argv=0x7fff070773b8)
at ../src/spvw.d:3844

(gdb) info threads
2 Thread 0x42430940 (LWP 19004) 0x00002abe3a7b0937 in sched_yield ()
from /lib64/libc.so.6
* 1 Thread 0x2abe3aa4e160 (LWP 19001) 0x00002abe3a4e82e4 in __lll_lock_wait ()
from /lib64/libpthread.so.0

(gdb) thread 2
[Switching to thread 2 (Thread 0x42430940 (LWP 19004))]#0 0x00002abe3a7b0937
in sched_yield () from /lib64/libc.so.6
(gdb) where
#0 0x00002abe3a7b0937 in sched_yield () from /lib64/libc.so.6
#1 0x00000000004e25f6 in rd_ch_terminal3 (stream_=0x2abe39a641a8)
at ../src/stream.d:9736
#2 0x00000000004f6f9d in read_line (stream_=0x2abe39a641a8,
buffer_=0x2abe39a641c8) at ../src/stream.d:16321
#3 0x0000000000517e24 in C_read_line () at ../src/io.d:4482
#4 0x000000000046259d in funcall_subr (fun={one_o = 281474986531264},
args_on_stack=3) at ../src/eval.d:5227
#5 0x00000000004610dd in funcall (fun={one_o = 281474986531264},
args_on_stack=3) at ../src/eval.d:4860
#6 0x00000000005c4655 in read_form () at ../src/debug.d:236
#7 0x00000000005c6229 in C_read_eval_print () at ../src/debug.d:400
#8 0x000000000046259d in funcall_subr (fun={one_o = 281474986525216},
args_on_stack=2) at ../src/eval.d:5227
#9 0x0000000000461161 in funcall (fun={one_o = 1125899916718560},
args_on_stack=2) at ../src/eval.d:4867
#10 0x0000000000468ce1 in interpret_bytecode_ (closure=
{one_o = 2533288821663024}, codeptr=0x3444e99e8,
byteptr=0x3444e9a2f "\037\a\a") at ../src/eval.d:6790
#11 0x0000000000463e27 in funcall_closure (closure={one_o = 2533288821663024},
args_on_stack=0) at ../src/eval.d:5630
#12 0x0000000000461105 in funcall (fun={one_o = 2533288821663024},
args_on_stack=0) at ../src/eval.d:4862
#13 0x0000000000483185 in C_driver () at ../src/control.d:1999
#14 0x0000000000468f59 in interpret_bytecode_ (closure=
{one_o = 2533288821300400}, codeptr=0x3444e9960,
byteptr=0x3444e998d "\031\003") at ../src/eval.d:6796
#15 0x0000000000463e27 in funcall_closure (closure={one_o = 2533288821300400},
args_on_stack=0) at ../src/eval.d:5630
#16 0x0000000000461105 in funcall (fun={one_o = 2533288821300400},
args_on_stack=0) at ../src/eval.d:4862
#17 0x000000000046a32c in interpret_bytecode_ (closure=
{one_o = 2533288821649072}, codeptr=0x3444ec3c0,
byteptr=0x3444ec40c "\031\001\230\016\033l") at ../src/eval.d:6845
#18 0x0000000000463e27 in funcall_closure (closure={one_o = 2533288821649072},
args_on_stack=0) at ../src/eval.d:5630
#19 0x0000000000461105 in funcall (fun={one_o = 2533288821649072},
args_on_stack=0) at ../src/eval.d:4862
#20 0x00000000005c6d55 in driver () at ../src/debug.d:478
#21 0x000000000043c974 in main_actions (p=0x987940) at ../src/spvw.d:3600
#22 0x0000000000439cd3 in mt_main_actions (param=0x14ffc010)
at ../src/spvw.d:3624
#23 0x00002abe3a4e1617 in start_thread () from /lib64/libpthread.so.0
#24 0x00002abe3a7c9c2d in clone () from /lib64/libc.so.6
(gdb)

stream.d:9736 is "end_blocking_system_call();"
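
The two backtraces fit a familiar pattern, sketched here in C. This is an illustration only: it is not CLISP code and not a confirmed diagnosis of this bug. It assumes that one thread spins in sched_yield() while holding a lock, and that a second thread must take that same lock before it can let the spinner continue. All names (threads_lock, resume_flag, spinning_worker, run_hang_demo) are hypothetical. So that the demo terminates, the second thread uses pthread_mutex_trylock() where the real code would block in pthread_mutex_lock(), and a give_up flag releases the spinner afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the CLISP sources. */
static pthread_mutex_t threads_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool resume_flag;     /* what the spinner is waiting for */
static atomic_bool worker_has_lock; /* handshake: the spinner owns the lock */
static atomic_bool give_up;         /* demo-only escape hatch */

struct hang_demo {
  int trylock_rc; /* result of the second thread's lock attempt */
  long yields;    /* how many times the spinner called sched_yield() */
  bool resumed;   /* whether resume_flag was ever set */
};

/* Plays the role of thread 2: holds the lock and spins in sched_yield(). */
static void *spinning_worker(void *arg) {
  long *yields = arg;
  int rc = pthread_mutex_lock(&threads_lock);
  assert(rc == 0);
  atomic_store(&worker_has_lock, true);
  do {
    sched_yield(); /* without give_up this loop would peg one cpu forever */
    (*yields)++;
  } while (!atomic_load(&resume_flag) && !atomic_load(&give_up));
  rc = pthread_mutex_unlock(&threads_lock);
  assert(rc == 0);
  return NULL;
}

/* Plays the role of thread 1: needs the lock before it can resume the spinner. */
struct hang_demo run_hang_demo(void) {
  struct hang_demo out = {0, 0, false};
  pthread_t tid;

  atomic_store(&resume_flag, false);
  atomic_store(&worker_has_lock, false);
  atomic_store(&give_up, false);

  int rc = pthread_create(&tid, NULL, spinning_worker, &out.yields);
  assert(rc == 0);
  while (!atomic_load(&worker_has_lock))
    sched_yield(); /* wait until the spinner really owns the lock */

  /* A blocking pthread_mutex_lock() here would never return. */
  out.trylock_rc = pthread_mutex_trylock(&threads_lock);
  if (out.trylock_rc == 0) {
    atomic_store(&resume_flag, true);
    pthread_mutex_unlock(&threads_lock);
  }

  atomic_store(&give_up, true); /* let the demo finish */
  rc = pthread_join(tid, NULL);
  assert(rc == 0);
  out.resumed = atomic_load(&resume_flag);
  return out;
}
```

Calling run_hang_demo() from a small main() (built with cc -pthread) yields trylock_rc == EBUSY and resumed == false: the lock is never obtained, so the flag the spinner waits for is never set. With a blocking lock and no give_up flag, one thread would sit in __lll_lock_wait and the other in sched_yield(), which is what the gdb session above shows.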

adding "rl_catch_signals = 0;" in make_terminal_stream_:

--- stream.d.~1.675.~ 2010-03-18 10:48:19.000000000 -0400
+++ stream.d 2010-04-15 10:45:31.000176000 -0400
@@ -10005,6 +10005,9 @@ local maygc object make_terminal_stream_
var bool same_tty = stdin_tty && stdout_tty && stdio_same_tty_p();
end_system_call();
#ifdef HAVE_TERMINAL3
+ #ifdef MULTITHREAD
+ rl_catch_signals = 0;
+ #endif
if (rl_gnu_readline_p && same_tty && !disable_readline) { /* Build a TERMINAL3-Stream: */
pushSTACK(make_ssstring(80)); /* allocate line-buffer */
pushSTACK(make_ssstring(80)); /* allocate line-buffer */

does not change anything.

Discussion

  • Sam Steingold
    2010-04-15

    • labels: 100543 --> multithreading
    • milestone: 1107844 --> lisp error
     
  • thank you for your bug report.
    the bug has been fixed in the CVS tree.
    you can either wait for the next release (recommended)
    or check out the current CVS tree (see http://clisp.cons.org)
    and build CLISP from the sources (be advised that between
    releases the CVS tree is very unstable and may not even build
    on your platform).

     
  • Sam Steingold
    2010-04-15

    thanks!
    in the future, please close the bug when you fix it.

     
  • Sam Steingold
    2010-04-15

    • status: open --> closed-fixed