From: Daniel J. <dan...@gm...> - 2017-03-29 18:27:12
I'm currently trying to find the source of a very strange bug.

Steps to reproduce:

<<<<<<<<<------------CUT--------------->>>>>>>>>>
(defvar *s* (socket-connect 12345 "localhost" :TIMEOUT 0))
(gc)
*s* ;; or any other use of *s*
<<<<<<<<<------------CUT-END----------->>>>>>>>>>

Console output:

<<<<<<<<<------------CUT--------------->>>>>>>>>>
[1]> (defvar *s* (socket-connect 12345 "localhost" :TIMEOUT 0))
*S*
[2]> (gc)
3516208 ;
879052 ;
165312 ;
1 ;
70936 ;
9000
[3]> *s*

*** - handle_fault error1 !
SIGSEGV cannot be cured. Fault address = 0x0.
GC count: 1
Space collected by GC: 70936
Run time: 0 25000
Real time: 19 738446
GC time: 0 9000
Permanently allocated: 165312 bytes.
Currently in use: 3523648 bytes.
Free space: 871612 bytes.
Segmentation fault
<<<<<<<<<------------CUT-END----------->>>>>>>>>>

So far I was only able to reproduce this with the above socket stream.
Lists, arrays, other streams all correctly survive a garbage collection
cycle. Furthermore, this also happens with let bindings and defparameter.

Configure flags: --with-readline --with-dynamic-ffi --with-ffcall
operating system: Linux 4.4.30 x86_64 GNU/Linux (NixOS)
hg commit: d2b04f050f97

software-version and software-type output follows:

<<<<<<<<<------------CUT--------------->>>>>>>>>>
[1]> (software-version)
"GNU-C 5.4.0"
[2]> (software-type)
"gcc -g -O2 -W -Wswitch -Wcomment -Wpointer-arith -Wreturn-type -Wmissing-declarations -Wimplicit -Wno-sign-compare -Wno-format-nonliteral -Wno-shift-negative-value -O -fwrapv -fno-strict-aliasing -DENABLE_UNICODE -DDYNAMIC_FFI -DDYNAMIC_MODULES libgnu.a -lreadline -lncurses -ldl -lavcall -lcallback -lsigsegv
SAFETY=0 TYPECODES WIDE_HARD GENERATIONAL_GC SPVW_BLOCKS SPVW_PURE SINGLEMAP_MEMORY
libsigsegv 2.10
libreadline 6.3"
<<<<<<<<<------------CUT-END----------->>>>>>>>>>

I was able to further narrow down possible sources of this bug:
socket-connect returns a two-way-stream, by default with the input
buffered and the output unbuffered.
Passing :buffered T makes both buffered, and still segfaults.
Passing :buffered nil makes both streams (input and output) unbuffered,
and does NOT segfault anymore.
Thus I suspect that some part of the buffered input stream is not visible
to garbage collection.

More interesting behaviour: the above commands need to be issued on
different lines. Otherwise, if any of the three commands (or all of them)
appear on the same line, then I get this:

<<<<<<<<<------------CUT--------------->>>>>>>>>>
[1]> (defvar *s* (socket-connect 12345 "localhost" :TIMEOUT 0)) (gc) *s*
*S*
[2]>
3516816 ;
879204 ;
165312 ;
1 ;
67816 ;
10000
[3]>
*** - An array has been shortened by adjusting it while another array was displaced to it.
The following restarts are available:
ABORT :R1 Abort main loop
Break 1 [4]> :bt
<1/183> #<SYSTEM-FUNCTION SHOW-STACK> 3
<2/176> #<COMPILED-FUNCTION SYSTEM::PRINT-BACKTRACE>
<3/170> #<COMPILED-FUNCTION SYSTEM::DEBUG-BACKTRACE>
<4/161> #<SYSTEM-FUNCTION SYSTEM::READ-EVAL-PRINT> 2
<5/158> #<COMPILED-FUNCTION SYSTEM::BREAK-LOOP-2-3>
<6/154> #<SYSTEM-FUNCTION SYSTEM::SAME-ENV-AS> 2
<7/140> #<COMPILED-FUNCTION SYSTEM::BREAK-LOOP-2>
<8/138> #<SYSTEM-FUNCTION SYSTEM::DRIVER>
- T
Printed 8 frames
Break 1 [4]> :m 1
Break 1 [4]> :bt
<1/183> #<SYSTEM-FUNCTION SHOW-STACK> 3
<2/176> #<COMPILED-FUNCTION SYSTEM::PRINT-BACKTRACE>
<3/170> #<COMPILED-FUNCTION SYSTEM::DEBUG-BACKTRACE>
<4/161> #<SYSTEM-FUNCTION SYSTEM::READ-EVAL-PRINT> 2
<5/158> #<COMPILED-FUNCTION SYSTEM::BREAK-LOOP-2-3>
<6/154> #<SYSTEM-FUNCTION SYSTEM::SAME-ENV-AS> 2
<7/140> #<COMPILED-FUNCTION SYSTEM::BREAK-LOOP-2>
<8/138> #<SYSTEM-FUNCTION SYSTEM::DRIVER>
- T
- NIL
<9/98> #<COMPILED-FUNCTION SYSTEM::BREAK-LOOP>
<10/95> #<SYSTEM-FUNCTION INVOKE-DEBUGGER> 1
[94] frame binding variables (~ = dynamically):
  | ~ SYSTEM::*PRIN-STREAM* <--> #<UNBOUND>
[88] frame binding variables (~ = dynamically):
  | ~ *PRINT-ESCAPE* <--> T
[84] frame binding variables (~ = dynamically):
  | ~ SYSTEM::*PRIN-JBLOCKS* <--> #<UNBOUND>
[78] frame binding variables (~ = dynamically):
  | ~ SYSTEM::*PRIN-JBMODUS* <--> #<UNBOUND>
[72] frame binding variables (~ = dynamically):
  | ~ SYSTEM::*PRIN-TRAILLENGTH* <--> 0
[66] frame binding variables (~ = dynamically):
  | ~ SYSTEM::*PRIN-LM* <--> 0
-
*** - Internal error: statement in file "../src/lispbibl.d", line 14210 has been reached!!
Please see <http://clisp.org/impnotes/faq.html#faq-bugs> for bug reporting instructions.
The following restarts are available:
ABORT :R1 Abort debug loop
ABORT :R2 Abort main loop
Break 2 [5]>
<<<<<<<<<------------CUT-END----------->>>>>>>>>>

The given line number seems to be completely off; at the moment I cannot
see how it could be related in any way.

Any ideas? I'll try to examine this behaviour using gdb if I have time
later. hg bisect wasn't so useful due to other errors making it hard to
see when this was introduced (must have been after January this year).
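
For anyone who wants to retry the buffering comparison described above, the three variants can be written out roughly as follows. This is only a sketch under the same assumptions as the original report: something is listening on localhost port 12345, each variant is run in a fresh CLISP session, and every form is entered on its own line. The comments restate the outcomes reported in this message; they have not been re-verified on other builds.

```lisp
;; Sketch only: run each variant in a separate, fresh CLISP session,
;; entering one form per line at the REPL.
;; Assumption: a server is listening on localhost:12345, as in the report.

;; Variant 1: default buffering (input buffered, output unbuffered).
;; Reported above to segfault when *s* is used after the GC.
(defvar *s* (socket-connect 12345 "localhost" :timeout 0))
(gc)
*s*

;; Variant 2: both directions buffered.
;; Reported above to segfault as well.
(defvar *s* (socket-connect 12345 "localhost" :timeout 0 :buffered t))
(gc)
*s*

;; Variant 3: both directions unbuffered.
;; Reported above NOT to segfault.
(defvar *s* (socket-connect 12345 "localhost" :timeout 0 :buffered nil))
(gc)
*s*
(close *s*)
```

Since only the variants with a buffered input side fail, comparing variant 3 against the other two is what points at the buffered input stream.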