#470 segfault on SIGHUP

Milestone: segfault
Status: closed-fixed
Owner: Sam Steingold
Labels: clisp (525)
Priority: 5
Updated: 2008-05-22
Created: 2008-05-03
Creator: Andrew Kroll
Private: No

May 2 07:14:39 shit kernel: lisp.run[19883]: segfault at 1 ip 08068794 sp bf8d26d0 error 6 in lisp.run[8048000+18c000]

I am using CLISP for a project. Unfortunately I can't reveal the code for it, but I can describe the application and list the modules used.

Basically, CLISP is loaded as a web application, using the following switches:

#!/usr/bin/clisp -q -q -q -q -norc
then the code...

This is actually a stub that loads the FAS files and executes the actual application.

Modules used are:

asdf Revision: 1.110, with a small modification to not print certain annoyance messages
cl-ppcre 1.3.2
puri 1.5.1
acl-compat dated 2006-01-22 in the Changelog

There is one other module loaded that is part of the project, but it does not segfault anywhere else.

The only thing I can think of that could cause a similar effect is STDOUT/STDERR going away, as would happen when a web page containing lots of data is aborted.

If there is something I can do to debug this, please let me know so I can home in on the cause and get to you the information you really need to fix the problem. It's not fatal, just annoying, and the information here says to report segfaults... ;-) And it should be cosmetically cleaned up.

I also notice when building CLISP 2.44.1 that there are many segfaults caused during the build phase...

The exact build method can be found at the following URL (it's a slackbuild), in case you would like to look at how I build it. A small note on that, too: I had to increase the ulimit to 32768, or else the build fails. You may want to update the size in the warning printed during configure to that value.

ftp://ftp.uglyplace.org/pkg_dreams/Slackware-12.0.0/repos/development/clisp/2.44.1/src/
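
For anyone hitting the same build failure, the limit can be checked and raised from the shell before running configure. This is only a sketch, and it assumes the limit meant above is the stack size (`ulimit -s`, in kilobytes); the report does not say which resource it was.

```shell
#!/bin/sh
# Assumption: "the ulimit" refers to the soft stack-size limit (ulimit -s, KB).
want=32768
cur=$(ulimit -s)
if [ "$cur" != "unlimited" ] && [ "$cur" -lt "$want" ]; then
    # Raising the soft limit fails if the hard limit is lower.
    ulimit -s "$want" || echo "could not raise stack limit to $want" >&2
fi
echo "stack limit now: $(ulimit -s)"
```

The change applies only to the current shell and the processes started from it, so the build has to be run from that same shell.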

Discussion

  • Sam Steingold
    2008-05-04

    Logged In: YES
    user_id=5735
    Originator: NO

    This bug report is now marked as "pending"/"works for me".
    This means that we think that we cannot reproduce the problem
    and cannot do anything about it.
    Unless you - the reporter - act within 2 weeks
    (e.g., by submitting a self-contained test case
    or answering our other recent requests),
    the bug will be permanently closed.
    Sorry about the inconvenience -
    we hope your silence means that
    you are no longer observing the problem either.

     
  • Sam Steingold
    2008-05-04

    • assigned_to: haible --> sds
    • status: open --> pending-works-for-me
     
  • Sam Steingold
    2008-05-04

    Logged In: YES
    user_id=5735
    Originator: NO

    unless I have a reproducible test case, there is little I can do.
    it would be nice if you could debug this though.
    first thing you could do is strace/ltrace - it could be useful so that I would look at it and say "hmmm, looks weird, I cannot say anything definite" :-)
    then you could compile CLISP with debugging information, run under gdb (or dump core), and give me the backtrace with the values of the relevant variables.
    http://clisp.cons.org/impnotes/faq.html#faq-debug

     
  • Andrew Kroll
    2008-05-05

    Logged In: YES
    user_id=2049191
    Originator: YES

    simple to do...
    open your favorite terminal (I'm using konsole, use xterm or whatever you like).
    start up clisp
    close the terminal window
    check dmesg :-)
    it seems consistent for me

    lisp.run[19330]: segfault at 0 ip 08068794 sp bfb76270 error 6 in lisp.run[8048000+18c000]
    lisp.run[19411]: segfault at 0 ip 08068794 sp bf9e48e0 error 6 in lisp.run[8048000+18c000]

    So it's got to be a case of losing its tty :-)
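
The kernel lines above can be decoded with a little shell arithmetic. The sketch below assumes the x86 page-fault error-code layout (bit 0: protection violation rather than page not present, bit 1: write access, bit 2: user mode); the function names are made up for illustration.

```shell
#!/bin/sh
# Decode an x86 page-fault error code as printed in "segfault at ... error N".
# Bit meanings: 1 = protection violation (else page not present),
#               2 = write (else read), 4 = user mode (else kernel).
decode_fault_error() {
    code=$1
    if [ $((code & 4)) -ne 0 ]; then mode=user; else mode=kernel; fi
    if [ $((code & 2)) -ne 0 ]; then access=write; else access=read; fi
    if [ $((code & 1)) -ne 0 ]; then cause=protection-violation; else cause=page-not-present; fi
    echo "$mode $access $cause"
}

# Position of the faulting instruction relative to the start of the mapping:
# ip minus the base address shown in "lisp.run[base+size]" (both hex, no 0x).
fault_offset() {
    printf '0x%x\n' $(( 0x$1 - 0x$2 ))
}

decode_fault_error 6             # values taken from the dmesg lines above
fault_offset 08068794 8048000
```

For these values the first call reports a user-mode write to a page that is not present, and the second gives 0x20794 as the position of the faulting instruction inside the `lisp.run` mapping.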

     
  • Andrew Kroll
    2008-05-05

    • status: pending-works-for-me --> open-works-for-me
     
  • Andrew Kroll
    2008-05-05

    strace showing segfault

     
  • Andrew Kroll
    2008-05-05

    Logged In: YES
    user_id=2049191
    Originator: YES

    Here is the strace :-)

    File Added: trace.clisp.19618

     
  • Sam Steingold
    2008-05-05

    Logged In: YES
    user_id=5735
    Originator: NO

    I cannot reproduce this on fc8 (i386) and fc5 (x86_64).
    the strace says:
    --- SIGHUP (Hangup) @ 0 (0) ---
    --- SIGSEGV (Segmentation fault) @ 0 (0) ---
    i.e., the segfault is triggered by SIGHUP.
    what happens when you send a sighup to the clisp process explicitly?
    i.e.,
    1. start clisp in a window
    2. kill -1 `pidof lisp.run`
    3. what do you see in the window where you started CLISP?
    what I see is
    Exiting on signal 1
    Hangup
    which is correct.
    strace shows no segfault.
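
The outcome of the test above can also be read from the exit status instead of from the terminal. A sketch, using `sleep` as a stand-in for the process under test (an assumption, so that the snippet runs anywhere); `hup_and_report` is a hypothetical helper name. In bash and dash a process killed by signal N is reported as status 128+N, so 129 means a plain SIGHUP death and 139 means SIGSEGV.

```shell
#!/bin/sh
# Send SIGHUP to a background process and report how it died.
hup_and_report() {
    "$@" &
    pid=$!
    sleep 1                      # give the process time to start
    kill -HUP "$pid"
    status=0
    wait "$pid" || status=$?     # 128 + signal number if killed by a signal
    case $status in
        129) echo "killed by SIGHUP (status 129)" ;;
        139) echo "killed by SIGSEGV (status 139)" ;;
        *)   echo "exit status $status" ;;
    esac
}

hup_and_report sleep 10          # "sleep 10" stands in for the real command
```

An interactive REPL needs its own terminal, so for CLISP itself the manual steps above remain the way to run the test; the sketch only shows how to tell the two outcomes apart.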

     
  • Sam Steingold
    2008-05-05

    • status: open-works-for-me --> pending-works-for-me
     
  • Andrew Kroll
    2008-05-05

    • status: pending-works-for-me --> open-works-for-me
     
  • Andrew Kroll
    2008-05-05

    Logged In: YES
    user_id=2049191
    Originator: YES

    [1]>
    *** - handle_fault error2 ! address = 0x0 not in [0x2025b004,0x203c566c) !
    SIGSEGV cannot be cured. Fault address = 0x0.
    Permanently allocated: 92128 bytes.
    Currently in use: 2136704 bytes.
    Free space: 423144 bytes.
    Segmentation fault

     
  • Sam Steingold
    2008-05-05

    Logged In: YES
    user_id=5735
    Originator: NO

    good, we now have something to work with - we can reproduce the bug without losing the window.
    please build clisp with debug (http://clisp.cons.org/impnotes/faq.html#faq-debug), run under gdb,
    and examine the stack.

     
  • Sam Steingold
    2008-05-05

    • status: open-works-for-me --> pending-works-for-me
     
  • Andrew Kroll
    2008-05-05

    • status: pending-works-for-me --> open-works-for-me
     
  • Andrew Kroll
    2008-05-05

    Logged In: YES
    user_id=2049191
    Originator: YES

    I don't seem to get the segfault under gdb :-/ Here is everything I did, and the results.
    The configure line is exactly the one from the slackbuild, with the "--with-debug --build build-g" flags added, in the hope of getting the program to fall apart... I'll try it vanilla next, as in the FAQ.

    CFLAGS="-O2 -march=i486 -mtune=i686" ./configure --with-debug --build build-g --with-dynamic-ffi --with-module=clx/new-clx --with-module=pcre --with-module=rawsock --with-module=wildcard --with-module=zlib
    cd build-g
    gdb lisp.run
    boot
    run

    Starting program: /root/code/testclisp/clisp-2.44.1/build-g/lisp.run -B . -N locale -E 1:1 -q -norc -M lispinit.mem
    STACK depth: 98206 [0xb7d0bf00 0xb7cac088]
    ffi_sint32: 4 1
    ffi_uintp: 4 0
    [1]>
    Program received signal SIGHUP, Hangup.
    0xb7e02cce in __read_nocancel () from /lib/libc.so.6
    (gdb) bt 20
    #0 0xb7e02cce in __read_nocancel () from /lib/libc.so.6
    #1 0xb7ee8624 in rl_getc () from /usr/lib/libreadline.so.5
    #2 0xb7ee8a93 in rl_read_key () from /usr/lib/libreadline.so.5
    #3 0xb7ed5b4e in readline_internal_char () from /usr/lib/libreadline.so.5
    #4 0xb7ed5fe5 in readline () from /usr/lib/libreadline.so.5
    #5 0x08089676 in rd_ch_terminal3 (stream_=0xb7cac0f4) at ../src/stream.d:9438
    #6 0x0809ca7b in read_line (stream_=0xb7cac0f4, buffer_=0xb7cac104)
    at ../src/stream.d:15835
    #7 0x080aca60 in C_read_line () at ../src/io.d:4533
    #8 0x0805ba27 in funcall_subr (fun={one_o = 136160790}, args_on_stack=3)
    at ../src/eval.d:5184
    #9 0x080ed94f in read_form () at ../src/debug.d:236
    #10 0x080ede79 in C_read_eval_print () at ../src/debug.d:398
    #11 0x0805ba27 in funcall_subr (fun={one_o = 136157334}, args_on_stack=2)
    at ../src/eval.d:5184
    #12 0x08066436 in interpret_bytecode_ (closure=<value optimized out>,
    codeptr=<value optimized out>, byteptr_in=0x2036ecdc "Üì6 \rQ")
    at ../src/eval.d:6743
    #13 0x080692e1 in funcall_closure (closure={one_o = 540470854},
    args_on_stack=0) at ../src/eval.d:5587
    #14 0x0806e596 in C_driver () at ../src/control.d:1971
    #15 0x080652d8 in interpret_bytecode_ (closure=<value optimized out>,
    codeptr=<value optimized out>,
    byteptr_in=0x2036ec9e "Å\017\001Ü1U¤ì6 \022,") at ../src/eval.d:6749
    ---Type <return> to continue, or q <return> to quit---
    #16 0x080692e1 in funcall_closure (closure={one_o = 540470942},
    args_on_stack=0) at ../src/eval.d:5587
    #17 0x08066d30 in interpret_bytecode_ (closure=<value optimized out>,
    codeptr=<value optimized out>, byteptr_in=0x202c64ee "k\0166")
    at ../src/eval.d:6798
    #18 0x080692e1 in funcall_closure (closure={one_o = 540689918},
    args_on_stack=0) at ../src/eval.d:5587
    #19 0x080ed0ae in driver () at ../src/debug.d:476
    (More stack frames follow...)

     
  • Sam Steingold
    2008-05-05

    • status: open-works-for-me --> pending-works-for-me
     
  • Sam Steingold
    2008-05-05

    Logged In: YES
    user_id=5735
    Originator: NO

    1. please remove -O2 from CFLAGS.
    2. so, what happens when you type "c" at the gdb prompt?

     
  • Andrew Kroll
    2008-05-05

    Logged In: YES
    user_id=2049191
    Originator: YES

    No segfault on vanilla either.... Perhaps it is due to stripping the binary? If so, why would that really matter?

    I'll build like normal, minus the stripping, and see if the error still rears its ugly head.

    I will note, though, that if I run the segfaulting binary under gdb, it segfaults *A LOT* but continues on, and also barfs at the end. This is why I suspect the problem is the stripped binary, although that shouldn't really matter!

     
  • Andrew Kroll
    2008-05-05

    • status: pending-works-for-me --> open-works-for-me
     
  • Andrew Kroll
    2008-05-05

    Logged In: YES
    user_id=2049191
    Originator: YES

    Negative on the stripping too... Seems this is going to be one tough sucker to track down :-)
    And yes I did the continues, no segfaults, I simply did not paste them...
    [1]>
    Program received signal SIGHUP, Hangup.
    0xb7e65cce in __read_nocancel () from /lib/libc.so.6
    (gdb) continue
    Continuing.

    Exiting on signal 1

    Program received signal SIGHUP, Hangup.
    0xb7dcec87 in raise () from /lib/libc.so.6
    (gdb) continue
    Continuing.

    Program terminated with signal SIGHUP, Hangup.
    The program no longer exists.
    (gdb)

    :-)

    So what the heck??!! :-)

     
  • Andrew Kroll
    2008-05-05

    Logged In: YES
    user_id=2049191
    Originator: YES

    I will have to recompile unstripped to debug I think, but....
    Here is the transcription from the binary package I released (usual unimportant parts clipped out):

    gdb /usr/lib/clisp-2.44.1/base/lisp.run
    run -B /usr/lib/clisp-2.44.1 -M /usr/lib/clisp-2.44.1/base/lispinit.mem -N /usr/share/locale
    Starting program: /usr/lib/clisp-2.44.1/base/lisp.run -B /usr/lib/clisp-2.44.1 -M /usr/lib/clisp-2.44.1/base/lispinit.mem -N /usr/share/locale
    (no debugging symbols found)
    (no debugging symbols found)
    (no debugging symbols found)
    (no debugging symbols found)
    (no debugging symbols found)
    (no debugging symbols found)
    (no debugging symbols found)
    (no debugging symbols found)

    Program received signal SIGSEGV, Segmentation fault.
    0x08098b02 in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x08098b02 in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x0812453f in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x0812453f in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x0812453f in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x08124453 in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x08124453 in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x08124453 in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x08124453 in ?? ()
    (gdb) continue
    Continuing.
      i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
      I I I I I I I      8     8   8           8     8     o  8    8
      I  \ `+' /  I      8         8           8     8        8    8
       \  `-+-'  /       8         8           8      ooooo   8oooo
        `-__|__-'        8         8           8           8  8
            |            8     o   8           8     o     8  8
      ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8

    (no debugging symbols found)
    Welcome to GNU CLISP 2.44.1 (2008-02-23) <http://clisp.cons.org/>

    Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
    Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
    Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
    Copyright (c) Bruno Haible, Sam Steingold 1999-2000
    Copyright (c) Sam Steingold, Bruno Haible 2001-2008

    Type :h and hit Enter for context help.

    Program received signal SIGSEGV, Segmentation fault.
    0x0807ec33 in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x0807ec77 in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x0807ec77 in ?? ()
    (gdb) continue
    Continuing.
    [1]>
    Program received signal SIGSEGV, Segmentation fault.
    0x080b6ea9 in ?? ()
    (gdb) continue
    Continuing.

    Program received signal SIGHUP, Hangup.
    0xb7d91cce in __read_nocancel () from /lib/libc.so.6
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x08068794 in ?? ()
    (gdb) bt 20
    #0 0x08068794 in ?? ()
    #1 0xbfa1cc7b in ?? ()
    #2 0xb7e0f420 in ?? () from /lib/libc.so.6
    #3 0xbfa1cc88 in ?? ()
    #4 <signal handler called>
    #5 0xb7d91ccc in __read_nocancel () from /lib/libc.so.6
    #6 0xb7ecd624 in rl_getc () from /usr/lib/libreadline.so.5
    #7 0xb7ecda93 in rl_read_key () from /usr/lib/libreadline.so.5
    #8 0xb7ebab4e in readline_internal_char () from /usr/lib/libreadline.so.5
    #9 0xb7ebafe5 in readline () from /usr/lib/libreadline.so.5
    #10 0x0809bfbe in ?? ()
    #11 0x08225898 in ?? ()
    #12 0x203de9ac in ?? ()
    #13 0x08225898 in ?? ()
    #14 0x081d4510 in ?? ()
    #15 0xb7cd8a14 in ?? () from /lib/libc.so.6
    #16 0x08225898 in ?? ()
    #17 0x08225898 in ?? ()
    #18 0x00000000 in ?? ()

     
  • Sam Steingold
    2008-05-05

    Logged In: YES
    user_id=5735
    Originator: NO

    I don't think stripping should make a difference.
    the segfaults you see are due to generational gc and should be ignored.
    see clisp/src/.gdbinit

     
  • Andrew Kroll
    2008-05-05

    Logged In: YES
    user_id=2049191
    Originator: YES

    Here it is, without stripping. I'll paste EVERYTHING, even if I (or you) think it is unimportant; perhaps you can now locate the problem :-)
    root@shit:~/code/testclisp# gdb /usr/lib/clisp-2.44.1/base/lisp.run
    GNU gdb 6.6
    Copyright (C) 2006 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i486-slackware-linux"...
    Using host libthread_db library "/lib/libthread_db.so.1".
    (gdb) run -B /usr/lib/clisp-2.44.1 -M /usr/lib/clisp-2.44.1/base/lispinit.mem -N /usr/share/locale
    Starting program: /usr/lib/clisp-2.44.1/base/lisp.run -B /usr/lib/clisp-2.44.1 -M /usr/lib/clisp-2.44.1/base/lispinit.mem -N /usr/share/locale

    Program received signal SIGSEGV, Segmentation fault.
    0x08098b02 in closed_buffered (stream=0x203c4536) at stream.d:8024
    8024 stream.d: No such file or directory.
    in stream.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x08098b02 in closed_buffered (stream=0x203a1c9e) at stream.d:8024
    8024 in stream.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    register_foreign_variable (address=0x81ee7cc,
    name_asciz=0x81cab41 "ffi_user_pointer", flags=0, size=4) at foreign.d:203
    203 foreign.d: No such file or directory.
    in foreign.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    register_foreign_variable (address=0x81ecacc,
    name_asciz=0x812db00 "rl_line_buffer", flags=0, size=4) at foreign.d:203
    203 in foreign.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    register_foreign_variable (address=0x81ec9e8,
    name_asciz=0x812db82 "rl_already_prompted", flags=0, size=4)
    at foreign.d:203
    203 in foreign.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    register_foreign_function (address=0x804ca40,
    name_asciz=0x812dd24 "rl_get_keymap_by_name", flags=1024) at foreign.d:239
    239 in foreign.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    register_foreign_function (address=0x804d870,
    name_asciz=0x812df2c "rl_do_undo", flags=1024) at foreign.d:239
    239 in foreign.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    register_foreign_function (address=0x804d7b0,
    name_asciz=0x812e120 "rl_extend_line_buffer", flags=1024) at foreign.d:239
    239 in foreign.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    register_foreign_function (address=0x804d460,
    name_asciz=0x812e2c5 "history_search_prefix", flags=1024) at foreign.d:239
    239 in foreign.d
    (gdb) continue
    Continuing.
      i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
      I I I I I I I      8     8   8           8     8     o  8    8
      I  \ `+' /  I      8         8           8     8        8    8
       \  `-+-'  /       8         8           8      ooooo   8oooo
        `-__|__-'        8         8           8           8  8
            |            8     o   8           8     o     8  8
      ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8

    Welcome to GNU CLISP 2.44.1 (2008-02-23) <http://clisp.cons.org/>

    Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
    Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
    Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
    Copyright (c) Bruno Haible, Sam Steingold 1999-2000
    Copyright (c) Sam Steingold, Bruno Haible 2001-2008

    Type :h and hit Enter for context help.

    Program received signal SIGSEGV, Segmentation fault.
    0x0807ec33 in interpret_bytecode_ (closure=0x20276c16, codeptr=0x2027679c,
    byteptr_in=0x202767e3 "=\021cf\032¿Ä-\003\016B\002j\033j") at eval.d:6390
    6390 eval.d: No such file or directory.
    in eval.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x0807ec77 in interpret_bytecode_ (closure=0x203c4426, codeptr=0x202c4234,
    byteptr_in=0x202c425e "") at eval.d:6385
    6385 in eval.d
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x0807ec77 in interpret_bytecode_ (closure=0x203c4426, codeptr=0x202c4234,
    byteptr_in=0x202c4261 "\a") at eval.d:6385
    6385 in eval.d
    (gdb) continue
    Continuing.
    [1]>
    Program received signal SIGSEGV, Segmentation fault.
    get_buffers () at io.d:1163
    1163 io.d: No such file or directory.
    in io.d
    (gdb) continue
    Continuing.

    Program received signal SIGHUP, Hangup.
    0xb7d96cce in __read_nocancel () from /lib/libc.so.6
    (gdb) continue
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x08068794 in quit_on_signal (sig=1) at spvw_sigterm.d:55
    55 spvw_sigterm.d: No such file or directory.
    in spvw_sigterm.d
    (gdb) continue
    Continuing.

    *** - handle_fault error2 ! address = 0x0 not in [0x2025b004,0x203c566c) !
    SIGSEGV cannot be cured. Fault address = 0x0.
    Permanently allocated: 92128 bytes.
    Currently in use: 2136704 bytes.
    Free space: 423144 bytes.

    Program received signal SIGSEGV, Segmentation fault.
    0x08068794 in quit_on_signal (sig=1) at spvw_sigterm.d:55
    55 in spvw_sigterm.d
    (gdb) continue
    Continuing.

    Program terminated with signal SIGSEGV, Segmentation fault.
    The program no longer exists.
    (gdb)

     
  • Sam Steingold
    2008-05-06

    • status: open-works-for-me --> pending-works-for-me
     
  • Sam Steingold
    2008-05-06

    Logged In: YES
    user_id=5735
    Originator: NO

    OK, now please do the same thing, BUT:
    in gdb do
    handle SIGSEGV noprint nostop
    break sigsegv_handler_failed
    (see .gdbinit).
    then re-run and give me the backtrace when you stop in sigsegv_handler_failed.
    also, please try to debug it yourself - look for obviously bad things.
    thanks.
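
The two gdb settings requested above can be kept in a small command file so they do not have to be retyped for every run. A sketch: the file name is arbitrary, and the binary and image paths are the ones used in the gdb session earlier in this thread. The script only writes the file and prints the gdb invocation; it does not start gdb.

```shell
#!/bin/sh
# Write a gdb command file with the requested settings.
cmdfile=${TMPDIR:-/tmp}/clisp-sighup.gdb

cat > "$cmdfile" <<'EOF'
# The generational GC triggers SIGSEGV on purpose: pass it through silently.
handle SIGSEGV noprint nostop
# Stop only where CLISP gives up on handling a fault.
break sigsegv_handler_failed
EOF

# Print the command line to use; paths as in the earlier session.
echo gdb -x "$cmdfile" --args /usr/lib/clisp-2.44.1/base/lisp.run \
    -B /usr/lib/clisp-2.44.1 \
    -M /usr/lib/clisp-2.44.1/base/lispinit.mem \
    -N /usr/share/locale
```

Inside gdb, `run` then starts CLISP; after the SIGHUP is delivered and `continue` is typed, gdb should stop at the breakpoint, where `bt` gives the backtrace. The breakpoint can only be set by name if the binary is not stripped.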

     