#252 Stack Overflow Twice Causes Segfault

Milestone: segfault
Status: closed-fixed
Labels: clisp (524)
Priority: 5
Updated: 2006-04-28
Created: 2005-04-10
Creator: David Reiss
Private: No

Causing a stack overflow twice in a row leads to a
segfault in clisp 2.33.2 on x86 Linux, built from the
Gentoo ebuild. gcc is version 3.3.5, and glibc is version
2.3.4. My "CFLAGS" is '-mcpu=i686 -O2 -pipe'. Gzipped
core file is attached. Note that this error does *not*
happen with a clean, debug-enabled build. Should I
send this bug report to the Gentoo ebuild maintainer
instead?

This is how I produced the problem:

$ clisp -q -q

[1]> (defun f (n) (if (zerop n) 0 (f (1- n))))
F
[2]> (f 10000)

*** - Program stack overflow. RESET
[3]> (f 10000)
Segmentation fault (core dumped)

This is my system configuration:

$ uname -a
Linux ballpoint 2.6.10-gentoo-r6_dr #1 Sun Mar 6
13:56:17 PST 2005 i686 AMD Athlon(tm) 64 Processor
3200+ AuthenticAMD GNU/Linux
$
$ gcc -v
Reading specs from
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/specs
Configured with:
/var/tmp/portage/gcc-3.3.5-r1/work/gcc-3.3.5/configure
--enable-version-specific-runtime-libs --prefix=/usr
--bindir=/usr/i686-pc-linux-gnu/gcc-bin/3.3.5
--includedir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/include
--datadir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.5
--mandir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.5/man
--infodir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.5/info
--with-gxx-include-dir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/include/g++-v3
--host=i686-pc-linux-gnu --disable-altivec
--disable-nls --enable-__cxa_atexit
--enable-clocale=gnu --with-system-zlib
--disable-checking --disable-werror
--disable-libunwind-exceptions --enable-shared
--enable-threads=posix --disable-multilib
--disable-libgcj --enable-languages=c,c++
Thread model: posix
gcc version 3.3.5 (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3,
pie-8.7.7.1)
$
$ clisp --version
GNU CLISP 2.33.2 (2004-06-02) (built 3322071952)
(memory 3322072066)
Software: GNU C 3.3.5 (Gentoo Linux 3.3.5-r1,
ssp-3.3.2-3, pie-8.7.7.1) ANSI C program
Features: (PCRE CLX-ANSI-COMMON-LISP CLX SYSCALLS
REGEXP CLOS LOOP COMPILER CLISP ANSI-CL COMMON-LISP
LISP=CL INTERPRETER SOCKETS GENERIC-STREAMS
LOGICAL-PATHNAMES SCREEN FFI GETTEXT UNICODE
BASE-CHAR=CHARACTER PC386 UNIX)
Installation directory: /usr/lib/clisp/
User language: ENGLISH
Machine: I686 (I686) ballpoint.Stanford.EDU [128.12.51.95]

Discussion

  • David Reiss

    David Reiss - 2005-04-10

    Logged In: YES
    user_id=887335

    I got an error from the file attachment. I'll try to
    attach it again.

     
  • Sam Steingold

    Sam Steingold - 2005-04-11
    • assigned_to: sds --> haible
     
  • Sam Steingold

    Sam Steingold - 2005-04-11

    Logged In: YES
    user_id=5735

    There have been patches on clisp-devel recently that were
    supposed to fix this or something similar.
    The patches have to be applied to both clisp and libsigsegv.
    Presumably, Bruno will review them and apply them to clisp and
    libsigsegv...

     
  • Jörg Höhle

    Jörg Höhle - 2005-05-23

    Logged In: YES
    user_id=377168

    [ME too], with an even simpler case:
    (f -1) -> stack overflow, RESET
    (cl::barf) (or other errors) -> core dump
    Debian (April 2005 Hoary/Ubuntu) on Linux-386, both
    clisp-2.33.2 from Debian as well as clisp-cvs (a few days
    old), using libsigsegv-dev 2.1-1 packaged for Debian by Will
    Newton.

    I'll have to locate those sigsegv patches and see how I can
    put them into my current Debian system (replacing the Debian
    pre-built package).

     
  • Jörg Höhle

    Jörg Höhle - 2005-05-24

    Logged In: YES
    user_id=377168

    I asked Bruno Haible and he remembers/knows of no patches.
    Furthermore, libsigsegv-cvs is unchanged since 2.1 (what I
    have installed) w.r.t. i386 (mach and MacOSX changed), thus
    Bruno suspects a bug in CLISP: maybe STACK is in a register
    and not restored properly (I'll have to check whether my
    build and also the Ubuntu/Debian clisp-2.33.2 build uses a
    register variable for STACK).

    Summary: the crash bug is still in cvs-clisp-2005-05-18, as
    well as in Ubuntu's clisp-2.33.2 Debian package.

     
  • Sam Steingold

    Sam Steingold - 2005-05-24

    Logged In: YES
    user_id=5735

    patches are in this thread:
    <http://thread.gmane.org/gmane.lisp.clisp.general/9405>

     
  • Jörg Höhle

    Jörg Höhle - 2005-06-14

    Logged In: YES
    user_id=377168

    today's experimental results:
    SAFETY=3 fixed the crash, while it's still in with SAFETY=2
    Now, where to look next??
    Note that with SAFETY=2, STACK_register is not used, so that
    should not be the culprit this time.
    Well, actually, STACK_register was not used in my default
    build anyway, since I'm using gcc-3.3 by default and
    lispbibl.d disables it for GNUC_MINOR<4.

     
  • Jörg Höhle

    Jörg Höhle - 2005-06-15

    Logged In: YES
    user_id=377168

    As I noticed that SAFETY=3 disables generational GC, I tried
    again with normal SAFETY settings but -DNO_GENERATIONAL_GC.
    The bug disappears.
    BTW, I'm still using the old sigsegv (i.e. without some
    patches that should not affect i386 anyway).

    I tried normal settings and -DDEBUG_SPVW. It crashed as
    usual. I was surprised that the only debug output was, right
    after program start:
    STACK depth: 114415
    SP depth: 67108956
    I had expected some more messages from using that option.

    Here's another way to crash:
    [1]> (defun fact(n)(if (zerop n) 1 (* n (fact (1- n)))))
    FACT
    [2]> (fact -1)
    *** - Program stack overflow. RESET
    [3]> (room) ; or call (ext:gc)
    Segmentation fault
    which shows that the memory is corrupt -- somewhere
    Here again, I'm surprised there's no output from DEBUG_SPVW.

     
  • Jörg Höhle

    Jörg Höhle - 2005-07-04

    Logged In: YES
    user_id=377168

    I now tried the following:
    win32-native build using MS-VC 6.0 (instead of Linux/gcc)
    clisp/cvs (from Friday, 1st of July)
    libsigsegv/cvs (I'll have to use the cvs one on Linux also)

    A. Build without libsigsegv:
    (fact -1) -> *** - Lisp stack overflow. RESET, but no crash
    (compile *) (f -1) -> crash & window requester "unknown
    software exception" (0xc00000fd)

    B. Build with libsigsegv (and working generational GC)
    Lisp stack overflow is detected in both interpreted and
    compiled mode, everything is fine.
    Note that this version says
    *** - Program stack overflow. RESET
    in both cases, not "Lisp stack".

    I.e., the bug is not present in the MS-VC/win32 version of
    CLISP.

    I believe the crash in the compiled function + nolibsigsegv
    case is not new and due to weaker stack bounds checking with
    compiled code (possibly still somewhat surprising w.r.t.
    what I was used to from the Amiga, where CLISP sort of never
    crashed in years).

     
  • Jörg Höhle

    Jörg Höhle - 2006-02-07

    Logged In: YES
    user_id=377168

    The bug is still present at least on Linux/i686:
    [1]> (defun fact(n)(if (zerop n) 1 (* n (fact (1- n)))))
    FACT
    [2]> (fact -1)
    Segmentation fault

    Even in the interpreter! (identical crash when compiling first).

    ./lisp.run -B. --version
    GNU CLISP 2.37 (2006-01-02) (built 2006-02-04 18:02:32)
    Software: GNU C 4.0.2 20050808 (prerelease) (Ubuntu
    4.0.1-4ubuntu9) gcc -W -Wswitch -Wcomment -Wpointer-arith
    -Wimplicit -Wreturn-type -Wmissing-declarations
    -Wno-sign-compare -O2 -fexpensive-optimizations
    -DDYNAMIC_FFI -DDYNAMIC_MODULES -I. -x none libcharset.a
    libavcall.a libcallback.a -lreadline -lncurses -ldl
    -lsigsegv -L/usr/X11R6/lib -lX11
    SAFETY=0 HEAPCODES LINUX_NOEXEC_HEAPCODES GENERATIONAL_GC
    SPVW_BLOCKS SPVW_MIXED TRIVIALMAP_MEMORY
    libsigsegv 2.2
    Features: (CLISP ANSI-CL COMMON-LISP LISP=CL INTERPRETER
    SOCKETS GENERIC-STREAMS LOGICAL-PATHNAMES SCREEN FFI GETTEXT
    BASE-CHAR=CHARACTER PC386 UNIX)
    C Modules: (clisp)
    Installation directory: ./
    User language: ENGLISH
    Machine: I686 (I686) localhost.localdomain [127.0.0.1]

    libsigsegv is that from Peter van Eynde's people.debian.org
    2.2-2breezy

     
  • Jörg Höhle

    Jörg Höhle - 2006-03-06

    Logged In: YES
    user_id=377168

    Here's a work-around that does not disable generational GC:
    Compile with -DCONS_HEAP_GROWS_UP
    Note it can only make a difference on machines using
    SPVW_MIXED_BLOCKS where TRIVIALMAP_MEMORY is defined (e.g.
    Linux or MS-VC) -- see spvw.d

     
  • Jörg Höhle

    Jörg Höhle - 2006-03-24

    Logged In: YES
    user_id=377168

    The suggested work-arounds -DCONS_HEAP_GROWS_UP or
    -DSAFETY=3 or -DNO_GENERATIONAL_GC have nothing to do with
    the error. A simple clisp -m14MB also helps: it makes the Lisp
    stack much larger in some configurations, so CLISP runs into a
    (non-deadly) Lisp stack overflow instead of an (up to now
    deadly) C program stack overflow.

    I've identified the cause of the exit on the second stack
    overflow:
    * spvw_sigsegv.d (stackoverflow_handler) [UNIX]: libsigsegv doc
    says to restore normal signal mask prior to leaving handler.
    Expect a fix in CVS ASAP.
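
The signal-mask problem described in this comment can be sketched in plain POSIX C. This is an illustration only, not the CLISP or libsigsegv code: it uses SIGUSR1 and raise() in place of a real stack overflow, and the names on_signal, recovery_point and blocked_after_escape are hypothetical. When a handler is left through a non-local jump instead of a normal return, the signal being handled stays blocked unless the mask is restored explicitly. On Linux, a second fault on a blocked SIGSEGV typically terminates the process, which would match the crash on the second overflow.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical recovery point, standing in for a return to the top-level loop. */
static sigjmp_buf recovery_point;

/* Hypothetical handler: leaves via a non-local jump, never returns normally. */
static void on_signal(int signo)
{
    (void)signo;
    siglongjmp(recovery_point, 1);
}

/*
 * Deliver SIGUSR1, escape from its handler with siglongjmp, and report
 * whether SIGUSR1 is still blocked afterwards (1 = blocked, 0 = not blocked).
 * If restore_mask is nonzero, the pre-signal mask is reinstated first.
 */
int blocked_after_escape(int restore_mask)
{
    struct sigaction sa;
    sigset_t only_usr1, normal_mask, current_mask;
    int rc;

    /* Start from a state where SIGUSR1 is deliverable, and remember that mask. */
    sigemptyset(&only_usr1);
    sigaddset(&only_usr1, SIGUSR1);
    rc = sigprocmask(SIG_UNBLOCK, &only_usr1, NULL);
    assert(rc == 0);
    rc = sigprocmask(SIG_BLOCK, NULL, &normal_mask);
    assert(rc == 0);

    /* No SA_NODEFER: the signal is blocked while its handler runs. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    /* savemask == 0: siglongjmp will NOT restore the signal mask for us. */
    if (sigsetjmp(recovery_point, 0) == 0) {
        raise(SIGUSR1);
        return -1; /* only reached if the handler did not run */
    }

    /* Back from the handler; the mask is still the one the handler ran with. */
    if (restore_mask)
        sigprocmask(SIG_SETMASK, &normal_mask, NULL);

    sigprocmask(SIG_BLOCK, NULL, &current_mask);
    return sigismember(&current_mask, SIGUSR1);
}
```

Calling blocked_after_escape(0) is expected to report the signal as still blocked, while blocked_after_escape(1) should report it as deliverable again. Passing a nonzero savemask to sigsetjmp would be another way to have the mask restored on the jump.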

     
  • Jörg Höhle

    Jörg Höhle - 2006-04-28

    Logged In: YES
    user_id=377168

    thank you for your bug report.
    the bug has been fixed in the CVS tree.
    you can either wait for the next release (recommended)
    or check out the current CVS tree (see http://clisp.cons.org)
    and build CLISP from the sources (be advised that between
    releases the CVS tree is very unstable and may not even build
    on your platform).

     
  • Jörg Höhle

    Jörg Höhle - 2006-04-28
    • assigned_to: haible --> hoehle
    • status: open --> closed-fixed
     
