#647 Hang with exemption

Milestone: lisp error
Status: open-accepted
Priority: 5
Updated: 2012-09-15
Created: 2012-07-19
Creator: Anonymous
Private: No

The following RUN function eventually hangs for both the latest build (15589) and clisp-2.49.

(defstruct sema
  (count 0)
  (lock (mt:make-mutex))
  (cvar (mt:make-exemption)))

(defun inc-sema (sema)
  (mt:with-mutex-lock ((sema-lock sema))
    (incf (sema-count sema))
    (mt:exemption-signal (sema-cvar sema))))

(defun dec-sema (sema)
  (mt:with-mutex-lock ((sema-lock sema))
    (loop (cond ((plusp (sema-count sema))
                 (decf (sema-count sema))
                 (return))
                (t
                 (mt:exemption-wait
                  (sema-cvar sema) (sema-lock sema)))))))

(defun test (thread-count)
  (let ((from-threads (make-sema)))
    (loop repeat thread-count do
      (mt:make-thread
       (lambda () (inc-sema from-threads))
       :name "test"))
    (loop repeat thread-count do (dec-sema from-threads))))

(defun run ()
  (loop
    (test 16)
    (format t ".")
    (finish-output)))

Sometimes the outcome with the latest build is:

Internal error: statement in file "zthread.d", line 771 has been reached!!

The number of iterations until hanging tends to decrease as thread-count increases.

The following replacement for TEST should also be checked (it has exposed bugs in other CL implementations where the previous TEST did not).

(defun test (thread-count)
  (let ((from-threads (make-sema))
        (to-threads (make-sema)))
    (loop repeat thread-count do
      (mt:make-thread
       (lambda ()
         (dec-sema to-threads)
         (inc-sema from-threads))
       :name "test"))
    (loop repeat thread-count do (inc-sema to-threads))
    (loop repeat thread-count do (dec-sema from-threads))))

Linux xi 3.2.0-24-generic-pae #39-Ubuntu SMP Mon May 21 18:54:21 UTC 2012 i686 i686 i386 GNU/Linux

gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

GNU CLISP 2.49+ (2010-07-17) (built 3551638844) (memory 3551639097)
Software: GNU C 4.6.3
gcc -g -O2 -W -Wswitch -Wcomment -Wpointer-arith -Wreturn-type -Wmissing-declarations -Wimplicit -Wno-sign-compare -Wno-format-nonliteral -O2 -fexpensive-optimizations -falign-functions=4 -pthread -DENABLE_UNICODE -DMULTITHREAD -DPOSIX_THREADS -DDYNAMIC_MODULES libgnu.a -lreadline -lncurses -ldl -lsigsegv
SAFETY=0 HEAPCODES LINUX_NOEXEC_HEAPCODES GENERATIONAL_GC SPVW_BLOCKS SPVW_MIXED TRIVIALMAP_MEMORY
libsigsegv 2.9
libreadline 6.2
Features:
(REGEXP WILDCARD SYSCALLS I18N LOOP COMPILER CLOS MOP CLISP ANSI-CL COMMON-LISP LISP=CL
INTERPRETER LOGICAL-PATHNAMES MT SOCKETS GENERIC-STREAMS SCREEN GETTEXT UNICODE
BASE-CHAR=CHARACTER PC386 UNIX)
C Modules: (clisp i18n syscalls regexp)
Installation directory: /home/jlawrence/usr/stow/clisp-dev/lib/clisp-2.49+/
User language: ENGLISH
Machine: I686 (I686) xi [127.0.1.1]

Discussion

  • Sam Steingold
    2012-07-19

    • labels: --> multithreading
    • milestone: --> lisp error
    • assigned_to: nobody --> vtz
     
  • I can't reproduce it (tested on both Linux and OS X).
    I used up to 128 threads and waited about 5 minutes on each run. I also tested the second 'test' function without any problems.

    Reaching zthread.d:771 means the internal mutex record is inconsistent.
    I will continue to test. Any additional info will be helpful, though I'm not sure what to ask for; debugging is an option, but not an easy one.

     

  • Anonymous
    2012-07-20

    I have just discovered that the problem goes away when clisp is
    configured --with-debug.

    My only configure options are --prefix and --with-threads=POSIX_THREADS.

    For me it fails within seconds for either TEST. This is a Core i7
    3.4GHz running 32-bit Linux.

    I have been compiling all functions. It takes longer for the issue to
    appear with interpreted functions, though it still fails within a few
    minutes at most.

    I recently got a new error while running interpreted functions:

    *** thread is going into lisp land without calling end_blocking_call()

    On SBCL I once had a condition variable problem which was either
    wholly present or wholly absent depending upon some dice roll at
    launch time. (The SBCL bundled with Ubuntu sometimes (but not always)
    decided to produce spurious wakeups, which I had not handled properly.
    This caused some confusion because a vanilla SBCL compiled locally did
    not generate these wakeups.) Perhaps not seeing the issue after a
    couple minutes means a restart is needed.
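
The spurious-wakeup point above is why DEC-SEMA re-checks the count in a loop around the wait instead of waiting once. As a rough cross-check outside CLISP, the same semaphore and the first TEST could be sketched directly against POSIX threads in C. This is only an illustrative sketch: the names sema_t, sema_inc, sema_dec, run_round, and MAX_THREADS are made up here and are not taken from CLISP or zthread.d, and unlike the Lisp version it joins its threads so the stack-allocated semaphore can be destroyed safely.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 128  /* hypothetical upper bound for this sketch */

/* Counting semaphore built from a mutex and a condition variable,
   mirroring the SEMA struct in the report. */
typedef struct {
    int count;
    pthread_mutex_t lock;
    pthread_cond_t cvar;
} sema_t;

static void sema_init(sema_t *s) {
    int rc;
    s->count = 0;
    rc = pthread_mutex_init(&s->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&s->cvar, NULL);
    assert(rc == 0);
    (void)rc;  /* silence unused warning when NDEBUG is defined */
}

static void sema_inc(sema_t *s) {
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->cvar);
    pthread_mutex_unlock(&s->lock);
}

static void sema_dec(sema_t *s) {
    pthread_mutex_lock(&s->lock);
    /* Re-check the predicate after every wakeup: pthread_cond_wait
       is allowed to return spuriously. */
    while (s->count <= 0)
        pthread_cond_wait(&s->cvar, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

static void *worker(void *arg) {
    sema_inc((sema_t *)arg);
    return NULL;
}

/* One round of the first TEST: start thread_count workers that each
   increment the semaphore, then decrement it thread_count times.
   Returns the final count (0 expected), or -1 on bad input or if a
   thread could not be created. */
int run_round(int thread_count) {
    sema_t s;
    pthread_t threads[MAX_THREADS];
    int started = 0, i, final;

    if (thread_count < 0 || thread_count > MAX_THREADS)
        return -1;
    sema_init(&s);
    for (; started < thread_count; started++)
        if (pthread_create(&threads[started], NULL, worker, &s) != 0)
            break;
    /* Only wait for as many increments as there are running workers. */
    for (i = 0; i < started; i++)
        sema_dec(&s);
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    final = (started == thread_count) ? s.count : -1;
    pthread_cond_destroy(&s.cvar);
    pthread_mutex_destroy(&s.lock);
    return final;
}
```

Calling run_round(16) in an endless loop (built with gcc -pthread) would correspond to RUN. If that never hangs on the affected machine, it would hint that the problem lies in the CLISP runtime rather than in the platform's pthreads, though it would not prove it.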

     
  • Over the past months I've tried to reproduce this on Linux and OS X (both i386 and x86_64) without success. I also tried different gcc versions (up to 4.7.1).

     
    • status: open --> open-works-for-me
     
  • This bug report is now marked as "pending"/"works for me".
    This means that we think that we cannot reproduce the problem
    and cannot do anything about it.
    Unless you - the reporter - act within 2 weeks
    (e.g., by submitting a self-contained test case
    or answering our other recent requests),
    the bug will be permanently closed.
    Sorry about the inconvenience -
    we hope your silence means that
    you are no longer observing the problem either.

     

  • Anonymous
    2012-09-15

    I can still reproduce with latest in hg (15594) with specs given above; symptoms haven't changed.

     
    • status: open-works-for-me --> open-accepted