From: Max M. <ma...@op...> - 2013-05-28 22:40:16
I have a test case in which killing lots of threads that use %without-interrupts/with-interrupts to manage their cleanup somehow gets SBCL into a state where:

1. (decode-universal-time (get-universal-time))
   ;; hangs the calling thread; interrupting it
   ;; shows a single "bogus frame"

2. (decode-universal-time (get-universal-time) 0)
   ;; works fine

This is with a slightly oldish (around 3 weeks old) SBCL from Git. Is this something interesting or not?

Regards,
Max
From: Max M. <ma...@op...> - 2013-05-28 22:55:14
At Tue, 28 May 2013 18:23:54 -0400, Max Mikhanosha wrote:
>
> I have a test case, where by killing lots of threads, that use
> %without-interrupts/with-interrupts to try to manage their cleanup, I
> can somehow get SBCL into the state where
>
> 1. (decode-universal-time (get-universal-time))
>    ;; hangs the calling thread, interrupting
>    ;; shows single "bogus frame"
>
> 2. (decode-universal-time (get-universal-time) 0)
>    ;; works fine
>

I have traced it to localtime_r() hanging. My understanding is as follows:

localtime_r() locks some mutex around its check of whether tzset() has been called since it last cached the timezone info.

If a Lisp thread gets killed while it is inside localtime_r() in foreign code, that mutex (or whatever it is using) stays locked, and any further localtime_r() call from the same process hangs.

This has devastating consequences for SBCL: it effectively shuts down the compiler, because the compiler uses decode-universal-time.

The test case to reproduce it would be to repeatedly start and stop many threads that call decode-universal-time without the time-zone argument.

The fix would be to block signals around the calls to localtime_r and gmtime_r.

Regards,
Max
From: Max M. <ma...@op...> - 2013-05-28 23:17:23
At Tue, 28 May 2013 18:55:07 -0400, Max Mikhanosha wrote:
>
> The fix would be to block signals around calls to localtime_r and gmtime_r
>

If I patch SBCL with the patch below, I can no longer reproduce the problem, but it is probably a horribly wrong fix; IMHO it is better to do this on the C side.

(defun decode-universal-time (universal-time &optional time-zone)
  #!+sb-doc
  "Converts a universal-time to decoded time format returning the following
   nine values: second, minute, hour, date, month, year, day of week (0 =
   Monday), T (daylight savings time) or NIL (standard time), and timezone.
   Completely ignores daylight-savings-time when time-zone is supplied."
  (multiple-value-bind (daylight seconds-west)
      (if time-zone
          (values nil (* time-zone 60 60))
          (multiple-value-bind (ignore seconds-west daylight)
              (let ((tmp (truncate-to-unix-range universal-time)))
                (sb!thread::block-deferrable-signals)
                (multiple-value-prog1
                    (sb!unix::get-timezone tmp)
                  (sb!unix::unblock-deferrable-signals)))
            (declare (ignore ignore))
            (declare (fixnum seconds-west))
            (values daylight seconds-west)))
    (declare (fixnum seconds-west))
    (multiple-value-bind (weeks secs)
        (truncate (+ (- universal-time seconds-west) seconds-offset)
                  seconds-in-week)
      (let ((weeks (+ weeks weeks-offset)))
        (multiple-value-bind (t1 second)
            (truncate secs 60)
          (let ((tday (truncate t1 minutes-per-day)))
            (multiple-value-bind (hour minute)
                (truncate (- t1 (* tday minutes-per-day)) 60)
              (let* ((t2 (1- (* (+ (* weeks 7) tday november-17-1858) 4)))
                     (tcent (truncate t2 quarter-days-per-century)))
                (setq t2 (mod t2 quarter-days-per-century))
                (setq t2 (+ (- t2 (mod t2 4)) 3))
                (let* ((year (+ (* tcent 100)
                                (truncate t2 quarter-days-per-year)))
                       (days-since-mar0
                        (1+ (truncate (mod t2 quarter-days-per-year) 4)))
                       (day (mod (+ tday weekday-november-17-1858) 7))
                       (t3 (+ (* days-since-mar0 5) 456)))
                  (cond ((>= t3 1989)
                         (setq t3 (- t3 1836))
                         (setq year (1+ year))))
                  (multiple-value-bind (month t3)
                      (truncate t3 153)
                    (let ((date (1+ (truncate t3 5))))
                      (values second minute hour date month year day
                              daylight
                              (if daylight
                                  (1+ (/ seconds-west 60 60))
                                  (/ seconds-west 60 60))))))))))))))
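For comparison, the C-side variant suggested above might look roughly like the sketch below. It is only an illustration of the idea, not SBCL runtime code: `localtime_r_blocked` is a hypothetical helper name, and it blocks every blockable signal in the calling thread, whereas the Lisp patch blocks only SBCL's deferrable set. It also assumes that thread termination reaches the thread through a signal, so that blocking signals keeps the thread from being unwound while libc holds its timezone lock.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not part of SBCL): call localtime_r() with signals
 * blocked in the calling thread, then restore the previous signal mask.
 * Assumption: blocking ALL signals is acceptable here; SBCL itself would
 * more likely block only its "deferrable" set. */
struct tm *localtime_r_blocked(const time_t *when, struct tm *out)
{
    sigset_t all, saved;
    struct tm *result;

    assert(when != NULL && out != NULL);

    /* Build a full set; the kernel silently ignores SIGKILL and SIGSTOP. */
    sigfillset(&all);

    /* pthread_sigmask() changes the mask of the calling thread only. */
    if (pthread_sigmask(SIG_BLOCK, &all, &saved) != 0)
        return NULL;

    /* While signals are blocked, a signal-driven interrupt cannot unwind
     * this thread in the middle of libc's internal timezone locking. */
    result = localtime_r(when, out);

    /* Restore exactly the mask that was in effect before the call. */
    pthread_sigmask(SIG_SETMASK, &saved, NULL);
    return result;
}
```

A caller would use it as a drop-in replacement for localtime_r(), for example `localtime_r_blocked(&now, &tm_buf)`; a gmtime_r() wrapper would follow the same pattern. Signals that arrive while the mask is in place stay pending and are delivered once the saved mask is restored.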