From: Helmut E. <e96...@st...> - 2004-02-04 22:34:18
|
[ Apologies if you receive multiple copies of this message. ]

Hi,

I'd like to implement a simple mechanism to pass messages between threads, but I don't quite understand how condition variables are supposed to be used.  My code looks much like the example below.  There is a global mutex and a global waitqueue.  I create one thread that reads all messages out of the message queue and prints them to stdout, and the main thread sends messages to the other thread in an endless loop.

This seems to work for a while (the counter sometimes goes up to one million), but then SBCL just hangs and even C-c-ing has no effect.  Is there a bug in my code?  In SBCL?  Should I write this differently?

It's also a bit strange that "New thread: xxx" is printed two times.  Are streams not thread-safe by default?

All this happens with SBCL 0.8.7.36, Linux 2.4.12, glibc 2.2.2.

(defpackage :sb-ipc-test (:use :cl))
(in-package :sb-ipc-test)

(defvar *mutex* (sb-thread:make-mutex))
(defvar *waitqueue* (sb-thread:make-waitqueue))
(defvar *message-queue* (list))

(defun receive ()
  (sb-thread:with-mutex (*mutex*)
    (loop (cond (*message-queue*
                 (return (pop *message-queue*)))
                (t
                 (sb-thread:condition-wait *waitqueue* *mutex*))))))

(defun send (message)
  (sb-thread:with-mutex (*mutex*)
    (setf *message-queue* (nconc *message-queue* (list message)))
    (sb-thread:condition-notify *waitqueue*)))

(defun receive-loop ()
  (let ((i 0))
    (loop
      (force-output)
      (let ((message (receive)))
        (format t "received: ~S " message)
        (ecase message
          (:print (format t "~D~%" (incf i)))
          (:stop (terpri) (return)))))))

(defun test ()
  (let ((thread (sb-thread:make-thread #'receive-loop)))
    (format t "New thread: ~S~%" thread)
    (loop (send :print))))

(test)
|
From: Zach B. <xa...@xa...> - 2004-02-04 22:39:41
|
On Wed, Feb 04, 2004 at 11:33:55PM +0100, Helmut Eller wrote:

[snip]

> This seems to work for a while (the counter goes
> sometimes up to one million), but then SBCL just hangs and even
> C-c-ing has no effect. Is there a bug in my code? in SBCL? Should I
> write this differently?

You may need to be careful with condition-wait; it may return even if
nothing has signalled condition-notify.  (I'm not sure if that's the
actual problem, but it caused problems in my own threaded code.)

It is probably fairly poor Lisp code, but scheduler.lisp in my TIMER
package (http://www.cliki.net/TIMER) seems to work properly with SBCL's
threads.

Hope this helps,
Zach
|
From: Zach B. <xa...@xa...> - 2004-02-04 22:43:09
|
On Wed, Feb 04, 2004 at 05:39:38PM -0500, Zach Beane wrote:

[snip]

> You may need to be careful with condition-wait; it may return even if
> nothing has signalled condition-notify. (I'm not sure if that's the
> actual problem, but it caused problems in my own threaded code.)

Oops.  Sorry for reading your code so poorly; I see that isn't an issue
in what you wrote.

Zach
|
From: Daniel B. <da...@te...> - 2004-02-05 14:28:09
|
Helmut Eller <e96...@st...> writes:

> endless loop. This seems to work for a while (the counter goes
> sometimes up to one million), but then SBCL just hangs and even
> C-c-ing has no effect. Is there a bug in my code? in SBCL? Should I
> write this differently?

Your code is fine, as far as I can see.  I think this is a bug in SBCL when used on 2.4 kernels, where it uses signals to implement thread queues.  I strongly recommend an upgrade to kernel 2.6 (or to a distribution kernel that supports the "futex" feature of NPTL) if at all possible.

I ran your test code on my machine and it got to about 2.5 million before the out-of-memory weevil killed it.

I think that the queuing stuff will either get rewritten to use busy-waiting and (sleep) for futex-incapable kernels, or we'll just remove non-futex compatibility completely for threaded SBCL.  In either case, in the meantime consider it deprecated.

> It's also bit strange that "New thread: xxx" is printed two times.
> Are streams not thread safe by default? All this happens with SBCL
> 0.8.7.36, Linux 2.4.12, glibc 2.2.2.

fd-streams are definite "here be dragons" territory, I'm afraid.

-dan

--
"please make sure that the person is your friend before you confirm"
|
From: David L. <da...@li...> - 2004-02-05 14:54:48
|
Quoting Daniel Barlow (da...@te...):

> > It's also bit strange that "New thread: xxx" is printed two times.
> > Are streams not thread safe by default? All this happens with SBCL
> > 0.8.7.36, Linux 2.4.12, glibc 2.2.2.
>
> fd-streams are definite "here be dragons" territory, I'm afraid.

Even simple-threads should not force thread-safety on their users.  While locking around stream operations might be convenient, it is probably too slow to be useful.

Note that java.io was synchronized, but java.nio is not.

The Unix solution is to implement all stream functions twice and require people to write getc_unlocked instead of getc if they care about speed.  See unlocked_stdio(3).  Does anyone want that?

d.
|
From: Daniel B. <da...@te...> - 2004-02-05 16:07:12
|
David Lichteblau <da...@li...> writes:

> Even simple-threads should not force thread-safety on their users.
> While locking around stream operations might be convenient, it is
> probably too slow to be useful.

That I agree with.  However, thread 1 printing to stream A should not interfere with thread 2 printing to stream B, and right now I would not warrant that fd-streams always behave correctly in this situation.

(That said, I've blatted it with apachebench more than a few times and am happily serving web pages from it.  The bugs, if they exist, are definitely in the "crop up rarely" category, at least for my usage patterns, not the "reliable crash" category.)

> The Unix solution is to implement all stream functions twice and require
> people to write getc_unlocked instead of getc if they care about speed.
> See unlocked_stdio(3). Does anyone want that?

If we're voting, I don't.  Does my vote count for anything?

-dan

--
"please make sure that the person is your friend before you confirm"
|
From: Helmut E. <e96...@st...> - 2004-02-05 19:14:52
|
David Lichteblau <da...@li...> writes:

> Hmm, isn't that the difference between `threads' and `processes'?

Yes, probably.  I guess I'd rather have processes than threads.  But it was only a curiosity question.  Thanks for the answers.

Helmut.
|
From: Rahul J. <rj...@ny...> - 2004-02-15 08:16:18
|
Helmut Eller <e96...@st...> writes:

> David Lichteblau <da...@li...> writes:
>
>> Hmm, isn't that the difference between `threads' and `processes'?
>
> Yes, probably. I guess I'd rather have processes than threads. But
> it was only a curiosity question.

Simply forking should let you do what you want to do here.

--
Rahul Jain
rj...@ny...
Professional Software Developer, Amateur Quantum Mechanicist
|
From: Christophe R. <cs...@ca...> - 2004-02-05 14:47:41
|
Helmut Eller <e96...@st...> writes:

> This seems to work for a while (the counter goes
> sometimes up to one million), but then SBCL just hangs and even
> C-c-ing has no effect. Is there a bug in my code? in SBCL? Should I
> write this differently?

The first thing I'd check is whether the out-of-memory killer has killed one or more of your sbcl processes when this stops working.

> It's also bit strange that "New thread: xxx" is printed two times.
> Are streams not thread safe by default? All this happens with SBCL
> 0.8.7.36, Linux 2.4.12, glibc 2.2.2.

Dan has said something about 2.6 and futexes.  For me, I only get "New thread: xxx" printed once (on a 2.6/sb-futex build); running your code once "overnight" got me 1.2 million iterations and an out-of-memory kill; the run I've got now is at 18.5 million iterations and is still going.

Where is this out-of-memory coming from?  Well, I'm not sure, but is it not possible that *message-queue* could grow without bounds?  If the sending loop is much faster than the receiving loop (and it probably is, at least for small to relatively-long lists), you'll grow *message-queue* faster than the receiver removes from it.  This is only a guess -- it's slightly hard to measure, because of course under this hypothesis (length *message-queue*) takes a while to compute...

Cheers,

Christophe
--
http://www-jcsu.jesus.cam.ac.uk/~csr21/
+44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")
(pprint #36rJesusCollegeCambridge)
|
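[Editor's note: one way to test Christophe's unbounded-growth hypothesis is to bound the queue and make the sender block when it is full. The sketch below builds on the globals from Helmut's original post; the *MAX-QUEUE-LENGTH* limit, the length counter, and the second waitqueue are illustrative additions, not code from the thread, and it has not been run against SBCL 0.8.7.]

```lisp
;; Sketch: bounded variants of Helmut's SEND/RECEIVE, adding
;; back-pressure so *message-queue* cannot grow without limit.
;; *queue-length* and *room-waitqueue* are hypothetical additions.
(defvar *max-queue-length* 1000)
(defvar *queue-length* 0)
(defvar *room-waitqueue* (sb-thread:make-waitqueue))

(defun bounded-send (message)
  (sb-thread:with-mutex (*mutex*)
    ;; Wait until the receiver has made room.
    (loop while (>= *queue-length* *max-queue-length*)
          do (sb-thread:condition-wait *room-waitqueue* *mutex*))
    (setf *message-queue* (nconc *message-queue* (list message)))
    (incf *queue-length*)
    (sb-thread:condition-notify *waitqueue*)))

(defun bounded-receive ()
  (sb-thread:with-mutex (*mutex*)
    ;; Re-check the predicate after every wakeup, as in the original.
    (loop until *message-queue*
          do (sb-thread:condition-wait *waitqueue* *mutex*))
    (decf *queue-length*)
    ;; Tell a blocked sender there is room again.
    (sb-thread:condition-notify *room-waitqueue*)
    (pop *message-queue*)))
```

With a bound in place, a fast sender is throttled to the receiver's pace instead of consuming all memory, which would also distinguish "queue grows forever" from a genuine hang in the thread primitives.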
From: Nikodemus S. <tsi...@cc...> - 2004-02-05 15:07:42
|
On Thu, 5 Feb 2004, Christophe Rhodes wrote:

> *message-queue* faster than the receiver removes from it. This is
> only a guess -- it's slightly hard to measure, because of course under
> this hypothesis (length *message-queue*) takes a while to compute...

This is probably stating the obvious, but why not just add a counter to the queue?

(defstruct queue size first last) ; etc

Cheers,

-- Nikodemus
|
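[Editor's note: filled out, Nikodemus's one-line suggestion might look like the sketch below. The accessor and function names are my own, not from any SBCL release; it is a plain FIFO with an O(1) length count and would still need the mutex/waitqueue wrapping from the original code to be thread-safe.]

```lisp
;; A queue that tracks its own length, so the sender can check for
;; unbounded growth in O(1) instead of calling LENGTH on a long list.
(defstruct (queue (:constructor make-queue ()))
  (size 0)    ; number of enqueued items
  (first nil) ; head cons of the backing list
  (last nil)) ; tail cons, for O(1) enqueue

(defun enqueue (item queue)
  (let ((cell (list item)))
    (if (queue-last queue)
        (setf (cdr (queue-last queue)) cell)
        (setf (queue-first queue) cell))
    (setf (queue-last queue) cell)
    (incf (queue-size queue)))
  queue)

(defun dequeue (queue)
  (let ((cell (queue-first queue)))
    (when cell
      (setf (queue-first queue) (cdr cell))
      (unless (queue-first queue)
        (setf (queue-last queue) nil))
      (decf (queue-size queue))
      (car cell))))
```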
From: Helmut E. <e96...@st...> - 2004-02-05 16:45:34
|
Christophe Rhodes <cs...@ca...> writes:

> Where is this out-of-memory coming from? Well, I'm not sure, but is
> it not possible that *message-queue* could grow without bounds?
> [snip]

Doh!  Silly me, I hadn't thought about this possibility.  This of course is a problem.

I tried to investigate this a bit and added a counter for the queue length.  The thing stopped now with:

  received: :print 381339 length: 29783
  received: :print 381340 length: 29782

top said that 2 sbcl processes were running, each taking about 50% of the CPU.  Memory size was about 50MB when the output stopped, but grew quickly.  I pressed C-\ at 100MB (this machine has 512MB, with 300MB in use), some output was flushed, and I got

  debugger invoked on a SB-INT:SIMPLE-CONTROL-ERROR in thread 11139:
    attempt to THROW to a tag that does not exist: SB-IMPL::TOPLEVEL-CATCHER

The output also contains things like:

  received: :print 381340 length: 29782
  length: 29782

and

  received: :print 381447
  received: :print 381447 length: 29675

That's strange, because the "received: ..." and "length: ..." bits are written by the same thread.

I don't know what all this means, but the simplest solution is probably to switch to the new kernel.

Thanks for all the help.

Helmut.
|
From: Helmut E. <e96...@st...> - 2004-02-05 17:28:58
|
David Lichteblau <da...@li...> writes:

> The Unix solution is to implement all stream functions twice and require
> people to write getc_unlocked instead of getc if they care about speed.
> See unlocked_stdio(3). Does anyone want that?

What I want is a simple execution model.  When I write (format t ...) I don't want to think about threading issues.  Multithreading should make my life easier, not harder.  Thinking about efficiency is useless if your program is not correct, and it seems that unsynchronized stream operations make many otherwise correct programs incorrect in a multithreaded environment.

I'm probably spoiled by Luke Gorrie's "Multithreading is a Bad Idea" rants, but the cost of auditing/rewriting all my code and all (system) libraries for thread safety seems very high: higher than the cost saved by using a simpler structure in multithreaded programs.

Just out of curiosity, have you (the SBCL hackers) considered implementing a threading model where memory is _not_ shared?

Helmut.
|
From: David L. <da...@li...> - 2004-02-05 18:08:55
|
[This is going slightly off-topic, but now that I've written it...]

Quoting Helmut Eller (e96...@st...):

> What I want is a simple execution model. When I write (format t ...)
> I don't want to think about threading issues. [...]

It would be nice if everything were thread-safe.  But Java started out with `synchronize' all over the place and had to give it up.  So we are talking about finding the right balance here, right?

Personally I would hope that low-level system parts like CLOS slot access are thread-safe.  When it comes to streams, I would accept that my streams cannot be shared between threads.  With that in mind, user code will be fine.

However, that does not work for the standard stream bindings which every thread inherits.  Most threads want to write to *STANDARD-OUTPUT* or *TRACE-OUTPUT*.  Perhaps it would be acceptable to wrap these streams in some kind of synonym stream which does the locking.  Bulk output to the terminal will always be slow (it is line-buffered, after all), so the overhead might be acceptable.

> I'm probably spoiled by Luke Gorrie's "Multithreading is a Bad Idea"
> rants [...]

:-)

> Just out of curiosity, have you (the SBCL hackers) considered to
> implement a threading model where memory is _not_ shared?

Hmm, isn't that the difference between `threads' and `processes'?  If they cannot access each other's memory, they are not threads.

If I understand Erlang's model correctly, the beauty of their environment is that data cannot be mutated anyway, so the distinction goes away (as long as the processes run in the same node).

d.
|
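[Editor's note: David's "wrap the standard streams in something that does the locking" idea could be sketched with Gray streams, which SBCL exposes as SB-GRAY. This is a hypothetical wrapper, not anything from SBCL itself, and the per-character locking it does is exactly the kind of thing Daniel argues is insufficient for message-level atomicity.]

```lisp
;; Hypothetical locking wrapper around an output stream, using Gray
;; streams.  It serializes individual character writes, preventing
;; low-level corruption but not interleaving of multi-write messages.
(defclass locked-output-stream (sb-gray:fundamental-character-output-stream)
  ((target :initarg :target :reader target)
   (lock :initform (sb-thread:make-mutex) :reader stream-lock)))

(defmethod sb-gray:stream-write-char ((stream locked-output-stream) char)
  (sb-thread:with-mutex ((stream-lock stream))
    (write-char char (target stream))))

(defmethod sb-gray:stream-line-column ((stream locked-output-stream))
  ;; We don't track column position in this sketch.
  nil)

;; A thread could then be given a locked view of the shared terminal:
;; (setf *standard-output*
;;       (make-instance 'locked-output-stream :target *standard-output*))
```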
From: Daniel B. <da...@te...> - 2004-02-05 18:16:28
|
Helmut Eller <e96...@st...> writes:

> What I want is a simple execution model. When I write (format t ...)
> I don't want to think about threading issues. [...]

Perhaps we should be clearer about what we're discussing here.

If two threads are writing to two different streams, that /should/ work, and should work without any changes to existing code or introduction of locks.  There may be bugs in SBCL here (though as I noted previously, they're fairly rare bugs if they do exist), and if so, bug reports/test cases are welcome.

If two threads are writing to the same stream simultaneously, I don't think you can _avoid_ thinking about threading issues: you're accessing a shared resource from two places.  We could arrange for correct interleaving of I/O requests at a low level, but if you have any kind of structure to your output (e.g. you don't want part of a message from A interleaved in a message from B; assume it may take multiple output operations to send a complete message), there's no good way to indicate this to CL.  So, as you have to do your own locking anyway, locks at the implementation level are not helping you.

(The thought occurred to me while writing this that maybe you could do this with FINISH-OUTPUT, but it seems like a kludge, and it still doesn't help at all with input arbitration, or with deciding what to do with file-position changes or any other stream attributes.)

> Just out of curiosity, have you (the SBCL hackers) considered to
> implement a threading model where memory is _not_ shared?

I guess you could run two copies of SBCL and pass messages between them over pipes.  It's not something I've thought about; I can't speak for other developers.

-dan

--
"please make sure that the person is your friend before you confirm"
|
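[Editor's note: Daniel's point that the application must do its own message-level locking could be sketched as below. The WITH-OUTPUT-MESSAGE macro and *OUTPUT-LOCK* are hypothetical helpers, not part of SBCL; this simply holds one application lock for the duration of a multi-operation message.]

```lisp
;; One application-level lock guarding whole messages: each
;; WITH-OUTPUT-MESSAGE body is emitted atomically with respect to
;; other threads using the same lock, regardless of how many
;; individual output operations the body performs.
(defvar *output-lock* (sb-thread:make-mutex))

(defmacro with-output-message ((stream) &body body)
  `(sb-thread:with-mutex (*output-lock*)
     ,@body
     (finish-output ,stream)))

;; Usage, e.g. in Helmut's RECEIVE-LOOP, so "received: ..." and the
;; counter always appear as one uninterrupted line:
;; (with-output-message (*standard-output*)
;;   (format t "received: ~S " message)
;;   (format t "~D~%" i))
```

This is exactly the structure implementation-level stream locks cannot provide, since only the application knows where a "message" begins and ends.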