From: David L. R. <ra...@cs...> - 2009-12-20 23:34:51
|
Hi Nikodemus, First, thanks for your thoughtful and timely responses to my inquiries. I'm treating this email like a code review @ a large SWE company and inlining my responses. One new thing I should mention: the time out of the lutexes varies a lot. E.g, if I request a one second timeout, the lutex will timeout after 0.91 seconds. If I request a 60 second timeout, it timesout after 59.5 seconds. This strikes me as odd, but since an approximation is good enough for my use case, I'm tempted to ignore this issue for now. >> My main question concerns interrupts. The futex code checks for >> interrupts every 10 ms. Should I modify the lutex code to do the >> same? I did not include the checking for now, based on the idea that >> if such checks were necessary, it would have been in the lutex code in >> the first place. > > No, what you did is fine. The futex code you looked at was probably > the pthread-futex stuff, which emulates the Linux futex API on top of > pthreads for lesser operating systems. The real futex code that gets > used on Linux lives in src/runtime/linux-os.c. > > As long as we can, let the OS deal with the timeouts, since it can > hopefully do a better job. Okay. >> Another question concerns the return value of condition-wait when >> there was a timeout. As of now, the lutex code returns 1. This is >> slightly inconsistent with the current lutex behavior, where >> condition-wait always returns 0, if it returns a value at all. Do we >> care? It seems reasonable that the return value should be an >> unreliable indicator for whether a condition variable was signaled >> (for a reliable indicator, an object similar to CCL's >> semaphore-notification-object [search for "ccl >> make-semaphore-notification"] would be required). It looks to me like >> waiting without a timeout on futexes and lutexes return different >> values anyway. > > I think as long as we are living with deadlines in addition to local > timeouts, the only right thing is to call (SIGNAL-DEADLINE) on > timeout, because you easily cannot tell if the timeout was due to the > local timeout or the global deadline. > > So code that _expects_ timeouts will look something like > > (handler-case (condition-wait ... :timeout 0.1) > (timeout () (whatever-we-want-to-do-in-case-of-timeout)) I'm a bit confused as to what you mean by "local timeout" versus "global deadline?" Does "global deadline" refer to functions similar to the sb-impl:with-deadline that I see in tests/deadline.impure.lisp? My personal preference is to avoid this handler-case and instead (1) use the return value as an estimate of whether the signal was received and (2) a notification-object to know for certain. This notification-object should probably be added in a different patch. For all of my use cases, it doesn't matter whether the signal wasn't received due to timeout or for some other reason (like the thread being killed). If the signal was received, I do one thing; otherwise I do another. Anyway, I think your answer to my question in the previous paragraph will help address this concern. > while I think explicit return-values are in most cases much nicer for > timeouts, using serious-conditions to indicate them is more foolproof > in some ways. > > (Almost all of our synchronization primitives could do with some API > refactoring, but I don't think that's a prequisite for this patch. It > should also be noted that what Paul Khuong has cooking might supersede > lutexes sooner or later, but I'd rather take that risk than let this > fall on the floor...) > > Some comments on the patch: > >> + (assert (or (null timeout) (typep timeout 'integer))) > > It would be better to allow reals instead of integers. You should be > able to just call DECODE-TIMEOUT using the TIMEOUT argument -- though > it might be better to DECLAIM the FTYPE of CONDITION-WAIT approriately > as well. Incorporated (and tested) the suggestion about making it a real. Does the following declamation look right? I've never made one before, so it'd be good if someone could check me please. (declaim (ftype (function ((unsigned-long) (unsigned-long) (real)) condition-wait))) (defun condition-wait (queue mutex &key timeout) ...) >> - ((1) (signal-deadline)) >> + ((1) (if timeout (return) (signal-deadline))) > > As discussed above. Similar code needed for the SB-LUTEX case as well. Confused, as above. I think you're suggesting I remove the "(if timeout (return)" part and require the calling code to handle the timeout? >> -CFLAGS = -g -Wall -Wsign-compare -O3 >> +CFLAGS = -g -Wall -Wsign-compare -O3 -D_GNU_SOURCE > > Since AFAIK this is only necessary for being able to build the lutex > code on Linux, I think this is probably better left out of the final > patch. (...but as John Fremlin noted just yesterday, we should really > be inhering CFLAGS from environment, or at least provide a way to > explicitly pass them to the build for things like defining > non-standard SBCL_HOME. Whatever you all think is best is fine by me. I anticipate that my "customers" will use the default build option (which incidentally is a small argument for enabling threads as the default build option, but I do not want to derail my own thread). >> + /* computation of timeout taken from pthread-futex.c */ >> + if (sec >= 0) { >> + ret = gettimeofday(&tv, NULL); >> + if (ret != 0) >> + return ret; >> + timeout.tv_sec = tv.tv_sec + sec + (tv.tv_usec + usec) / 1000000; >> + timeout.tv_nsec = (tv.tv_usec + usec) % 1000000; >> + } > > In and ideal world we might abstract this out, but as OAOO violations > and SBCL go this is pretty small. :) Agreed. > All in all, this is looking pretty good. The biggest thing missing are > test-cases. :) Some of the test-cases in tests/deadline.impure.lisp > should be enabled for SB-LUTEX when this code and the timeout > signalling is in place. I'm happy to add test cases as part of the patch. I think I'll add something to one of the sb-queue contribs? Just looking at the deadline.impure.lisp file, I'm a bit confused, because I haven't yet found the documentation for with-deadline. > (No, I didn't actually run any of this yet.) The tests that are already there broke when I did some things incorrectly, so we have some sort of assurance. I assume you'll want something other than an email to test a patch. I setup a git repository, so the cvs-export option should work? Thanks, David |