From: Daniel B. <da...@te...> - 2001-08-03 14:39:25
I hope your Go went well.

What kind of timeframes are there for a 0.7 release? My orbit is
gradually spinning back round to the position that I can do some more
SBCL work: there are a few Alpha infelicities that need fixing, the
posix-environ thing if it's not already done, and then I want to start
thinking about MP stuff. To be honest I'd rather start with the last
one, but if 0.7 release engineering stuff is planned during the next
few weeks, the other bits had better come first. Hence the question.

I believe that 0.7 will be the first actual release with Alpha support...

With regard to MP: I've found the messages from Raymond Wiker saying
that he's successfully ported the cmucl co-operative MP, and I've been
wondering how much work it would take to make it pre-empt. Here's an
off-the-top-of-the-head idea -

1) define locking macros that compile to no-ops if MP is not compiled
in (keep the overhead out when we don't need it)

2) rearrange the memory map to have _two_ static spaces: one of them
for MP-safe code and the other for unsafe code. Mangle the core
creation or purify stages (<handwave>somehow</handwave>) so that we
can specify into which space each file goes. Initially all files are
tagged as unsafe.

3) mprotect() the unsafe space at startup time so that accesses to it
trap. Create a trap handler (in C; on Linux it would probably be
another if clause in the sigsegv handler) that notices this, acquires
a global lock on behalf of the current thread, and unprotects the
unsafe space

4) the thread scheduler needs to check if it's changing to/from the
thread with the global lock, and mprotect/unprotect as appropriate

5) over time and as we find the bits with the biggest performance hit
from this approach, rewrite them with appropriate fine-granularity locks
and rebuild them into the safe static space.

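Steps 3 and 4 above might look roughly like the following in C. This is
only a minimal sketch under assumptions, not runtime code: it assumes
Linux-style mmap()/mprotect()/sigaction(), uses an anonymous mapping as a
stand-in for the unsafe static space, and uses a plain integer in place of
real scheduler state. The names unsafe_space, global_lock_owner,
current_thread, init_unsafe_space, switch_thread and release_global_lock
are made up for illustration.

```c
#define _DEFAULT_SOURCE              /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-ins for runtime state. */
static char *unsafe_space;           /* region holding MP-unsafe code/data */
static size_t unsafe_size;
static volatile sig_atomic_t global_lock_owner = -1;   /* -1: unlocked */
static volatile sig_atomic_t current_thread = 0;       /* set by scheduler */

/* Step 3: a fault inside the unsafe space takes the global lock for the
   current thread and unprotects the region; returning from the handler
   retries the faulting instruction. */
static void segv_handler(int sig, siginfo_t *info, void *context)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)unsafe_space;
    (void)sig;
    (void)context;
    if (addr >= base && addr < base + unsafe_size && global_lock_owner == -1) {
        global_lock_owner = current_thread;
        mprotect(unsafe_space, unsafe_size, PROT_READ | PROT_WRITE);
        return;
    }
    /* Either an unrelated fault, or another thread holds the lock. A real
       runtime would yield to the scheduler in the second case; this
       sketch just lets the default action (a crash) happen on retry. */
    signal(SIGSEGV, SIG_DFL);
}

/* Map the region with no access and install the handler. */
static int init_unsafe_space(size_t size)
{
    struct sigaction sa;
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && size % (size_t)page == 0);
    unsafe_space = mmap(NULL, size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (unsafe_space == MAP_FAILED)
        return -1;
    unsafe_size = size;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Step 4: on a context switch, only the lock holder sees the region. */
static void switch_thread(int next)
{
    if (global_lock_owner != -1) {
        int prot = (next == global_lock_owner) ? (PROT_READ | PROT_WRITE)
                                               : PROT_NONE;
        mprotect(unsafe_space, unsafe_size, prot);
    }
    current_thread = next;
}

/* Give the lock back and re-arm the trap. */
static void release_global_lock(void)
{
    mprotect(unsafe_space, unsafe_size, PROT_NONE);
    global_lock_owner = -1;
}
```

With this sketch, the first access to unsafe_space by a thread faults
once, the handler records that thread in global_lock_owner, and later
accesses run untrapped until release_global_lock() re-protects the region.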
I have no idea how well this would work in practice, but it would
allow a thread-safe (if slow) lisp right from the start, so at least
has the advantage that it could be developed incrementally.

Raymond wrote
> > in mp-tests.lisp. If anybody else is interested in looking at it,
> > I would be more than happy to share what I've got so far :-)

If you still have it, could you send me a copy? I'm basing my
comments on the CMUCL code, insofar as they're based on anything.

A propos of nothing, ww.telent.net (and thus CLiki) is now running on
SBCL and has been for a couple of weeks.

-dan

--
http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources
From: William H. N. <wil...@ai...> - 2001-08-03 18:36:59
On Fri, Aug 03, 2001 at 03:39:19PM +0100, Daniel Barlow wrote:
> I hope your Go went well.

Thanks for the kind wishes. Alas, it didn't actually go so well -- my
program lost all its games. Many of the interesting and important
parts of my program are still much too slow to play in competition
time, and what's left is terribly weak. Hopefully that will be
changing fairly soon, and when it does, the Computer Go Ladder
<http://www.cgl.ucsf.edu/go/ladder.html> will let me try it out even
without a face-to-face tournament.

> What kind of timeframes are there for a 0.7 release? My orbit is
> gradually spinning back round to the position that I can do some more
> SBCL work: there are a few Alpha infelicities that need fixing, the
> posix-environ thing if it's not already done, and then I want to start
> thinking about MP stuff. To be honest I'd rather start with the last
> one, but if 0.7 release engineering stuff is planned during the next
> few weeks, the other bits had better come first. Hence the question.
>
> I believe that 0.7 will be the first actual release with Alpha support...

I plan to put out 0.6.13 any time now, so that should be the first
release with Alpha support. After that my plan was to start doing the
various incompatible or large changes that I've talked about before,
and put some large fraction of them into 0.7.0.

I've fixed all the bugs on my old gotta-do-before-0.7.0 list, and the
only things pending before I can release 0.6.13 are the DEFKNOWN
weirdness that I wrote about in the last CVS commit message, and MNA's
INSPECT patch. And since a test compile of the DEFKNOWN-tidied version
finished successfully while I was working on this message, the
DEFKNOWN stuff shouldn't take too long. However, I like to play with
the system for a day or more before I actually do a release, so 0.6.13
will probably be no earlier than Monday.

My guess is that the interval between 0.6.13 and 0.7.0 will be on the
order of a month.

I'd like 0.7.0 to have in it at least
  * the incompatible changes anticipated in BUGS (e.g. ".fasl" instead
    of ".x86f", and probably INTERNAL-TIME-UNITS-PER-SECOND = 1000)
  * miscellaneous packaging changes (e.g. old CMU CL docs no longer
    bundled with SBCL sources)
  * IR1 interpreter gone, replaced with a trivial interpreter and
    FUNCALL of byte-compiled LAMBDA for the hard stuff
  * most of the renaming (e.g. systematizing names DEFINE-FOO, DEFBAR)
    that I talked about long ago
  * the code which has been accumulating in contrib/*-extras.lisp

Then sometime early in 0.7.x, but probably not in 0.7.0, I'd like to
straighten out the IR1-X bugs.

> With regard to MP: I've found the messages from Raymond Wiker saying
> that he's successfully ported the cmucl co-operative MP, and I've been
> wondering how much work it would take to make it pre-empt. Here's an
> off-the-top-of-the-head idea -
>
> 1) define locking macros that compile to no-ops if MP is not compiled
> in (keep the overhead out when we don't need it)
>
> 2) rearrange the memory map to have _two_ static spaces: one of them
> for MP-safe code and the other for unsafe code. Mangle the core
> creation or purify stages (<handwave>somehow</handwave>) so that we
> can specify into which space each file goes. Initially all files are
> tagged as unsafe.
>
> 3) mprotect() the unsafe space at startup time so that accesses to it
> trap. Create a trap handler (in C; on Linux it would probably be
> another if clause in the sigsegv handler) that notices this, acquires
> a global lock on behalf of the current thread, and unprotects the
> unsafe space
>
> 4) the thread scheduler needs to check if it's changing to/from the
> thread with the global lock, and mprotect/unprotect as appropriate
>
> 5) over time and as we find the bits with the biggest performance hit
> from this approach, rewrite them with appropriate fine-granularity locks
> and rebuild them into the safe static space.
>
> I have no idea how well this would work in practice, but it would
> allow a thread-safe (if slow) lisp right from the start, so at least
> has the advantage that it could be developed incrementally.

I suspect you'll still be subject to various kinds of problems in
the dynamic space, so that the mprotect() scheme won't be enough:
you'll need some hand-coded locks right from the start. E.g. when you
increase the size of a package enough, its data will end up in the
dynamic space; and the package operations aren't thread safe.

But although I've done some threaded development, it was always
threaded from the start: I have no experience with this kind of
transition from single-threaded code, and I can't really guess how
smoothly it might go.

> A propos of nothing, ww.telent.net (and thus CLiki) is now running on
> SBCL and has been for a couple of weeks.

Cool.

--
William Harold Newman <wil...@ai...>
Communication would be much more reliable if people would turn off
the gainy decompression. -- Del Cotter
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C B9 25 FB EE E0 C3 E5 7C
From: Daniel B. <da...@te...> - 2001-08-03 21:04:25
William Harold Newman <wil...@ai...> writes:

> I suspect you'll still be subject to various kinds of problems in
> the dynamic space, so that the mprotect() scheme won't be enough:
> you'll need some hand-coded locks right from the start. E.g. when you
> increase the size of a package enough, its data will end up in the
> dynamic space; and the package operations aren't thread safe.

I imagine you're right for some bits, but I don't think that's one of
them. The lock is set when execution enters the unsafe space[*], so
if one thread is executing either of FIND-SYMBOL or INTERN, any
attempt to call the other would be blocked. Given that user code is
probably not expected to mess about with package objects directly, I
think this is sufficient.

[*] and released after the same call returns; we are actually planning
to keep the lock until we get back to the trap handler, not just to
look at reg_PC on each context switch and see if it's in the unsafe
range.

The grungy bits, I'm expecting, would be data exported directly as
data: anything that comes back from the kernel that you're allowed to
mutate directly instead of using accessors.

And of course there are places that you'd want to introduce locks by
hand fairly early anyway: the obvious one being that if READ acquires
a global lock, you can't do any kind of IO in a background process
while the REPL is sitting there ...

-dan

--
http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources
From: Daniel B. <da...@te...> - 2001-08-18 22:54:20
Daniel Barlow <da...@te...> writes:

> With regard to MP: I've found the messages from Raymond Wiker saying
> that he's successfully ported the cmucl co-operative MP, and I've been
> wondering how much work it would take to make it pre-empt.

Raymond sent me a pointer to the files he was using, which after
reading I decided to reimplement completely. No, wait ... I can
explain ...

* Low-level

The guts of the CMUCL threading implementation are pretty much
retrofitted onto a single-threaded CMUCL. A process is represented by
a STACK-GROUP structure, which contains a bundle of storage areas for
the binding stack, alien stack, interrupt contexts, control stack,
eval stack, current catch block, current unwind-protect block and some
other stuff (sizes, indices into stacks etc; see the code) that
pertain to the given process.

The thread switch operation does these things:

0) unwind the binding stack

1) copy the current stacks into the storage areas for the current
thread.

2) copy from the storage areas for the new thread into the actual
stack space. Except for the control stack - see (4)

3) rewind the binding stack

4) call (sb-vm:control-stack-resume control-stack new-control-stack)
which is a VOP that copies the new thread's control stack into the
stack area, then jumps to the current pc of the new stack group.

What do we see here? A lot of data copying. I can think of no
convincing reason that we need to _copy the stack contents around_ on
a context switch when we could just update the pointers instead.

OK, I can think of a reason which some people might find appealing:
almost no changes needed to the C runtime. All it seems to need from
C-land is another variable in the *static-variables* list.

* High-level

On top of these stack groups are layered the user-visible processes
and the scheduler that chooses which one to run next.
It works, but:

- it's a fairly unexciting round-robin scheduler

- the only way for a process to tell the scheduler that it's waiting
on some event is for it to provide an "i am ready" predicate that the
scheduler runs periodically. So, if you have five threads waiting for
file output (let's say you're running a web server serving five slow
clients) you end up doing a zero-timeout select() call for _each
process_ on each loop through the scheduler.

* Platform dependencies

Probably not _that_ much of a big deal. The x86 has an alien stack
where other platforms have number stacks, and it has a control stack
that gets filled top-to-bottom, but that's not insuperable.

* (My) preferred alternative

I'm working on something which is a bit more intrusive at the basic
level. The {number | alien} and control stacks will be separately
allocated per thread, GC will be taught to scavenge in them/keep out
of them explicitly, and the rest of C-land and the assembler interface
glue will be told to use current_thread->current_{foo}_stack_pointer
where it presently has global variables.

On top of that, I'm thinking of an implementation similar to the
present one - but with a sane way to deal with processes waiting for
timers or file io (effectively, one select() call in the idle thread
which can deal with all the fds we're waiting for).

Note that (a) this doesn't deal with locking. I still think my icky
mprotect() scheme would work for that, but we'll have to see (b) this
doesn't deal with native (OS-level) threads, which won't co-exist well
with shallow binding. OS-level threads are (or should be) slower, but
will take advantage of SMP machines which we don't (can't) do with
userland threads.

Oh, and (c) it will probably be available for Alpha before x86 ;-)
But I certainly don't _intend_ to make it unportable.

Is anyone interested in (hacking on, using) this?

-dan

--
http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources
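
The "one select() call in the idle thread" idea above could be sketched
in C along these lines. It is an illustration under assumptions, not the
planned implementation: each waiting thread is assumed to wait for
readability on exactly one fd (the message mentions output and timers
too, which are left out here), and struct thread, thread_state and
idle_wait are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

enum thread_state { THREAD_RUNNABLE, THREAD_WAITING_FD };

/* Hypothetical per-thread scheduler record. */
struct thread {
    enum thread_state state;
    int wait_fd;                 /* valid only when state is WAITING_FD */
};

/* Called from the idle thread: gather the fds of every waiting thread
   into one set, make a single select() call, and mark the threads whose
   fd became readable as runnable. Returns the number of threads woken,
   0 on timeout, or -1 on error. With no waiting threads this simply
   sleeps for the timeout (and blocks forever if timeout is NULL). */
static int idle_wait(struct thread *threads, size_t n,
                     struct timeval *timeout)
{
    fd_set readfds;
    int maxfd = -1;
    int ready, woken = 0;
    size_t i;

    FD_ZERO(&readfds);
    for (i = 0; i < n; i++) {
        if (threads[i].state == THREAD_WAITING_FD) {
            assert(threads[i].wait_fd >= 0 &&
                   threads[i].wait_fd < FD_SETSIZE);
            FD_SET(threads[i].wait_fd, &readfds);
            if (threads[i].wait_fd > maxfd)
                maxfd = threads[i].wait_fd;
        }
    }

    ready = select(maxfd + 1, &readfds, NULL, NULL, timeout);
    if (ready <= 0)
        return ready;

    for (i = 0; i < n; i++) {
        if (threads[i].state == THREAD_WAITING_FD &&
            FD_ISSET(threads[i].wait_fd, &readfds)) {
            threads[i].state = THREAD_RUNNABLE;
            woken++;
        }
    }
    return woken;
}
```

Compared with running a zero-timeout select() per process on every pass,
the scheduler here makes one system call however many threads are
blocked, and it can sleep in that call when nothing is runnable.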