From: William H. N. <wil...@ai...> - 2001-08-30 23:15:45
On Thu, Aug 30, 2001 at 11:38:34PM +0100, Daniel Barlow wrote:
> "Thomas F. Burdick" <tfb@OCF.Berkeley.EDU> writes:
>
> > There's a thread on comp.lang.lisp where someone's asking about how to
> > micro-optimize his code in LispWorks, and it got me wondering about
> > how to do that in SBCL. I don't remember ever seeing this in the
> > CMUCL manual, but I apologize if it's covered there. So, if I had an
> > inner loop that I couldn't get the compiler to produce sufficiently
> > optimized code for, how would I go about writing it in assembly?
>
> It's not covered in the manual, and I doubt it's officially supported
>                                                  ^^^^^^^^^^^^^^^^^^^^
> either. But it's something I spent some time with recently, so here's
> the short guide. Note that this is largely based on experimentation
> and looking at the source code, so is quite likely all wrong. People
> who know more than me are welcome to chime in.

Officially supported? Not likely. Trying to ensure the stability of
the interface to this kind of thing looks like a nightmare.

However, some kinds of performance hacks could be officially
supported. If there's a relevant optimization that the compiler could
make and doesn't, and you can teach the compiler to make it, I'd
probably be happy to make it a permanent part of SBCL. Examples of
this kind of thing in the reasonably recent past include
  * transforms for TRUNCATE, FLOOR, and friends to avoid consing,
    added to CMU CL by Raymond Toy (I think) and ported to SBCL
  * my new transforms for MAP, for quantifiers (e.g. EVERY), and
    for adjustable arrays
  * my tweaks in the type system so that keywords and other
    intersection types can be handled better
So if, e.g., you need to do some kind of operation on bit vectors and
SBCL is currently doing it badly, consider teaching SBCL to optimize
it better and submitting the optimization as a patch, rather than
writing your optimized version in inline assembly language.
(Occasionally this can be pretty straightforward, e.g. if CMU CL
compiles your bottleneck code better than SBCL, figure out why and
port it.)

--
William Harold Newman <wil...@ai...>
"Smooth duct tape: the mark of a true craftsman."
  -- someone at Bettis Lab, quoted by my father
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C B9 25 FB EE E0 C3 E5 7C
From: Thomas F. B. <tfb@OCF.Berkeley.EDU> - 2001-08-31 01:59:06
William Harold Newman writes:
> On Thu, Aug 30, 2001 at 11:38:34PM +0100, Daniel Barlow wrote:
>
> > It's not covered in the manual, and I doubt it's officially supported
>                                                    ^^^^^^^^^^^^^^^^^^^^
> > either. But it's something I spent some time with recently, so here's
> > the short guide. Note that this is largely based on experimentation
> > and looking at the source code, so is quite likely all wrong. People
> > who know more than me are welcome to chime in.
>
> Officially supported? Not likely. Trying to ensure the stability of
> the interface to this kind of thing looks like a nightmare.

This seems like it would be a really good thing to keep documented,
even if marked as "this will likely change radically as time goes on".
For some things (mostly systems-level stuff), it's vital to be able to
drop into assembly. I've been really enjoying SBCL so far, and now
I'm quite encouraged that I'll be able to use it for pretty much
everything. I'd hate for anyone to be put off by the lack of a
documented assembly interface.

> However, some kinds of performance hacks could be officially
> supported. If there's a relevant optimization that the compiler could
> make and doesn't, and you can teach the compiler to make it, I'd
> probably be happy to make it a permanent part of SBCL. Examples of
> this kind of thing in the reasonably recent past include
>   * transforms for TRUNCATE, FLOOR, and friends to avoid consing,
>     added to CMU CL by Raymond Toy (I think) and ported to SBCL
>   * my new transforms for MAP, for quantifiers (e.g. EVERY), and
>     for adjustable arrays
>   * my tweaks in the type system so that keywords and other
>     intersection types can be handled better
> So if, e.g., you need to do some kind of operation on bit vectors and
> SBCL is currently doing it badly, consider teaching SBCL to optimize
> it better and submitting the optimization as a patch, rather than
> writing your optimized version in inline assembly language.

Unless I'm mistaken, this is a somewhat orthogonal facility. If I'm
worried about bit-twiddling in an inner loop, I'd probably try to get
the compiler to generate better bit-twiddling code. However, sometimes
knowing the exact circumstances you're implementing some algorithm in
can allow you to write overly-clever assembly, often saving 5-10% in
speed. I don't think I'd *want* this sort of intelligence in a
compiler, even if it would be feasible to implement.

Mostly I wanted to know that I *could* write stuff by hand on a
per-case basis. I'll come back when/if I find myself actually doing
this in SBCL :)
From: William H. N. <wil...@ai...> - 2001-08-31 13:54:25
On Thu, Aug 30, 2001 at 06:59:02PM -0700, Thomas F. Burdick wrote:
> William Harold Newman writes:
>
> > On Thu, Aug 30, 2001 at 11:38:34PM +0100, Daniel Barlow wrote:
> >
> > > It's not covered in the manual, and I doubt it's officially supported
> >                                                    ^^^^^^^^^^^^^^^^^^^^
> > > either. But it's something I spent some time with recently, so here's
> > > the short guide. Note that this is largely based on experimentation
> > > and looking at the source code, so is quite likely all wrong. People
> > > who know more than me are welcome to chime in.
> >
> > Officially supported? Not likely. Trying to ensure the stability of
> > the interface to this kind of thing looks like a nightmare.
>
> This seems like it would be a really good thing to keep documented,
> even if marked as "this will likely change radically as time goes on".
> For some things (mostly systems-level stuff), it's vital to be able to
> drop into assembly. I've been really enjoying SBCL so far, and now
> I'm quite encouraged that I'll be able to use it for pretty much
> everything. I'd hate for anyone to be put off by the lack of a
> documented assembly interface.

Yes, I'm glad that Dan wrote up a description and that it'll be in the
mailing list archives, and if there's a lot of interest it might even
end up more prominently archived somewhere (contrib/ or something).

But it's IMHO a very small niche. The only things I can think of that
I've ever been tempted to write as inline assembly are things like
fiddling with processor flag bits. I can see writing a new VOP for
%SET-FOO-INTERRUPT-STATE or something. But in my experience, the
overwhelming majority of other stuff can and often should be written
by calling out to a subroutine.

Speaking of which, if you want to call out to an assembly language
subroutine, check out the DEFINE-ASSEMBLY-ROUTINE macro (also
unsupported).

And finally, in case you, or anyone else reading this, doesn't know,
calling out to C using SB-ALIEN and SB-C-CALL *is* supported.

--
William Harold Newman <wil...@ai...>
"Smooth duct tape: the mark of a true craftsman."
  -- someone at Bettis Lab, quoted by my father
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C B9 25 FB EE E0 C3 E5 7C
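
[Archive note] For readers who haven't used the supported C interface
mentioned in the last message, here is a minimal sketch of calling a C
function through SB-ALIEN. It is not from the thread. It assumes a
reasonably recent SBCL, where the C type names are exported from
SB-ALIEN (in releases contemporary with this thread they may have
lived in SB-C-CALL instead), and a C library that exports strlen. The
Lisp-side name C-STRLEN is made up for the illustration.

```lisp
;; Assumption: recent SBCL, C library exporting strlen.
;; C-STRLEN is a hypothetical Lisp-side name chosen for this sketch.
(sb-alien:define-alien-routine ("strlen" c-strlen) sb-alien:unsigned-long
  ;; The Lisp string is converted to a NUL-terminated C string for the call.
  (s sb-alien:c-string))

;; Usage: call it like an ordinary Lisp function.
(c-strlen "hello")   ; => 5
```

Calling out like this keeps the hot code in an ordinary C (or
assembly) subroutine, which is the approach the message above
recommends over inline assembly for most cases.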