From: Cyrus H. <ch...@bo...> - 2005-10-25 03:45:17
Attachments:
sbcl-20051024.3-ppc-callback-stack-alignment.patch
|
The following patch uses a lot less stack space by not creating an extra stack frame; it stores only reg_NFP (r13 for darwin, r12 otherwise) plus enough padding to align the portion of the stack used to store this register to a 16-byte boundary. Also, this patch is a bit less "invasive" than the last and is much shorter. But this leaves a couple of questions:

1. Should the linkage table stuff restore r13 so we don't have to do this?
2. Is it OK not to build a whole extra stack frame for this? Will it mess up debuggability to do it this way?
3. Why is reg_NFP 13 for darwin and 12 for linux? Raymond Toy said (I think) that cmucl uses r13 for NFP on both platforms now. Can we do the same?

Thanks,
Cyrus |
From: Cyrus H. <ch...@bo...> - 2005-10-25 04:10:28
|
I forgot one more thing... in ppc-arch.c, LINKAGE_ADDR_REG isn't used. Can we just delete it, or is it used somewhere else?

Thanks,
Cyrus |
From: Nikodemus S. <tsi...@cc...> - 2005-10-25 07:25:47
|
On Mon, 24 Oct 2005, Cyrus Harmon wrote:

> The following patch uses a lot less stack space by not creating an extra
> stack frame and only storing reg_NFP (r13 for darwin, r12 otherwise) and
> enough padding to align the portion of the stack used to store this register
> to a 16-byte boundary. Also, this patch is a bit less "invasive" than the
> last and is much shorter. But this leaves a couple questions:
>
> 1. should the linkage table stuff restore r13 so we don't have to do this?

See my previous message: there should be no need for this in SBCL at all: normal calls to C aren't affected by this, and funcall3 doesn't go thru the linkage table.

A test case is needed, however.

> 2. is it ok not to build a whole extra stack frame for this? will it mess up
> debuggability to do it this way?

No idea.

> 3. why is reg_NFP 13 for darwin and 12 for linux? Raymond Toy said (I think)
> that cmucl uses r13 for NFP on both platforms now. Can we do the same?

No idea.

Cheers,

-- Nikodemus
Schemer: "Buddha is small, clean, and serious."
Lispnik: "Buddha is big, has hairy armpits, and laughs." |
From: Cyrus H. <ch...@bo...> - 2005-10-25 08:09:30
|
Great! So making sure that the callback stack is aligned on a 16-byte boundary should be sufficient. Here's a patch to do just that:

Thanks,
Cyrus

--- c-call.lisp 19 Oct 2005 00:31:17 -0700 1.12
+++ c-call.lisp 25 Oct 2005 01:07:25 -0700
@@ -400,7 +400,8 @@
 (flet ((make-gpr (n)
          (make-random-tn :kind :normal :sc (sc-or-lose 'any-reg) :offset n))
        (make-fpr (n)
-         (make-random-tn :kind :normal :sc (sc-or-lose 'double-reg) :offset n)))
+         (make-random-tn :kind :normal :sc (sc-or-lose 'double-reg) :offset n))
+       (round-up-16 (n) (* 16 (ceiling n 16))))
   (let* ((segment (make-segment)))
     (assemble (segment)
       ;; To save our arguments, we follow the algorithm sketched in the
@@ -460,7 +461,7 @@
          (args-size (* 3 n-word-bytes))
          ;; FIXME: n-frame-bytes?
          (frame-size
-          (+ n-foreign-linkage-area-bytes n-return-area-bytes args-size)))
+          (round-up-16 (+ n-foreign-linkage-area-bytes n-return-area-bytes args-size))))
     (destructuring-bind (sp r0 arg1 arg2 arg3 arg4)
         (mapcar #'make-gpr '(1 0 3 4 5 6))
       (flet ((load-address-into (reg addr)

On Oct 25, 2005, at 12:25 AM, Nikodemus Siivola wrote:

> On Mon, 24 Oct 2005, Cyrus Harmon wrote:
>
>> The following patch uses a lot less stack space by not creating an
>> extra stack frame and only storing reg_NFP (r13 for darwin, r12
>> otherwise) and enough padding to align the portion of the stack
>> used to store this register to a 16-byte boundary. Also, this
>> patch is a bit less "invasive" than the last and is much shorter.
>> But this leaves a couple questions:
>>
>> 1. should the linkage table stuff restore r13 so we don't have to
>> do this?
>
> See my previous message: there should be no need for this in SBCL
> at all: normal calls to C aren't affected by this, and funcall3
> doesn't go thru the linkage table.
>
> A test case is needed, however.
>
>> 2. is it ok not to build a whole extra stack frame for this? will
>> it mess up debuggability to do it this way?
>
> No idea.
>
>> 3. why is reg_NFP 13 for darwin and 12 for linux? Raymond Toy said
>> (I think) that cmucl uses r13 for NFP on both platforms now. Can
>> we do the same?
>
> No idea.
>
> Cheers,
>
> -- Nikodemus
> Schemer: "Buddha is small, clean, and serious."
> Lispnik: "Buddha is big, has hairy armpits, and laughs."
>
> -------------------------------------------------------
> This SF.Net email is sponsored by the JBoss Inc.
> Get Certified Today * Register for a JBoss Training Course
> Free Certification Exam for All Training Attendees Through End of 2005
> Visit http://www.jboss.com/services/certification for more information
> _______________________________________________
> Sbcl-devel mailing list
> Sbc...@li...
> https://lists.sourceforge.net/lists/listinfo/sbcl-devel
|
From: Nikodemus S. <tsi...@cc...> - 2005-10-25 08:15:03
|
On Tue, 25 Oct 2005, Cyrus Harmon wrote:

> Great! So making sure that the callback stack is aligned on a 16-byte
> boundary should be sufficient. Here's a patch to do just that:

Thanks. I'm still slightly confused as to why this matters: after the callback (modulo funcall3) we're in Lisp, which doesn't require the 16-byte alignment.

Does the misaligned callback cause further calls out to C to be also misaligned, or where's the rub?

Cheers,

-- Nikodemus
Schemer: "Buddha is small, clean, and serious."
Lispnik: "Buddha is big, has hairy armpits, and laughs." |
From: Cyrus H. <ch...@bo...> - 2005-10-25 08:20:47
|
Well, the last time I tried to fix the alignment when calling out to C, I was told to find the underlying problem and fix it. In that same vein, I'm trying to keep the number stack 16-byte aligned for callbacks as well. Yes, the only (known) danger is on further calls out to C, but my patch to fix the alignment down there was never incorporated. I'm not sure if it matters while we're strictly in Lisp, but Christophe seemed to indicate a preference for making sure the stack stays aligned in the first place.

Thanks again,
Cyrus |
From: Christophe R. <cs...@ca...> - 2005-10-25 08:34:40
|
Nikodemus Siivola <tsi...@cc...> writes:

> Thanks. I'm still slightly confused as to why this matters: after the
> callback (modulo funcall3) we're in Lisp, which doesn't require the 16
> byte alignment.
>
> Does the misaligned callback cause further calls out to C to be also
> misaligned, or where's the rub?

I think it's possible -- I'd certainly like to see a test case for lisp -> C -> lisp -> C, preferably in a way which would excite the 16-byte alignment (so maybe the second C call could be the equivalent of "return *reg_NFP;", if we can't think of another way).

In any case, given that it's currently relatively straightforward, if pathological, to call out to C at more-or-less any time[*], I think it would be worth keeping the C stack 16-byte aligned at all times; I don't know if Cyrus' patch is required for that or not, but it plausibly is.

Cheers,
Christophe |
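To make the question concrete, here is a minimal Common Lisp sketch of the arithmetic; it is not code from SBCL. It assumes the stack pointer is 16-byte aligned on entry to the callback wrapper and that the wrapper lowers it by frame-size bytes. The byte counts (24-byte linkage area, 8-byte return area, 4-byte words), the starting address, and all function names are illustrative assumptions.

```lisp
;; Illustrative values only: assumptions for this sketch,
;; not read from the SBCL source.
(defparameter *linkage-area-bytes* 24)
(defparameter *return-area-bytes* 8)
(defparameter *word-bytes* 4)

(defun aligned-16-p (address)
  "True when ADDRESS is a multiple of 16."
  (zerop (logand address 15)))

(defun unpadded-frame-size ()
  "Frame size before the patch: the plain sum of the three areas."
  (+ *linkage-area-bytes* *return-area-bytes* (* 3 *word-bytes*)))

(defun padded-frame-size ()
  "Frame size rounded up to the next multiple of 16, as the patch does."
  (logandc2 (+ (unpadded-frame-size) 15) 15))

(defun sp-after-frame (sp frame-size)
  "The stack grows downward, so allocating a frame subtracts its size from SP."
  (- sp frame-size))

;; Start from an aligned (hypothetical) stack pointer and compare the two cases.
(let ((sp #x7FFF0000))
  (list (aligned-16-p (sp-after-frame sp (unpadded-frame-size)))  ; 44-byte frame
        (aligned-16-p (sp-after-frame sp (padded-frame-size)))))  ; 48-byte frame
```

With these assumed numbers the unpadded frame leaves the stack pointer off a 16-byte boundary and the padded one keeps it aligned. Any C function called from inside the callback would start from whichever stack pointer the wrapper left behind, which is the lisp -> C -> lisp -> C path a test case would need to exercise.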
From: Cyrus H. <ch...@bo...> - 2005-10-25 08:39:02
|
I don't have a nice reproducible test to demonstrate that the patch is required, but I can say that empirical testing, involving dropping into gdb at opportune times, suggests that without the patch the stack is indeed misaligned in callbacks and in C code called from callbacks. Clearly, some solid evidence would be better than this hearsay, but that's all I've got at the moment.

Cyrus |
From: Christophe R. <cs...@ca...> - 2005-10-25 08:16:04
|
Cyrus Harmon <ch...@bo...> writes:

> Great! So making sure that the callback stack is aligned on a 16-byte
> boundary should be sufficient. Here's a patch to do just that:

> + (round-up-16 (n) (* 16 (ceiling n 16))))

I think the 'standard' sbcl idiom for doing this is

  (logandc2 (+ n 15) 15)

which may or may not be microefficient, but in any case is vaguely consistently used over the codebase.

Cheers,
Christophe |
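For reference, the two spellings compute the same thing. The sketch below, with hypothetical function names, puts the CEILING form from the patch next to the LOGANDC2 form and checks that they agree over a range of sizes.

```lisp
(defun round-up-16/ceiling (n)
  "Round N up to a multiple of 16 via CEILING, as in the first patch."
  (* 16 (ceiling n 16)))

(defun round-up-16/mask (n)
  "Round N up to a multiple of 16 by adding 15 and clearing the low four bits."
  (logandc2 (+ n 15) 15))

;; Both forms should agree for every size from 0 to 1024.
(loop for n from 0 to 1024
      always (= (round-up-16/ceiling n) (round-up-16/mask n)))
```

The mask form only works because 16 is a power of two; the CEILING form would also work for other alignments, which is not needed here.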
From: Cyrus H. <ch...@bo...> - 2005-10-25 08:21:21
|
Fine by me. I was just aping what rtoy had done for cmucl, minus the new stack frame and all.

Cyrus |
From: Cyrus H. <ch...@bo...> - 2005-11-04 17:29:39
|
Here's an updated patch that uses the preferred SBCL idiom.

Thanks,
Cyrus

--- c-call.lisp 19 Oct 2005 00:31:17 -0700 1.12
+++ c-call.lisp 04 Nov 2005 08:10:39 -0800
@@ -460,7 +460,9 @@
          (args-size (* 3 n-word-bytes))
          ;; FIXME: n-frame-bytes?
          (frame-size
-          (+ n-foreign-linkage-area-bytes n-return-area-bytes args-size)))
+          (logandc2 (+ (+ n-foreign-linkage-area-bytes n-return-area-bytes args-size)
+                       +stack-alignment-bytes+)
+                    +stack-alignment-bytes+)))
     (destructuring-bind (sp r0 arg1 arg2 arg3 arg4)
         (mapcar #'make-gpr '(1 0 3 4 5 6))
       (flet ((load-address-into (reg addr)
|
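One point worth checking when reading this version: the idiom rounds correctly only if the constant holds the alignment minus one (15 for 16-byte alignment), not the alignment itself. The patch does not show how +stack-alignment-bytes+ is defined, so the sketch below uses a hypothetical align-up helper to show the difference rather than asserting what SBCL defines.

```lisp
(defun align-up (n mask)
  "Add MASK to N and clear the bits set in MASK, as the patch's idiom does.
The result is a multiple of (1+ MASK) only when MASK is one less than a
power of two."
  (logandc2 (+ n mask) mask))

(list (align-up 44 15)    ; mask 15: 44 is rounded up to 48
      (align-up 44 16))   ; passing 16 itself: 60 with bit 4 cleared is 44
```

If the constant turned out to be 16 rather than 15, the frame size in this example would stay at 44 and the patch would have no effect.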