From: Jim U. <ji...@3e...> - 2004-03-15 04:04:14
While experimenting with the tsunami Vector class, I ran into a C++
feature that will be old hat to C++ gurus but was not immediately
obvious to me.
Basically, instead of this:             you should do this:

void compute(Vector &v) {               void compute(Vector &v) {
    Vector vc;                              for (int i = 0; i < 10; i++) {
    for (int i = 0; i < 10; i++) {              Vector vc = v.normalize();
        vc = v.normalize();                     printf("%f\n", vc.x);
        printf("%f\n", vc.x);               }
    }                                   }
}
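The reason: v.normalize() constructs and returns a temporary object (the
normalized vector). When the result initializes a new object, the compiler
can build the return value directly in that object, so the temporary is
optimized away. In an assignment it cannot: the temporary must be built
first and then copied into the existing object by operator=. For the
nitty-gritty, see
http://blogs.msdn.com/slippman/archive/2004/02/03/66739.aspx.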
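
To make the difference visible without reading assembly, here is a rough
sketch that counts constructions and assignments. The Vector below is a
hypothetical stand-in, not the real tsunami class, and the counters and the
names compute_assign, compute_init and reset_counters are made up for this
illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Hypothetical stand-in for the tsunami Vector class; only the parts
// needed to count constructions and assignments are modelled.
struct Vector {
    float x, y, z;

    static int constructed;  // objects built from components (incl. default)
    static int copied;       // copy constructions
    static int assigned;     // copy assignments

    Vector(float x_ = 0, float y_ = 0, float z_ = 0) : x(x_), y(y_), z(z_) {
        ++constructed;
    }
    Vector(const Vector &o) : x(o.x), y(o.y), z(o.z) { ++copied; }
    Vector &operator=(const Vector &o) {
        x = o.x; y = o.y; z = o.z;
        ++assigned;
        return *this;
    }

    // Returns a new, normalized vector by value.
    Vector normalize() const {
        float len = std::sqrt(x * x + y * y + z * z);
        assert(len > 0.0f);
        return Vector(x / len, y / len, z / len);
    }
};

int Vector::constructed = 0;
int Vector::copied = 0;
int Vector::assigned = 0;

void reset_counters() {
    Vector::constructed = Vector::copied = Vector::assigned = 0;
}

// Assignment form: a temporary is built, then copied into vc by operator=.
void compute_assign(const Vector &v) {
    Vector vc;
    for (int i = 0; i < 10; i++) {
        vc = v.normalize();
        std::printf("%f\n", vc.x);
    }
}

// Initializer form: the return value can be built directly in vc.
void compute_init(const Vector &v) {
    for (int i = 0; i < 10; i++) {
        Vector vc = v.normalize();
        std::printf("%f\n", vc.x);
    }
}
```

Calling reset_counters() and then each function in turn, the assignment
form should report ten assignments plus one extra default construction,
and the initializer form none. The copied counter should normally stay at
zero in both, but before C++17 that elision is allowed rather than
required, so it can depend on the compiler and its flags.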
For the same reason, if you have a vector you need to normalize in
place, you should write "v.normalize_self()" instead of "v = v.normalize()",
if such a function is available.
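
For illustration only, an in-place normalize might look something like the
sketch below. This is a guess at the shape of such a function, not the
actual tsunami implementation, and the struct is again a hypothetical
stand-in.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the tsunami Vector class.
struct Vector {
    float x, y, z;

    // Returns a normalized copy; the caller receives a new object.
    Vector normalize() const {
        float len = std::sqrt(x * x + y * y + z * z);
        assert(len > 0.0f);
        Vector r = { x / len, y / len, z / len };
        return r;
    }

    // Normalizes this vector in place; no temporary Vector is created.
    Vector &normalize_self() {
        float len = std::sqrt(x * x + y * y + z * z);
        assert(len > 0.0f);
        x /= len;
        y /= len;
        z /= len;
        return *this;
    }
};
```

With this, "v.normalize_self();" gives the same components as
"v = v.normalize();" without building and copying a temporary.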
Unfortunately, even with the temporary optimized out, gcc still writes
the whole normalized vector to vc (on the stack) every single iteration,
even though only vc.x is used, and only once. I don't know why. Using
built-in types such as float may be faster, though less convenient, since
the values might never touch the stack at all.
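
As a sketch of what the float-only route could look like when just the x
component is needed (normalized_x and compute_x_only are hypothetical
names, not part of tsunami, and whether the value really stays in a
register depends on the compiler, target and inlining):

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Hypothetical stand-in for the tsunami Vector class.
struct Vector {
    float x, y, z;
};

// Hypothetical helper: computes only the x component of the normalized
// vector and returns it as a plain float, so no Vector is constructed.
float normalized_x(const Vector &v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0f);
    return v.x / len;
}

void compute_x_only(const Vector &v) {
    for (int i = 0; i < 10; i++) {
        float nx = normalized_x(v);
        std::printf("%f\n", nx);
    }
}
```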
Following is the assembly output for the end of the two loops (after the
vector length has been put in fr1), showing the divide and the printf.
The extra code in the left column is where the temporary is copied from
the stack to vc (also on the stack).
.L16:                                       .L16:
        ...                                 ...
        fdiv fr1,fr2                        fdiv fr1,fr2
        fdiv fr1,fr4                        fdiv fr1,fr5
        fmov.s fr2,@r14                     fmov.s fr2,@r14
        fdiv fr1,fr5                        fdiv fr1,fr4
        fmov.s fr4,@(r0,r15)                fmov.s fr5,@(r0,r15)
        mov #20,r0                          mov #8,r0
        fdiv fr1,fr3                        fdiv fr1,fr3
        fmov.s fr5,@(r0,r15)                fmov.s fr4,@r15
        mov #24,r0                          jsr @r13        // printf
        fmov.s fr3,@(r0,r15)                fmov.s fr3,@(r0,r15)
        mov.l @(16,r15),r1
        mov.l r1,@r15
        mov.l @(20,r15),r1
        fmov.s @r15,fr4
        mov.l r1,@(4,r15)
        mov.l @(24,r15),r1
        mov.l r1,@(8,r15)
        mov.l @(28,r15),r1
        jsr @r13        // printf
        mov.l r1,@(12,r15)
--
"The FBI said al-Rabeei's aliases may include: Fawaz Yahia Hassan Aribii,
Fawaz al-Rubai, Fawaz Yehia Hassan al-Rabie, Fawaz Yahya al-Rabi'i, Fawaz Yahya
al-Ribi (al-Ruba'i, al-Rabia'i, al-Rabi'i), Forqan al-Tajiki, Furgan al-Tajiki,
Furqan the Chechen, Faris al-Baraq, Sa'id, Musharraf, and Salem al-Frahan."
ji...@3e... / 0x43340710 / 517B C658 D2CB 260D 3E1F 5ED1 6DB3 FBB9 4334 0710