From: Julian S. <js...@ac...> - 2005-12-04 19:40:57
> This seems like a good choice since the survey found that around 40% of
> users clearly identified themselves as using P4s, as opposed to about 5%
> for P3s.

incl/decl are recommended don't-uses on P4s. Not that it made any
measurable difference at all.

I did write a small program to measure the branch-mispredict cost on P4,
and found it to be 21 cycles. I also established that P4 can only predict
one branch target address for an indirect jump (alternating between 2
different ones is worse, and cycling through 4 gives you the full 21-cycle
hit).

What this means is that each bb dispatch stalls for 21 cycles, which at an
IPC of 0.8 is worth 16-ish insns. Bad.

J
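
For readers who want to try something similar, here is a minimal sketch of the kind of test described above. It is not the original program, which was not posted. The names `bb0`..`bb3`, `run_dispatch` and `stall_cost_insns` are made up for illustration, and the stubs merely stand in for translated basic blocks. The loop makes one indirect call per iteration and cycles through 1 to 4 targets. The timing itself is left to the caller (for example a cycle counter read before and after), since the exact method used in the original measurement is not stated.

```c
#include <assert.h>

/* Hypothetical stand-ins for translated basic blocks. Each does a
   trivially small amount of work so the indirect call dominates. */
static unsigned long bb0(unsigned long x) { return x + 1; }
static unsigned long bb1(unsigned long x) { return x + 2; }
static unsigned long bb2(unsigned long x) { return x + 3; }
static unsigned long bb3(unsigned long x) { return x + 4; }

typedef unsigned long (*bb_fn)(unsigned long);

/* Dispatch `iters` times through one indirect call site, cycling
   through the first `n_targets` entries of the table (1..4).
   With n_targets == 1 the target never changes; with 2 or 4 it
   changes on every iteration. Returns a checksum so the work cannot
   be optimised away. */
static unsigned long run_dispatch(unsigned n_targets, unsigned long iters)
{
    static bb_fn const table[4] = { bb0, bb1, bb2, bb3 };
    bb_fn volatile next;    /* volatile: keep the call truly indirect */
    unsigned long acc = 0;
    unsigned long i;

    assert(n_targets >= 1 && n_targets <= 4);
    for (i = 0; i < iters; i++) {
        next = table[i % n_targets];
        acc = next(acc);
    }
    return acc;
}

/* Instruction slots lost to a stall: cycles stalled times the
   instructions per cycle that would otherwise have been sustained. */
static double stall_cost_insns(double stall_cycles, double ipc)
{
    return stall_cycles * ipc;
}
```

Timing `run_dispatch(1, n)`, `run_dispatch(2, n)` and `run_dispatch(4, n)` for a large `n` and dividing the difference by `n` gives a rough per-dispatch cost. `stall_cost_insns(21, 0.8)` is the arithmetic behind the "16-ish insns" figure (21 * 0.8 = 16.8).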