From: Maynard J. <may...@us...> - 2012-09-19 18:34:50
On 09/19/2012 12:50 PM, Florian Krohm wrote:
> On 09/19/2012 12:00 PM, Maynard Johnson wrote:
>> On 09/08/2012 10:33 PM, Florian Krohm wrote:
>>> On 09/08/2012 06:18 PM, Julian Seward wrote:
>>>
>>>> This is one of those things that has no effect for 99.99% of cases
>>>> but may have a big impact for the cases it hits. I suspect it
>>>> would be easy to write a test case that shows it -- a single
>>>> basic block that contains a lot of both kinds of instructions,
>>>> iterated over a lot, with suitable rounding mode setup beforehand.
>
>> Hi, Florian,
>> Finally got back to this. Sorry for the delay. I wrote a simple
>> testcase as described, using fmul and dmul in a loop, iterated over
>> 500,000 times. Valgrind executes the testcase in ~0.5 seconds. I did
>> a trace with the VEX_TRACE_ASM flag and ensured that the mtfsf
>> instruction was being done before the dmul and the fmul.
>> So it seems to me this is not a big impact. I'm inclined to do
>> nothing here, following the adage of "if it ain't broke, don't
>> fix it."
>
> Hey Maynard,
>
> it would be interesting to see the difference between your experiment
> and a modified test where the rounding mode is not set (just would have
> to disable that in isel) as accuracy is irrelevant. But for runtime to
> be meaningful for comparison you'd probably have to use many more
> iterations, say 20x..

With 10,000,000 iterations, running under unaltered valgrind several
times, I get an average runtime of ~1.81 seconds. If I return immediately
from host_ppc_isel.c:_set_FPU_rounding_mode, the average runtime is
~1.55 seconds.

-Maynard

>
> Cheers,
> Florian
>
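
For reference, the arithmetic behind the two averages quoted above can be sketched as follows. The variable names are illustrative only, and the inputs are the approximate timings from the message, so the results are rough estimates rather than measurements.

```python
# Rough cost estimate for setting the FPU rounding mode, using the
# approximate averages reported in the message above.
# All names here are illustrative; none come from valgrind itself.

ITERATIONS = 10_000_000      # loop iterations in the test case
T_UNALTERED = 1.81           # seconds, unaltered valgrind (approximate)
T_EARLY_RETURN = 1.55        # seconds, rounding-mode setup skipped (approximate)

# Total time attributable to the rounding-mode setup.
saved_seconds = T_UNALTERED - T_EARLY_RETURN

# The same saving as a share of the unaltered runtime.
saved_percent = 100.0 * saved_seconds / T_UNALTERED

# Average saving per loop iteration, in nanoseconds.
saved_ns_per_iteration = saved_seconds / ITERATIONS * 1e9

print(f"saved: {saved_seconds:.2f} s")
print(f"share of unaltered runtime: {saved_percent:.1f} %")
print(f"per iteration: {saved_ns_per_iteration:.0f} ns")
```

On these figures the rounding-mode setup accounts for roughly 0.26 seconds in total, about 14% of the unaltered runtime, or around 26 ns per loop iteration.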