From: Florian K. <br...@ac...> - 2012-09-07 14:01:03
On 09/07/2012 09:09 AM, Maynard Johnson wrote:
> On 09/06/2012 11:34 AM, Florian Krohm wrote:
>> I just ran into this while looking at ppc insn selection for inspiration
>> as to how to handle floating point rounding per FPC bits efficiently.
>>
>> So, in essence, the insn selector remembers the rounding mode of the
>> previous floating point operation in env->previous_rm and if the next
>> floating point operation uses the same rounding mode it does not need to
>> set the FPSCR bits again. This is a neat trick.
>> What puzzled me is that there is only a single previous rounding mode.
>> Wouldn't you want two of them? One for bfp and one for dfp?
>> Consider a sequence of insns where bfp and dfp operations alternate.
>> Assume further that all bfp insns use the same rounding mode and all dfp
>> insns use the same rounding mode. You'd be reloading the FPSCR bits for
>> every such insn because previous_rm would be different with each insn.
>> If you had two of them you would not have to reload ever (for this
>> particular sequence).
>>
>> The current implementation works, it's correct. I think it would be more
>> efficient if you had two previous rounding modes in the env struct.
> Florian,
> I'll reply for Carl here. Yes, I agree that two cached rounding modes would be more efficient.
> The improvement may be so small as to be negligible, but the only way I know to
> find out is to implement it. :-) I'll dig around for a testcase
> that exhibits the behavior you describe above so I can measure any
> improvement gained with the additional cached rounding mode.

It might be much less work to simply change the code :)
I have mostly given up myself on measuring the win of micro optimizations.
If there is a good argument in favour I go ahead and implement it. I
just measure that runtime does not get worse.

Florian
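
For illustration, here is a minimal sketch of the two-cache idea discussed in this thread. It is a toy model, not the actual insn selector code: all names (Env, FpKind, FpOp, count_reloads_single, count_reloads_split) are hypothetical, rounding modes are modelled as plain ints rather than IR expressions, and "emitting" the insn that sets the rounding bits is modelled by bumping a counter. It also assumes the bfp and dfp rounding fields can be set independently of each other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is a toy model, not the real code. */
typedef enum { FP_BINARY = 0, FP_DECIMAL = 1, FP_NUM_KINDS = 2 } FpKind;

#define RM_UNKNOWN (-1)   /* nothing cached yet */

typedef struct {
    FpKind kind;   /* bfp or dfp operation */
    int    rm;     /* rounding mode the operation needs */
} FpOp;

typedef struct {
    int      previous_rm[FP_NUM_KINDS];  /* one cached mode per FP kind */
    unsigned reloads;                    /* rounding-mode sets "emitted" */
} Env;

static void env_init(Env *env)
{
    env->previous_rm[FP_BINARY]  = RM_UNKNOWN;
    env->previous_rm[FP_DECIMAL] = RM_UNKNOWN;
    env->reloads = 0;
}

/* Set the rounding mode only if the cached mode for this kind differs. */
static void set_rounding_mode(Env *env, FpKind kind, int rm)
{
    assert(kind == FP_BINARY || kind == FP_DECIMAL);
    if (env->previous_rm[kind] == rm)
        return;                 /* cached mode still valid, emit nothing */
    env->previous_rm[kind] = rm;
    env->reloads++;             /* stand-in for emitting the set insn */
}

/* Two cached modes: one for bfp, one for dfp. */
static unsigned count_reloads_split(const FpOp *ops, size_t n)
{
    Env env;
    env_init(&env);
    for (size_t i = 0; i < n; i++)
        set_rounding_mode(&env, ops[i].kind, ops[i].rm);
    return env.reloads;
}

/* Single cached mode shared by bfp and dfp, for comparison. */
static unsigned count_reloads_single(const FpOp *ops, size_t n)
{
    int      prev_kind = -1;
    int      prev_rm   = RM_UNKNOWN;
    unsigned reloads   = 0;
    for (size_t i = 0; i < n; i++) {
        if ((int)ops[i].kind != prev_kind || ops[i].rm != prev_rm) {
            prev_kind = (int)ops[i].kind;
            prev_rm   = ops[i].rm;
            reloads++;
        }
    }
    return reloads;
}
```

For a sequence that alternates bfp and dfp operations, each with its own fixed rounding mode, the single-cache model reloads on every operation, while the split model reloads once per kind. Whether that difference is measurable in practice is exactly the open question in the thread.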