From: shadoi <sh...@na...> - 2006-09-26 03:08:51
David Sharp wrote:
> On 9/25/06, The Rasterman Carsten Haitzler <ra...@ra...> wrote:
>> On Mon, 25 Sep 2006 10:50:09 -0700 Blake Barnett <sh...@na...>
>> babbled:
>>
>> > On Sep 25, 2006, at 10:44 AM, David Sharp wrote:
>> >
>> > > amd64 users, rejoice!
>> > >
>> > > i got tired of slowing down e with -mfpmath=387 all the time, so i
>> > > finally dug in to this bug after realizing it was probably a problem
>> > > with floating point cancelation (occurs when subtracting two numbers
>> > > that are very near equal, and causes an extreme loss of precision).
>> > >
>> > > the problem is in _edje_part_recalc() when it linearly interpolates
>> > > all the part parameters. all of the calculations are of this form:
>> > > p3.x = (p1.x * (1.0 - pos)) + (p2.x * (pos));
>> > > i believe there is some cancelation occurring in the (1.0-pos) part of
>> > > this, especially as pos approaches 1.0.
>> > >
>> > > replacing the above line with the following fixes the problem:
>> > > p3.x = p1.x + (p2.x - p1.x) * pos;
>> > > mathematically equivalent, but, alas, computers aren't as good at math
>> > > as we think.
>> >
>> > Awesome! My 3500+ CPU has felt fairly sluggish in E, my video card
>> > has 128MB of video memory (ATI 9700PRO, using ATI's drivers) and I've
>> > noticed that things just aren't as fluid as in Raster's video
>> > captures. Hopefully this'll help. Nice work.
>>
>> it won't make any difference you can measure. the amount of fp math done is
>> < 0.1% of the work - by using sse math you lose a bit of precision and maybe
>> gain 50% speedup - on that < 0.01%. frankly- you will not be able to even
>> measure the speedup let alone notice it. it's a fallacy to think it will
>> help. trust me.
>
> yah, it will only save about 37 instructions (b/c the new formulas use
> one less operation, and there are 37 of them), and the sse
> instructions will only be slightly faster than the FPU ones. that's a
> savings of about, say 25 ns each time _edje_part_recalc is run.
>
> btw, you gave blake my credit in the CVS log... :(

Bummer. I guess ATI's drivers just aren't as fast as Nvidia's. At
least we don't need a special case for amd64 when building packages now.

-Blake
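
For anyone who wants to poke at the two interpolation forms quoted above outside of Edje, here is a minimal standalone sketch. The function names (lerp_weighted, lerp_delta, lerp_disagreement, lerp_selfcheck) are made up for illustration and are not from the Edje source, and plain doubles are assumed; the real types in _edje_part_recalc() may differ.

```c
#include <assert.h>

/* Form quoted as the original: weights p1 by (1.0 - pos) and p2 by pos. */
double lerp_weighted(double p1, double p2, double pos)
{
    return (p1 * (1.0 - pos)) + (p2 * pos);
}

/* Form quoted as the replacement: start at p1, add a fraction of the delta. */
double lerp_delta(double p1, double p2, double pos)
{
    return p1 + (p2 - p1) * pos;
}

/* Absolute difference between the two forms for the same inputs. */
double lerp_disagreement(double p1, double p2, double pos)
{
    double d = lerp_weighted(p1, p2, pos) - lerp_delta(p1, p2, pos);
    return d < 0.0 ? -d : d;
}

/* Sanity check on inputs that are exactly representable in binary,
 * so both forms must agree exactly regardless of FPU mode. */
void lerp_selfcheck(void)
{
    assert(lerp_weighted(3.0, 7.0, 0.0) == 3.0);
    assert(lerp_delta(3.0, 7.0, 0.0) == 3.0);
    assert(lerp_weighted(3.0, 7.0, 0.5) == 5.0);
    assert(lerp_delta(3.0, 7.0, 0.5) == 5.0);
    assert(lerp_disagreement(3.0, 7.0, 1.0) == 0.0);
}
```

Building this once with -mfpmath=387 and once with -mfpmath=sse and calling lerp_disagreement() with pos close to 1.0 is one way to look for inputs where the two forms round differently. The sketch makes no claim about which inputs trigger the bug discussed in this thread.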