[Math-atlas-devel] 3.7.8
From: <rw...@cs...> - 2004-07-24 21:33:38
Guys,

I just released 3.7.8, where I get a little faster for double precision
on the Efficeon. Here's an updated table:

                  PEAK  SSE2       dMM-ic        dMM-oc        dGEMM
                  ====  =========  ============  ============  ============
1.6Ghz Ham64      3200  3051(98%)  2984(93/98%)  2937(92/98%)  2805(88/96%)
2.8Ghz P4E        5600  5178(92%)  4492(80/87%)  4425(79/99%)  4303(77/97%)
1.0Ghz PIII       1000  --------   933(93%)      840(84/90%)   760(76/90%)
1.0Ghz Eff3.7.8   2000  1790(90%)  1595(80/89%)  1371(69/86%)  1280(64/93%)
  " asymptotic                                                 995(49/72%)
1.0Ghz Eff3.7.7   2000  1790(90%)  1514(76/85%)  1309(65/86%)  1201(60/92%)
  " asymptotic                                                 970(49/74%)

As you can see, the in-cache numbers are now quite good. Out-of-cache,
as might be expected on this arch, goes from bad to terrible. As before,
the main problem on this arch is the fact that large problems do not
perform as well as small problems. For all other archs, the peak dGEMM
number reported above is essentially the same as the asymptotic dGEMM
speed, but for the Efficeon, as you see, it is way under.

Cheers,
Clint
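For anyone reading the table cold, here is a small sketch of how the parenthesized percentages can be read. It assumes (this is not stated in the post) that in an entry like 1595(80/89%) the first figure is the percentage of the PEAK column and the second is the percentage of the column immediately to its left. The helper name `pct` and the round-half-up rule are my own choices for illustration; the numbers are copied from the Eff3.7.8 row.

```python
# Assumption: "x(a/b%)" means a = x as % of PEAK, b = x as % of the column to its left.
# pct() is a hypothetical helper, not part of ATLAS.
def pct(value, reference):
    """Return value as a whole-number percentage of reference (round half up)."""
    return int(100.0 * value / reference + 0.5)

# Figures copied from the Eff3.7.8 row of the table.
peak, sse2, dmm_ic, dmm_oc, dgemm = 2000, 1790, 1595, 1371, 1280

row = [sse2, dmm_ic, dmm_oc, dgemm]
previous = [peak, sse2, dmm_ic, dmm_oc]  # the column to the left of each entry

# One (percent of peak, percent of previous column) pair per entry.
pairs = [(pct(v, peak), pct(v, p)) for v, p in zip(row, previous)]

for v, (of_peak, of_prev) in zip(row, pairs):
    print(f"{v}: {of_peak}% of peak, {of_prev}% of previous column")
```

Under that reading, this row comes out as 90, 80/89, 69/86 and 64/93, which agrees with the percentages printed in the table for Eff3.7.8.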