Re: [Math-atlas-devel] ATLAS 3.7.7: now more effici^Hent
From: <rw...@cs...> - 2004-07-18 20:53:58
Jeff,

>I didn't know it had thermal throttling... something akin to the P4.
>It hit 67C during your testing even with a box fan in front of it?
>I think that's kind of hot. I hope it was the battery. If you get a
>few minutes and feel inclined, could you try running it without
>the battery in it?

I yanked the battery, and ran the below timing run without the box fan,
and you can heat this guy up:

>[rwhaley@localhost throttle]$ cat /proc/acpi/thermal_zone/THRM/temperature
>temperature: 69 C
>[rwhaley@localhost throttle]$ cat /proc/acpi/thermal_zone/THRM/temperature
>temperature: 73 C

Was 49C on boot, and 65C at the start of this stress test.

>Also, I don't think any laptops shipping with Efficeon have fans on
>the CPU - just a simple heatsink. Perhaps with a small fan they
>could get the temps down.

Yeah, this is a 2 lb laptop, so no fan & no room for a lot of breathing.
It amazes me it hasn't exploded under the load.

>That's the weird thing about working on one - it's not a true x86.
>I've been pushing Transmeta as best I can to consider an x86-64
>translation. I'm not sure if it's worth it, but the idea of being
>able to run an x86 or an x86-64 OS on the exact same box is
>kind of appealing.

You mean like you can on a Hammer already? :) I'd also like to see x86-64,
just 'cause then I'd have enough registers to fully pipeline the FPU, to see
if that would get performance up. Even better would be a real 3-operand
assembly, like one of the RISC ISAs.

>I was worried about the fall off in performance as well. Over the
>range you test it looked pretty good but I did see the tail off as
>N was increasing.

Doesn't look like it's getting a lot worse as you get bigger, so it's just a
lower plateau, not a continuing slope.
Here's some larger sizes:

TEST  TA  TB     M     N     K  alpha   beta    Time   Mflop  SpUp  PASS
====  ==  ==   ===   ===   ===  =====  =====  ======   =====  ====  ====
   1   N   N  2000  2000  2000    1.0    1.0   14.12  1133.3  1.00   ---
   1   N   N  2000  2000  2000    1.0    1.0   16.46   972.1  0.86   YES
   2   N   N  2200  2200  2200    1.0    1.0   21.60   986.1  1.00   ---
   2   N   N  2200  2200  2200    1.0    1.0   23.47   907.4  0.92   YES
   3   N   N  2400  2400  2400    1.0    1.0   28.80   960.2  1.00   ---
   3   N   N  2400  2400  2400    1.0    1.0   29.94   923.4  0.96   YES
   4   N   N  2600  2600  2600    1.0    1.0   36.58   960.9  1.00   ---
   4   N   N  2600  2600  2600    1.0    1.0   37.82   929.4  0.97   YES
   5   N   N  2800  2800  2800    1.0    1.0   45.43   966.4  1.00   ---
   5   N   N  2800  2800  2800    1.0    1.0   47.46   925.1  0.96   YES
   6   N   N  3000  3000  3000    1.0    1.0   56.99   947.6  1.00   ---
   6   N   N  3000  3000  3000    1.0    1.0   58.23   927.3  0.98   YES

>One silly question for you. Is SSE down in hardware or through
>the CMS in the Efficeon?

I don't know what's where. They have two units, according to the data sheet,
and I've heard that one is specialized to MUL, so I think they get their
parallelism by doing a mul & add at the same time, rather than multiple ops
at a step . . .

Cheers,
Clint
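
For anyone reading the timing table, here is a rough sketch of how the Mflop and SpUp columns appear to relate to the Time column. It assumes the usual 2*M*N*K flop count for a real matrix multiply, and that SpUp is the Mflop rate of the second row of each pair divided by the first; the function names are made up for illustration and are not part of the ATLAS tester.

```python
def gemm_mflops(m, n, k, seconds):
    """Approximate Mflop rate for one GEMM call.

    Assumes 2*m*n*k floating-point operations (one multiply and one
    add per inner-product term), which is an assumption about how the
    timer counts flops, not something stated in the message.
    """
    return (2.0 * m * n * k) / seconds / 1.0e6


def speedup(test_mflops, reference_mflops):
    """Ratio of the tested rate to the reference rate (hypothetical helper)."""
    return test_mflops / reference_mflops


if __name__ == "__main__":
    # First pair of rows from the table: N=2000, 14.12 s and 16.46 s.
    ref = gemm_mflops(2000, 2000, 2000, 14.12)
    tst = gemm_mflops(2000, 2000, 2000, 16.46)
    print(f"reference: {ref:.1f} Mflop, test: {tst:.1f} Mflop, "
          f"SpUp: {speedup(tst, ref):.2f}")
```

The computed rates land within a fraction of a Mflop of the table values; the small differences are presumably because the Time column is rounded to two decimals.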