From: Harry M. <hj...@ta...> - 2005-11-23 20:29:04
|
Apologies for the somewhat OT post, but failing to be able to run perfsuite, I've had to fall back for now on oprofile. This may be of interest here as well. Advice and RTFM's welcome as long as they come with URLs. In examining the oprofile output for several related programs, I was struck by how much time as being wasted in 'poll_idle'. Because these apps typically consumed very large (1GB) files, I originally thought that this was all due to disk i/o, but now I'm not so sure. In trying to grasp how different programs behaved in terms of I/O and how that related to 'poll_idle', I ran a few other apps. Below are the oprofile outputs for 3 of them, one is for a compilation of a fairly large lib, 2 others are bioinformatics apps with which I'm familiar. Especially notable is the oprofile output for clustalw. It ingested and egested relatively tiny files, ran for about 8 minutes, but oprofile suggests that 50% of the time, the CPU was idle. This cannot be disk I/O (the system was essentially idle except for this test), so can memory operations account for this much idle time (cache misses, moving data to & from main mem, page faults, etc)? And if so, how would I go about quantifying this? Is this the point where I have to dive into tau? 1) compilation of the hdf5 library (which I needed to do anyway). CPU: AMD64 processors, speed 1990.96 MHz (estimated) Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000 samples % app name symbol name 1495583 30.8198 cc1 (no symbols) 870380 17.9361 vmlinux poll_idle 486332 10.0219 libc-2.3.2.so (no symbols) 241578 4.9782 bash (no symbols) 197392 4.0677 vmlinux unmap_vmas 160895 3.3156 vmlinux copy_page_range 127658 2.6307 vmlinux copy_page 108131 2.2283 ld-2.3.2.so (no symbols) 101516 2.0920 as (no symbols) 99807 2.0567 vmlinux clear_page 49755 1.0253 libbfd-2.15.so (no symbols) 48223 0.9937 vmlinux page_fault 2) a run of clustalw, a multiple alignment app from bioinformatics which takes a small amount of data (227KB) and performs a large amount of integer computation on it, and then emits a relatively small amount of data (3 files totalling 180KB). The program is fairly chatty tho (lots of screen output, redirected to a file in this case, but the total is included in the 180K output). samples % app name symbol name 8950726 48.0420 vmlinux poll_idle 7498500 40.2473 clustalw forward_pass 832632 4.4691 clustalw prfscore 382780 2.0545 clustalw pdiff 247423 1.3280 clustalw reverse_pass 99006 0.5314 clustalw diff 77909 0.4182 vmlinux timer_interrupt 77141 0.4140 clustalw open_penalty2 73940 0.3969 clustalw calc_prf1 59589 0.3198 clustalw open_penalty1 49189 0.2640 clustalw ext_penalty1 47153 0.2531 clustalw ext_penalty2 3) a run of tacg, another bioinformatics sequence matching app that consumes a larger amount of data (~5MB) and performs a fair amount of integer arithmetic on it and then emits a file of ~117MB) samples % app name symbol name 212135 53.1937 vmlinux poll_idle 111922 28.0649 libc-2.3.2.so (no symbols) 17894 4.4870 tacg hash 7601 1.9060 tacg Anti_Par 6468 1.6219 tacg Translate 3451 0.8654 tacg Cutting 2421 0.6071 tacg Degen_Cmp 2292 0.5747 libm-2.3.2.so (no symbols) 2034 0.5100 tacg HorribleAccounting 1913 0.4797 tacg LinearMap 1847 0.4631 tacg GetSequence2 1638 0.4107 vmlinux timer_interrupt -- Cheers, Harry Harry J Mangalam - 949 856 2847 (vox; email for fax) - hj...@ta... <<plain text preferred>> |