From: Andre R. <ari...@my...> - 2009-08-21 13:00:13
Hi all,

this is an oprofile log of a 30 second run on a 3 GHz Core2Duo with a
100000 sample rate, CPU_CLK_UNHALTED.

-- OProfile Statistics --
Nr. sample dumps: 45
Nr. non-backtrace samples: 163988
Nr. kernel samples: 153674
Nr. lost samples (no kernel/user): 0
Nr. lost kernel samples: 697857
Nr. incomplete code structs: 0
Nr. samples lost due to sample file open failure: 0
Nr. samples lost due to no permanent mapping: 37
Nr. event lost due to buffer overflow: 0
Nr. samples lost due to no mapping: 62
Nr. backtraces skipped due to no file mapping: 122
Nr. samples lost due to no mm: 30

---- Statistics for cpu : 1
Nr. samples lost cpu buffer overflow: 0
Nr. samples received: 81039
Nr. backtrace aborted: 0

---- Statistics for cpu : 0
Nr. samples lost cpu buffer overflow: 0
Nr. samples received: 86801
Nr. backtrace aborted: 0

What keeps me wondering is that "Nr. lost kernel samples" >> "Nr. kernel
samples". Are these numbers within the expected range? Changing buffer
sizes doesn't change those losses.

I also calculated the expected sample count: 3 GHz * 1/100000 * 30
seconds = 900000 samples. That's approximately what you get from "Nr.
lost kernel samples" + "Nr. kernel samples".

How do the "Nr. samples received" statistics of cpu0 and cpu1 relate to
my expected 900000 samples?

Best regards,
Andre
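For reference, Andre's back-of-the-envelope numbers can be replayed with a few lines of plain arithmetic on the figures quoted above (nothing oprofile-specific):

```python
# Expected CPU_CLK_UNHALTED samples for a 30-second run on one
# 3 GHz core, sampling once every 100000 events.
clock_hz = 3_000_000_000
events_per_sample = 100_000
seconds = 30

expected = clock_hz // events_per_sample * seconds
print(expected)  # 900000

# Compare with the dump: kernel samples + lost kernel samples.
observed = 153_674 + 697_857
print(observed)  # 851531
```

The two totals agree to within about 5%, which is what prompts the question of why so many of those samples are counted as "lost".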
From: Robert R. <rob...@am...> - 2009-08-25 15:45:01
On 21.08.09 14:59:45, Andre Richter wrote:
> Hi all,
>
> this is a oprofile log of a 30 second run on 3 GHz Core2Duo with
> 100000 sample rate, CPU_CLK_UNHALTED.
>
> -- OProfile Statistics --

Daemon counters (counted by the daemon):

> Nr. sample dumps: 45
> Nr. non-backtrace samples: 163988
> Nr. kernel samples: 153674
> Nr. lost samples (no kernel/user): 0
> Nr. lost kernel samples: 697857
> Nr. incomplete code structs: 0
> Nr. samples lost due to sample file open failure: 0
> Nr. samples lost due to no permanent mapping: 37

In-kernel counters (these map to /dev/oprofile/stats):

> Nr. event lost due to buffer overflow: 0
> Nr. samples lost due to no mapping: 62
> Nr. backtraces skipped due to no file mapping: 122
> Nr. samples lost due to no mm: 30
>
> ---- Statistics for cpu : 1
> Nr. samples lost cpu buffer overflow: 0
> Nr. samples received: 81039
> Nr. backtrace aborted: 0
>
> ---- Statistics for cpu : 0
> Nr. samples lost cpu buffer overflow: 0
> Nr. samples received: 86801
> Nr. backtrace aborted: 0

I can only talk for the in-kernel counters here:

> What keeps me wondering is that "Nr. lost kernel samples" >> "Nr.
> kernel samples". Are these numbers within the expected ?
> Changing buffer sizes doesn't change those losses.
>
> I also calculated the expected sample count, 3GHz * 1 / 100000 * 30
> seconds = 900000 samples.
> Thats approximately what you get from "Nr. lost kernel samples" + "Nr.
> kernel samples".

The kernel samples ("Nr. samples received") the driver receives are
triggered and counted with each interrupt, per cpu. Samples may get
lost due to cpu buffer overflows. The event buffer then processes
(all cpus' received samples) - (all cpus' lost samples). The buffer may
also lose some events, so the daemon should finally receive

  81039 + 86801 - 62 - 122 - 30 = 167626

events from the oprofile driver (the number of correct events is not
counted in the kernel).

I don't know where the number 697857 comes from; it seems a little bit
high.
Maybe the value is not reset to zero between sessions, or there are
different units (I didn't check), or a bug.

Also, using 2 cpus doubles the expected sample count to 1800000 cycles.
So the sample count of 167626 differs very much from the expected
1800000 cycles. Reasons for this could be that the interrupt handler is
not able to rearm the counters fast enough, or IO waits, or a setup
error.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: rob...@am...
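Robert's accounting of the in-kernel counters can be checked directly against the numbers in the dump (a sketch of the arithmetic only, not oprofile code):

```python
# Per-cpu "Nr. samples received" from the dump, cpu1 + cpu0.
received = 81_039 + 86_801

# In-kernel losses reported by /dev/oprofile/stats.
lost_no_mapping = 62
skipped_no_file_mapping = 122
lost_no_mm = 30

# Events the daemon should finally see from the driver:
to_daemon = received - lost_no_mapping - skipped_no_file_mapping - lost_no_mm
print(to_daemon)       # 167626

# With two CPUs, the single-core expectation of 900000 doubles:
print(2 * 900_000)     # 1800000
```

The order-of-magnitude gap between 167626 and 1800000 is the discrepancy Robert attributes to counter rearm latency, IO waits, or a setup error.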
From: Andre R. <ari...@my...> - 2009-10-01 20:21:28
Hi Robert,

I'm still investigating those lost kernel samples. Today I installed a
new Debian Lenny 5.0 stable system. I tested 2 kernels: 2.6.18.8 under
Xen 3.3.1 (the stock kernel from the Xen repositories), and 2.6.26,
which comes by default with Lenny.

There are two strange behaviors I observed. 2.6.26 doesn't lose a
single kernel sample, whereas 2.6.18 under Xen does. The more
interesting fact is that when I increase the callgraph depth from 0 to
16, there is a huge shift in the received samples as well as in the
lost kernel samples under 2.6.18.

I did some math with my results, but I still can't figure out what's
going on. The buffer overflows seem to be a problem that I can get rid
of by adapting the sample rate / buffer sizes. But is the huge gap in
the sample counts an expected behavior when I increase the callgraph
depth?

4 files are attached; the filenames explain their setup. While
profiling, the Lenny box acted as an iperf server for 1 GBit of
incoming UDP traffic.
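One plausible way to think about the callgraph-depth effect (a hypothetical illustration with an assumed buffer size, not oprofile internals): a sample that carries a backtrace of depth D occupies roughly 1 + D slots in the per-CPU buffer, so a fixed-size buffer fills far sooner at depth 16 than at depth 0:

```python
# buffer_slots is an assumed per-CPU buffer size for illustration,
# not a value taken from any oprofile configuration.
buffer_slots = 65_536

for depth in (0, 16):
    # Approximate samples the buffer can hold before overflowing,
    # assuming one slot per sample plus one per backtrace frame.
    capacity = buffer_slots // (1 + depth)
    print(f"depth {depth:2d}: ~{capacity} samples per buffer fill")
```

Under this rough model, depth 16 cuts the effective buffer capacity by a factor of about 17, which would be consistent with the shift in received versus lost samples.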
From: luludede <den...@gm...> - 2009-11-23 04:59:30
Hi,

I kept getting "Nr. samples lost cpu buffer overflow" when I use
oprofile 0.9.6-rc1 to profile the kernel, although I increased the
"CPU buffer size" to be far greater than (Nr. samples lost cpu buffer
overflow) plus (Nr. samples received).

Is it a bug in oprofile, or did I do something wrong? Can anyone give
me some suggestions? Thanks a lot!

--------------------------------
steps I took for profiling:
# opcontrol --reset
# opcontrol --start
# my benchmark
# opcontrol --dump
# opcontrol --stop
--------------------------------
the profiler setting:
# opcontrol --status
Daemon running: pid 11302
Event 0: GLOBAL_POWER_EVENTS:100000:1:1:1
Separate options: none
vmlinux file: /usr/lib/debug/lib/modules/2.6.29.4-167.fc11.i686.PAEdebug/vmlinux
Image filter: none
Call-graph depth: 10
Buffer size: 80000000
CPU buffer watershed: 40000000
CPU buffer size: 400000000
----------------------------------

-- OProfile Statistics --
Nr. sample dumps: 128
Nr. non-backtrace samples: 3153835
Nr. kernel samples: 3066290
Nr. lost samples (no kernel/user): 0
Nr. lost kernel samples: 0
Nr. incomplete code structs: 0
Nr. samples lost due to sample file open failure: 0
Nr. samples lost due to no permanent mapping: 120
Nr. event lost due to buffer overflow: 0
Nr. samples lost due to no mapping: 83
Nr. backtraces skipped due to no file mapping: 0
Nr. samples lost due to no mm: 330

---- Statistics for cpu : 1
Nr. samples lost cpu buffer overflow: 44322
Nr. samples received: 73275
Nr. backtrace aborted: 0
Nr. samples lost invalid pc: 0

---- Statistics for cpu : 0
Nr. samples lost cpu buffer overflow: 72740
Nr. samples received: 142325
Nr. backtrace aborted: 0
Nr. samples lost invalid pc: 0

--
View this message in context: http://old.nabble.com/Nr.-lost-kernel-samples-tp25079403p26472849.html
Sent from the oprofile-list mailing list archive at Nabble.com.
From: Maynard J. <may...@us...> - 2009-11-23 15:06:27
luludede wrote:
> Hi,

Is there a reason you double-posted this problem? One posting per
problem, please.

> I kept getting "Nr. samples lost cpu buffer overflow" when I use oprofile
> 0.9.6-rc1 to profile the kernel.

I presume you're referring to the message "WARNING! The OProfile kernel
driver reports sample buffer overflows." This message was added in
0.9.5 -- it's not new in 0.9.6.

> Although I increased the "CPU buffer size" to be far greater than the (Nr.
> samples lost cpu buffer overflow) plus (Nr. samples received).
>
> Is it a bug with oprofile or I did something wrong? Can anyone give me some
> suggestion? Thans a lot !

No, it's not a bug. As the rest of the warning message recommends, you
should lower your sampling frequency. You're taking a sample every
100000 GLOBAL_POWER_EVENTS, which is a fairly high rate by itself. But
when you throw in the fact that you're also asking oprofile to collect
callgraph information on each sample, it simply can't keep up.

A bigger buffer can help to a certain extent, but I suggest going back
to your original buffer size (rm /root/.oprofile/daemonrc to remove the
cached profiling values), and then doubling or tripling the number of
events per sample to lower your sampling frequency. If your ratio of
overflows to non-backtrace samples is less than, say, 1 or 2%, you
probably don't need to be concerned about it. Otherwise, you can adjust
the sampling frequency and/or buffer sizes.

NOTE: The buffer watershed value is for the kernel-to-oprofile-daemon
event buffer, not the per-cpu buffers (which are internal to the
oprofile kernel driver). And the value you've specified below is not a
good one. As the man page states, it should be set to between 0.25 and
0.5 of the buffer size.
If you're interested to know more about oprofile's double-buffering
scheme, see
http://oprofile.sourceforge.net/doc/internals/core-structure.html#id4685015

-Maynard
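Applying Maynard's 1-2% rule of thumb to the statistics luludede posted (plain arithmetic on the quoted figures):

```python
# Samples lost to cpu buffer overflow, cpu1 + cpu0.
overflows = 44_322 + 72_740

# "Nr. non-backtrace samples" from the statistics dump.
non_backtrace = 3_153_835

ratio = overflows / non_backtrace
print(f"{ratio:.1%}")  # 3.7% -- above the suggested 1-2% comfort zone
```

Since the overflow ratio here is roughly double the suggested threshold, the advice to lower the sampling frequency (or restore the default buffer sizes and retune) applies.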