From: SourceForge.net <no...@so...> - 2011-05-26 07:10:30
Bugs item #3069227, was opened at 2010-09-17 12:25
Message generated for change (Comment added) made by ssuthiku
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=116191&aid=3069227&group_id=16191

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: PJ Nee (pjnee)
Assigned to: Suravee Suthikulpanit (ssuthiku)
Summary: time discrepancy using oprofile

Initial Comment:
I'm using oprofile 0.9.3. I've noticed a time discrepancy which I can't understand. I'm running a function that I manually time at approx 50 secs. Yet when I run oprofile and do the maths on it, it works out at 23 secs. CPU speed is being reported as 1203MHz (estimated).

(277796 * 100000)/1203MHz = 23secs.

I'm running on a Core 2 with clock speed of 2.4GHz.

root@pjnee-desktop:/home/pjnee/CRAY/libitpdriver/version_003/Linux_build/Test# opreport -c
CPU: Core 2, speed 1203 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        symbol name
-------------------------------------------------------------------------------
277796   100.000  ai_ExecuteUserDiag
2        0.2002   CNehalemII::GetUserCodeStatus()
277793   99.9917  CNehalemII::CheckInDebugMode(bool*)
12       0.0043   CNehalemII::DebugMode(int)
3        0.0011   CNehalemII::UpdateLogicalCPU()
2        7.2e-04  CNehalemII::TapStatus(int, int*)
2        7.2e-04  CNehalemII::SetEnterExitMode(int, int)
2        7.2e-04  CNehalemII::GetUserCodeStatus() [self]
1        3.6e-04  .plt
1        3.6e-04  __i686.get_pc_thunk.bx
-------------------------------------------------------------------------------
root@pjnee-desktop:/home/pjnee/CRAY/libitpdriver/version_003/Linux_build/Test#

----------------------------------------------------------------------

>Comment By: Suravee Suthikulpanit (ssuthiku)
Date: 2011-05-26 02:10

Message:
There are many factors which could affect the behavior reported here.

- The user used the CPU_CLK_UNHALTED event and tried to correlate the profile result with the wall-clock time. This only makes sense if the target application is CPU-bound.
- If the CPU power state changes at some point during the profile, which normally affects the CPU frequency, this will affect the number of samples collected here.

I don't have an Intel Core 2-based system. However, I experimented on an AMD family10h system using the current OProfile (0.9.6), profiling a CPU-bound target workload (a matrix multiplication benchmark). I can see that the number of samples corresponds to the wall-clock time as long as the two conditions I mentioned above are met.

----------------------------------------------------------------------

Comment By: Maynard Johnson (maynardj)
Date: 2011-01-23 10:20

Message:
Suravee, can you find time to investigate this bugzilla, please? Thanks.

----------------------------------------------------------------------
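
The arithmetic in the initial comment, (277796 * 100000)/1203MHz, can be sketched as a small calculation. This is only an illustration of the reporter's sum, not anything OProfile provides: the function name `unhalted_seconds` and its parameter names are hypothetical, and the 2.4 GHz figure is the clock speed the reporter states for the machine.

```python
def unhalted_seconds(samples: int, sample_count: int, cpu_hz: float) -> float:
    """Estimate seconds spent in unhalted cycles from a sample total.

    Hypothetical helper: each sample is taken to represent
    `sample_count` CPU_CLK_UNHALTED events, all at a fixed `cpu_hz`.
    """
    cycles = samples * sample_count
    return cycles / cpu_hz


SAMPLES = 277796  # total samples in the opreport output from the report
COUNT = 100000    # "count 100000" in the opreport header

# Speed opreport printed ("1203 MHz (estimated)").
at_estimated = unhalted_seconds(SAMPLES, COUNT, 1203e6)
# Clock speed the reporter states for the Core 2 machine.
at_nominal = unhalted_seconds(SAMPLES, COUNT, 2.4e9)

print(f"at 1203 MHz: {at_estimated:.1f} s")
print(f"at 2.4 GHz:  {at_nominal:.1f} s")
```

With the numbers from the report this gives about 23 s at 1203 MHz and about 11.6 s at 2.4 GHz. Both are well under the 50 s measured by hand, which would be consistent with the two conditions raised in the thread: the sum only tracks wall-clock time if the workload is CPU-bound and the CPU frequency stays constant during the profile.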