From: stephane e. <er...@go...> - 2008-11-17 15:02:25
Hello,

I have not seen the beginning of this discussion, so my comments may be slightly off.

It seems Eric has problems with the accuracy of instruction addresses when sampling with the PMU. This is an inherent limitation of the PMU. It can be mitigated but not completely eliminated. The core issue is that it takes several cycles between the moment a counter overflows and the posting of the PMU interrupt. During that time, the CPU keeps executing instructions. The interrupt IP you get reflects where you were when the interrupt triggered. That can be far away from where it was posted and from where the counter actually overflowed. Of course, if you are stalled, that distance is usually 0 or a small number of instructions. But it can be very large when the overflow happens during a kernel critical section where interrupts are off. There is nothing SW can do about any of this.

Andi mentioned PEBS. I don't know if you are familiar with what it does, so let me summarize. This is a hardware/microcode feature which implements a hardware-managed buffer where samples are stored. The OS points the CPU to a memory region where PMU samples are stored. No PMU interrupt is generated until the buffer becomes full. That part addresses some of the overhead associated with interrupt-based sampling. Unfortunately, PEBS does not point to the instruction where the counter overflowed; it will still be a few instructions off. But this time, you get the machine state at the last retired instruction. Furthermore, PEBS can record samples while in kernel critical sections. A limitation of PEBS is that it does not work with all PMU events. Only a handful are available.

As for perfmon, if you pull from the perfmon2 GIT tree, this should work. I don't know what happened in your case.

Perfmon and the pfmon tool can do simple counting or collect profiles.

$ pfmon date
Counts cycles at the user level only, for the process date.

$ pfmon --system-wide -t10
Counts elapsed cycles at user level on all CPUs for 10s.
Results are per-CPU.

$ pfmon --long-smpl-periods=240000 date
Collects a flat profile of the process date. The period is 240,000 elapsed user cycles.

$ pfmon --system-wide --long-smpl-periods=240000 -t 10
Collects a flat profile on each online CPU for 10s. The period is 240,000 elapsed user cycles. Results are per-CPU.

There are many more examples on the perfmon web site, in the documentation and the pfmon users' guide.

Perfmon/pfmon can use PEBS on Intel Core processors. The first step is to insert the kernel module for it:

# modprobe perfmon_pebs_core_smpl

Then use pfmon. We use instructions_retired because elapsed cycles does not support PEBS:

$ pfmon --smpl-module=pebs -einstructions_retired --long-smpl-periods=120000 date

Hope this helps.

On Sat, Nov 15, 2008 at 7:36 PM, Andi Kleen <an...@fi...> wrote:
> On Sat, Nov 15, 2008 at 05:30:58PM +0100, Eric Dumazet wrote:
>> Andi Kleen wrote:
>> >>>And no, blindly subtracting 16 from IP is not a fix.
>> >>Who mentioned a fix ? I am only giving more fuel to Intel guys so they
>> >>hopefully can give us a working oprofile.
>> >
>> >You would need to implement PEBS support to avoid that problem. But it's a
>> >big task. perfmon2 implements it already.
>> >
>>
>> Thanks for the information.
>>
>> Hum, so I grabbed perfmon2 git tree, installed various tools...
>>
>> I am quite new to pfmon and tried :
>>
>> # pfmon --system-wide
>> sizeof=64 44
>> <press ENTER to stop session>
>>
>> Then started "tbench 8", and got a kernel panic after 6 seconds.
>>
>>
>> I was using oprofile like this
>>
>> opcontrol --vmlinux=/path/vmlinux --start
>> // doing some benchmarking...
>> opreport -l vmlinux | head -n 40
>>
>>
>> What would be a working equivalent for perfmon2 based tools ?
>
> Probably getting a perfmon tree that works. I guess Stephane
> can help (cc'ed). Or just deal with imprecise events for now.
>
> -Andi
> --
> ak...@li...
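
For readers of this thread, the skid described at the top of the message (the gap between the counter overflow and the IP reported by the interrupt) can be illustrated with a toy model. This is only a sketch under stated assumptions, not how the PMU, PEBS, or pfmon work internally: the function name sample_with_skid, the four-instruction loop, its addresses, and the fixed skid value are all hypothetical and chosen purely for illustration. Real skid varies from sample to sample.

```python
from collections import Counter


def sample_with_skid(trace, period, skid):
    """Return the addresses a sampling profiler would report (toy model).

    trace  -- instruction addresses in retirement order
    period -- the counter overflows every `period` retired instructions
    skid   -- instructions retired between overflow and interrupt delivery
              (assumed constant here; on real hardware it varies)
    """
    reported = []
    # The counter overflows on the instructions at index period-1, 2*period-1, ...
    for overflow_index in range(period - 1, len(trace), period):
        delivered_index = overflow_index + skid
        if delivered_index < len(trace):
            # The interrupt IP reflects where execution is at delivery time,
            # not where the counter actually overflowed.
            reported.append(trace[delivered_index])
    return reported


# Hypothetical hot loop of four instructions, executed 1000 times.
loop = [0x10, 0x11, 0x12, 0x13]
trace = loop * 1000

exact = Counter(sample_with_skid(trace, period=4, skid=0))
skewed = Counter(sample_with_skid(trace, period=4, skid=2))

for name, hist in (("skid=0", exact), ("skid=2", skewed)):
    for addr, count in sorted(hist.items()):
        print(f"{name}: {addr:#x} {count}")
```

In this toy trace the counter always overflows on the instruction at 0x13. With skid=0 every sample is attributed to 0x13; with skid=2 every sample is attributed to 0x11, two instructions later, even though nothing about the program changed. That is the kind of misattribution a flat profile shows when the reported IP trails the overflow.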