[perfmon2] Nehalem LATENCY_ABOVE_THRESHOLD
From: Zoltán M. <zol...@in...> - 2010-07-26 14:51:26
Hi Stephane,

I tried the pebs-ll sampling module of pfmon on an Intel Nehalem, and the results are not what I expected. I sampled some programs using the following command:

    pfmon --smpl-module=pebs-ll -e MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD --ld-lat-threshold=64 --long-smpl-periods=500 --short-smpl-periods=250 --pebs-ll-dcmiss-reasons --with-header <my_program>

I expected that with the threshold set to 64, mostly L3 misses would be sampled (reasons 0xA to 0xD). However, only around 10% of the samples I obtained list an L3 miss as the data source of the sampled instruction. The remaining 90% are mostly attributed to reasons 0x1 to 0x4 (L1 hits, pending L2 hits, L2 hits and L3 hits). These accesses should normally have a latency below 64 cycles, so the threshold I set should have filtered them out.

If I raise the threshold to 128 and decrease the sampling interval, the share of L3 misses increases to around 30%, but on-chip cache hits still don't disappear.

Do you have any idea why this anomaly occurs?

Many thanks.
Best regards,
Zoltan
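
For anyone who wants to cross-check percentages like the ones above, here is a minimal sketch of the tally. It assumes the samples have already been parsed into (data source code, latency in cycles) pairs. The parsing step is not shown and the function names are hypothetical, not part of pfmon. The grouping of reason codes follows the description in this message and is not taken from the Intel manual.

```python
from collections import Counter

# Reason-code grouping as described in the message above (an assumption,
# not an authoritative decoding of the Nehalem data source field).
ON_CHIP_HIT = range(0x1, 0x5)   # 0x1..0x4
L3_MISS = range(0xA, 0xE)       # 0xA..0xD


def classify(source):
    """Map a data source code to a coarse category."""
    if source in ON_CHIP_HIT:
        return "on-chip hit"
    if source in L3_MISS:
        return "L3 miss"
    return "other"


def breakdown(samples, threshold):
    """Return (percentage per category, number of samples below threshold).

    samples: iterable of (data_source_code, latency_cycles) pairs.
    """
    samples = list(samples)
    counts = Counter(classify(src) for src, _ in samples)
    total = len(samples)
    shares = {k: 100.0 * v / total for k, v in counts.items()} if total else {}
    # Samples whose recorded latency is under the threshold would be the
    # truly unexpected ones for LATENCY_ABOVE_THRESHOLD.
    below = sum(1 for _, lat in samples if lat < threshold)
    return shares, below
```

Comparing the recorded latency of each on-chip hit against the threshold shows whether those samples really took longer than 64 cycles or whether the filter itself let them through.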