From: Abdel-Hameed Abdel-S. B. <sha...@gm...> - 2007-10-29 00:54:53
This is in reply to both Josef's and Nicholas's emails.

I think that if the metric you are showing is L2 misses per reference or
per instruction, then it should be the standardized "MPKI", or misses per
kilo-instruction. Naming this the L2 miss rate is misleading at first,
since it calls one thing by another thing's name. You can easily report
both, since only the division differs.

I came across valgrind as a cache profiler through the microarchitecture
webpage, and I am sure anyone with an architecture background would at
first be misled by the numbers, although it is easy to figure out that the
reported miss rate reflects neither the misses nor the refs of both the L1
and the L2.

Thanks.
--Hameed.

On 10/26/07, Josef Weidendorfer <Jos...@gm...> wrote:
>
> On Friday 26 October 2007, Abdel-Hameed Abdel-Salam Badawy wrote:
> > ==18315== I refs:        561,873
> > ==18315== I1 misses:     3,089
> > ==18315== L2i misses:    1,497
> > ==18315== I1 miss rate:  0.54%
> > ==18315== L2i miss rate: 0.26%
> > ==18315== D refs:        295,619 (209,070 rd + 86,549 wr)
> > ==18315== D1 misses:     4,752 ( 3,948 rd + 804 wr)
> > ==18315== L2d misses:    2,584 ( 1,960 rd + 624 wr)
> > ==18315== D1 miss rate:  1.6% ( 1.8% + 0.9% )
> > ==18315== L2d miss rate: 0.8% ( 0.9% + 0.7% )
> > ==18315== L2 refs:       7,841 ( 7,037 rd + 804 wr)
> > ==18315== L2 misses:     4,081 ( 3,457 rd + 624 wr)
> > ==18315== L2 miss rate:  0.4% ( 0.4% + 0.7% )
> > ...
> > So, L2 miss rate should be 52% in this case.
> > L2d miss rate should be also 54.4% assuming only we look at the data
> > misses and refs only.
> >
> > Why in the world, these are not the numbers reported by the valgrind?
>
> Hi,
>
> I think your point is valid; "miss rate" is "misses/refs".
>
> However, I think we should keep above numbers, but could add the real
> "L2 miss rate" in addition. The current output actually talks about
> the efficiency of combined cache levels.
> So we should write e.g. "I1+L2 miss rate" instead of "L2i miss rate"
> (the current output is kind of bogus as our cache model does not have
> a separate L2 instruction cache).
>
> The reason for using the miss rate of combined cache levels at all
> is because the value is also to be used in sorted lists of functions and
> source annotation to pinpoint at code with bad cache efficiency, and
> for this to be useful, you really need combined figures.
>
> Nick: If we change this (and IMHO we should), this should be done
> exactly the same for cachegrind and callgrind.
>
> Josef
>
-- Hameed.