Thanks Kristian,

I have a few points to draw attention to:

- One counts requests while the other counts line allocations (that is not
  the same, as it is possible for multiple cache misses to the same cache
  line to be serviced by the same single allocation).
 
==That means LAST_LEVEL_CACHE_MISSES (i.e. L2_RQSTS:SELF:I_STATE) should be larger than L2_LINES_IN:SELF. But we see the opposite in kernel+user (-k -u) mode, while the two stay the same in user (-u) mode.

- One seems to exclude prefetches, while the other seems to include them.

==That does not appear to be the case. See for example:

(this shows the equivalence of LAST_LEVEL_CACHE_MISSES and L2_RQSTS:SELF:I_STATE)
pfmon -u -eLAST_LEVEL_CACHE_MISSES,L2_RQSTS:SELF:I_STATE /bin/ls
611 LAST_LEVEL_CACHE_MISSES
611 L2_RQSTS:SELF:I_STATE

(this shows the equivalence of L2_RQSTS:SELF:I_STATE and L2_LINES_IN:SELF in user mode)
pfmon -u -eL2_RQSTS:SELF:I_STATE,L2_LINES_IN:SELF /bin/ls
638 L2_RQSTS:SELF:I_STATE
638 L2_LINES_IN:SELF

(This run and the next try to show the effect on the event of excluding prefetches, counting only prefetches, and including prefetches.)
pfmon -u -eL2_RQSTS:SELF:I_STATE,L2_RQSTS:SELF:I_STATE:PREFETCH /bin/ls
654 L2_RQSTS:SELF:I_STATE
142 L2_RQSTS:SELF:I_STATE:PREFETCH

pfmon -u -eL2_RQSTS:SELF:I_STATE,L2_RQSTS:SELF:I_STATE:ANY /bin/ls
668 L2_RQSTS:SELF:I_STATE
804 L2_RQSTS:SELF:I_STATE:ANY

From the last two runs we can say that LAST_LEVEL_CACHE_MISSES does not include prefetches (which is accepted anyway).

Now for L2_LINES_IN:SELF:
(same as above, to show the effect on the event of excluding prefetches, counting only prefetches, and including prefetches)
pfmon -u -eL2_LINES_IN:SELF,L2_LINES_IN:SELF:PREFETCH /bin/ls
648 L2_LINES_IN:SELF
152 L2_LINES_IN:SELF:PREFETCH

pfmon -u -eL2_LINES_IN:SELF,L2_LINES_IN:SELF:ANY /bin/ls
595 L2_LINES_IN:SELF
749 L2_LINES_IN:SELF:ANY

Here again, it looks like L2_LINES_IN:SELF does not include prefetches.
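
As a rough cross-check of the runs above, the prefetch count implied by (ANY minus default) can be compared with the count measured directly with :PREFETCH. This is only a sketch: the numbers are copied from the four pfmon -u runs above, each pair comes from a separate run of /bin/ls, so only approximate agreement can be expected. The helper name implied_prefetch is just for illustration.

```python
# Counts copied from the pfmon -u runs above; every pair is a separate run,
# so the comparison is approximate by construction.
runs = {
    "L2_RQSTS:SELF:I_STATE": {
        "default_and_prefetch": (654, 142),  # default umask vs :PREFETCH
        "default_and_any": (668, 804),       # default umask vs :ANY
    },
    "L2_LINES_IN:SELF": {
        "default_and_prefetch": (648, 152),
        "default_and_any": (595, 749),
    },
}


def implied_prefetch(default_count, any_count):
    """Prefetch count implied if ANY = demand + prefetch and default = demand."""
    return any_count - default_count


for event, r in runs.items():
    measured = r["default_and_prefetch"][1]
    default_count, any_count = r["default_and_any"]
    implied = implied_prefetch(default_count, any_count)
    print(f"{event}: measured prefetch = {measured}, implied by ANY - default = {implied}")
```

For both events the implied and measured prefetch counts are within a few counts of each other, which is consistent with the default (no PREFETCH/ANY) form of both events counting demand requests only.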


I would again welcome any hints on the difference between the values of LAST_LEVEL_CACHE_MISSES and L2_LINES_IN:SELF in kernel mode. In user mode both give the same values.
This can be checked on any machine based on the Intel Core microarchitecture.

Thanks again,
Regards,
JK

 


From: Kristian Nielsen <knielsen@knielsen-hq.org>
To: J K Rai <jk.anurag@yahoo.com>
Cc: perfmon2 <perfmon2-devel@lists.sourceforge.net>
Sent: Monday, 23 February, 2009 7:43:04 PM
Subject: Re: [perfmon2] Difference in LAST_LEVEL_CACHE_MISSES and L2_LINES_IN:SELF values for kernel +user v/s user mode on xeon x5482

J K Rai <jk.anurag@yahoo.com> writes:

> I get same values for the two events (i.e. LAST_LEVEL_CACHE_MISSES and L2_LINES_IN:SELF) in user mode while the two change when the same are collected for kernel+user mode. E.g.

FWIW, I see similar behaviour. I'm using libperfmon/pfmon 3.6 and kernel
2.6.27.9. CPU is Intel Core 2 Duo.

In general, I would not expect these two events to be exactly the same, as
they count somewhat different things:

- One counts requests while the other counts line allocations (that is not
  the same, as it is possible for multiple cache misses to the same cache
  line to be serviced by the same single allocation).

- One seems to exclude prefetches, while the other seems to include them.

In any case, I agree it is interesting that the difference is so huge for
kernel mode.

I would suggest checking if you see the same behaviour with OProfile (or
another separate tool) to learn more about whether this is something specific
to perfmon or a general behaviour of the CPU. This could also help you to
pinpoint exactly where in the kernel the difference occurs. And maybe use
something a bit heavier than ls, e.g. I tried this:

    $ pfmon -k -u -eLAST_LEVEL_CACHE_MISSES,L2_LINES_IN:SELF perl -MData::Dumper -MSocket -MIO::Handle -MLWP::UserAgent -le 42
    20541 LAST_LEVEL_CACHE_MISSES
    66674 L2_LINES_IN:SELF

Here is some relevant information from showevtinfo and Intel manuals:

-----------------------------------------------------------------------
    Name    : LAST_LEVEL_CACHE_MISSES
    Desc    : count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to event L2_RQSTS:SELF_DEMAND_I_STATE
    Code    : 0x412e
    Counters : [ 0 1 ]


    Name    : L2_RQSTS
    Desc    : L2 cache requests
    Code    : 0x2e
    Counters : [ 0 1 ]
    Umask-00 : 0x0f : [MESI] : Any cacheline access
    Umask-01 : 0x01 : [I_STATE] : Invalid cacheline
    Umask-02 : 0x02 : [S_STATE] : Shared cacheline
    Umask-03 : 0x04 : [E_STATE] : Exclusive cacheline
    Umask-04 : 0x08 : [M_STATE] : Modified cacheline
    Umask-05 : 0x40 : [SELF] : This core
    Umask-06 : 0xc0 : [BOTH_CORES] : Both cores
    Umask-07 : 0x30 : [ANY] : All inclusive
    Umask-08 : 0x10 : [PREFETCH] : Hardware prefetch only

    This event counts all completed L2 cache
    demand requests from this core that miss
    the L2 cache. This includes L1 data cache
    reads, writes, and locked accesses, L1 data
    prefetch requests, and instruction fetches.
    This is an architectural performance event.


-----------------------------------------------------------------------
    Name    : L2_LINES_IN
    Desc    : L2 cache misses
    Code    : 0x24
    Counters : [ 0 1 ]
    Umask-00 : 0x40 : [SELF] : This core
    Umask-01 : 0xc0 : [BOTH_CORES] : Both cores
    Umask-02 : 0x30 : [ANY] : All inclusive
    Umask-03 : 0x10 : [PREFETCH] : Hardware prefetch only

    This event counts the number of cache
    lines allocated in the L2 cache. Cache lines
    are allocated in the L2 cache as a result of
    requests from the L1 data and instruction
    caches and the L2 hardware prefetchers
    to cache lines that are missing in the L2
    cache.
    This event can count occurrences for this
    core or both cores. It can also count
    demand requests and L2 hardware
    prefetch requests together or separately.
-----------------------------------------------------------------------
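
The alias relationship can also be read off the numbers in the listings above. A minimal sketch, assuming the Code field packs the unit mask in bits 8-15 and the event select in bits 0-7 (the usual layout of the Intel event select register); the encode helper is hypothetical and not part of pfmon or libpfm:

```python
# Umask values copied from the L2_RQSTS listing above.
UMASK = {
    "I_STATE": 0x01,
    "S_STATE": 0x02,
    "E_STATE": 0x04,
    "M_STATE": 0x08,
    "PREFETCH": 0x10,
    "ANY": 0x30,
    "SELF": 0x40,
    "BOTH_CORES": 0xC0,
}
L2_RQSTS = 0x2E  # event select from the listing above


def encode(event_select, *umask_names):
    """OR the named umask bits together and place them above the event select.

    Assumption: umask in bits 8-15, event select in bits 0-7.
    """
    umask = 0
    for name in umask_names:
        umask |= UMASK[name]
    return (umask << 8) | event_select


print(hex(encode(L2_RQSTS, "SELF", "I_STATE")))              # 0x412e
print(hex(encode(L2_RQSTS, "SELF", "I_STATE", "PREFETCH")))  # 0x512e
print(hex(encode(L2_RQSTS, "SELF", "I_STATE", "ANY")))       # 0x712e
```

Under that assumption, L2_RQSTS:SELF:I_STATE with neither PREFETCH nor ANY set comes out as 0x412e, the same value as the Code listed for LAST_LEVEL_CACHE_MISSES, so the demand-only form appears to be the default.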

>
> pfmon  -u -eLAST_LEVEL_CACHE_MISSES,L2_LINES_IN:SELF /bin/ls
> cpu_2006_mix_lim_random_order_pfmon_t.sh  pair_lim_cpu_2006_pfmon_Phasewise_improved.sh  pair_lim_cpu_2006_pfmon_t.sh
> cpu_2006_pfmon_t.sh                      pair_lim_cpu_2006_pfmon_Phasewise_prefetch.sh  pair_lim_cpu_2006_pfmon_t.sh_old
> 827 LAST_LEVEL_CACHE_MISSES
> 827 L2_LINES_IN:SELF
>
> pfmon -k -u -eLAST_LEVEL_CACHE_MISSES,L2_LINES_IN:SELF /bin/ls
> cpu_2006_mix_lim_random_order_pfmon_t.sh  pair_lim_cpu_2006_pfmon_Phasewise_improved.sh  pair_lim_cpu_2006_pfmon_t.sh
> cpu_2006_pfmon_t.sh                      pair_lim_cpu_2006_pfmon_Phasewise_prefetch.sh  pair_lim_cpu_2006_pfmon_t.sh_old
> 1808 LAST_LEVEL_CACHE_MISSES
> 2267 L2_LINES_IN:SELF
>
> Explanation for this is welcomed.
> I also request people with latest release installed to check it.
> Thanks and regards,
> JK

- Kristian.

