Stephane,

 

I believe the two events I mentioned do have the same event encoding but they have different event indexes (since they occupy different places in the event tables).  I was using the term event code to represent the event table index and it is this value that PAPI uses to work its way through the tables.  Sorry if my choice of a label was misleading.

 

You are correct that both event strings will lead to the same encoding and therefore produce the same results when used to count events.  But the fact that they have different index values means that PAPI must be very careful how it uses these indexes.  In particular, converting an index into an event string and then back into an index does not always give back the index we started with.  Now that I understand this, I think I have a better chance of making the list code in PAPI work correctly.  Now it is just a matter of trying another approach.
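
A minimal sketch of that round trip, using a toy table rather than libpfm4's real ones.  The table contents, the alias map and the helper names (index_to_canonical_name, name_to_index, round_trip) are assumptions made purely for illustration; they are not PAPI or libpfm4 functions.

```python
# Toy event table: the position in the list plays the role of the table index.
EVENT_TABLE = [
    "perf::PERF_COUNT_HW_CPU_CYCLES",    # index 0
    "perf::CYCLES",                      # index 1, alias of index 0
    "perf::PERF_COUNT_HW_INSTRUCTIONS",  # index 2
    "perf::INSTRUCTIONS",                # index 3, alias of index 2
]

# Hypothetical alias map: alias name -> canonical name.
ALIASES = {
    "perf::CYCLES": "perf::PERF_COUNT_HW_CPU_CYCLES",
    "perf::INSTRUCTIONS": "perf::PERF_COUNT_HW_INSTRUCTIONS",
}


def index_to_canonical_name(index):
    """Stand-in for encoding an event: returns the canonical name."""
    name = EVENT_TABLE[index]
    return ALIASES.get(name, name)


def name_to_index(name):
    """Stand-in for a name lookup: returns the table index of that name."""
    return EVENT_TABLE.index(name)


def round_trip(index):
    """Index -> canonical name -> index."""
    return name_to_index(index_to_canonical_name(index))


for i, name in enumerate(EVENT_TABLE):
    print(i, name, "->", round_trip(i))
```

In this toy model the alias entries do not map back to themselves, which is the property that any code relying on the round trip has to account for.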

 

Thanks

Gary

 

From: Stephane Eranian [mailto:eranian@googlemail.com]
Sent: Thursday, April 24, 2014 2:53 PM
To: Gary Mohr
Cc: Philip Mucci; Vince Weaver; Heike McCraw; <perfapi-devel@eecs.utk.edu>; perfmon2-devel
Subject: Re: [perfmon2] [Perfapi-devel] FW: Proposed enhancement to libpfm4.

On Thu, Apr 24, 2014 at 10:43 PM, Gary Mohr <Gary.Mohr@bull.com> wrote:

After spending most of yesterday trying to figure out why I was seeing an infinite loop in the list code, Vince’s comments are quite meaningful to me.  This morning I looked at the showevtinfo source in libpfm4.  It has a huge advantage that it can print the event output from within the loops that walk the event tables.

 

In the PAPI environment, there is an enumerate API which can be used to get the first or next event code for a given component.  The caller then passes the event code to a get event info API call to get information about the event which can be printed by the caller.  The problems that Vince discussed and that I am seeing are in the enumerate API calls.  The problem happens when trying to find the next event code to get information about.  The caller tells PAPI the event code of the last one it processed and PAPI is supposed to give back the next event code to do.

 

I think I now understand why the loop I was seeing yesterday happens but I am not sure yet how to best fix it. 

 

When PAPI calls pfm_get_os_event_encoding with an event name of perf::CYCLES and the OS type set to 2 (PFM_OS_PERF_EVENT_EXT), it returns the following fully formatted event name:

 

perf::PERF_COUNT_HW_CPU_CYCLES:u=1:k=1:precise=0:excl=0:mg=0:mh=1

 

I think the event name is different than what was passed in because perf::CYCLES is defined in the event tables as being equivalent to perf::PERF_COUNT_HW_CPU_CYCLES. 

 

Stephane, is this result from libpfm4's pfm_get_os_event_encoding considered correct?

 

It is correct because it can be passed back for encoding and it will succeed.

 

These two events have different event codes so when PAPI does a ntv_name_to_code call with the perf::PERF_COUNT_HW_CPU_CYCLES name, it gets the event code of the previous one that was done.  It then adds 1 and ends up redoing the perf::CYCLES event causing the infinite loop.
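
The loop described above can be reproduced with a small toy model.  Everything here is hypothetical: the table, the CANONICAL map and the next_code_via_name helper only imitate the "convert to a name, look the name up, add 1" sequence and are not the actual PAPI code.  The max_steps guard exists only so that the sketch terminates.

```python
# Toy event table; list position stands in for the event code.
EVENT_TABLE = [
    "perf::PERF_COUNT_HW_CPU_CYCLES",    # code 0
    "perf::CYCLES",                      # code 1, alias of code 0
    "perf::PERF_COUNT_HW_INSTRUCTIONS",  # code 2
]

# Hypothetical alias map: alias name -> canonical name.
CANONICAL = {"perf::CYCLES": "perf::PERF_COUNT_HW_CPU_CYCLES"}


def next_code_via_name(code):
    """Flawed 'next': go through the canonical name, then add 1."""
    name = EVENT_TABLE[code]
    canonical = CANONICAL.get(name, name)
    return EVENT_TABLE.index(canonical) + 1


def enumerate_codes(first, max_steps=8):
    """Walk the table with the flawed 'next'; stop after max_steps."""
    visited = []
    code = first
    for _ in range(max_steps):
        if code >= len(EVENT_TABLE):
            break
        visited.append(code)
        code = next_code_via_name(code)
    return visited


print(enumerate_codes(0, max_steps=5))
```

In this model the walk reaches code 1 and then keeps returning code 1, so code 2 is never visited.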

 

Those two events should have the same event codes, they are aliases. If not, then this is a bug.

 

A possible solution could be to have libpfm4’s encode function always return a fully formatted event string using the event name it was passed.

 

Another, maybe nicer, solution could be to modify PAPI to not even do the encode and ntv_name_to_code calls when listing events.  Since the enumeration API is given the event code last processed, could we just add 1, check that the result represents a valid event (code_to_name), and if not, get the first event code from the next pmu?
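
A rough sketch of that index-only enumeration, under stated assumptions: event codes are modelled as (pmu, index) pairs, the PMUS table is invented, and code_to_name here is a toy validity check rather than the real PAPI routine.  It also assumes the first pmu has at least one event.

```python
# Hypothetical per-pmu event tables; "toy_pmu" and its events are made up.
PMUS = [
    ("perf", ["PERF_COUNT_HW_CPU_CYCLES", "CYCLES", "PERF_COUNT_HW_INSTRUCTIONS"]),
    ("toy_pmu", ["EVENT_A", "EVENT_B"]),
]


def code_to_name(code):
    """Return 'pmu::event' for a valid code, or None if the code is invalid."""
    pmu, idx = code
    if 0 <= pmu < len(PMUS) and 0 <= idx < len(PMUS[pmu][1]):
        return f"{PMUS[pmu][0]}::{PMUS[pmu][1][idx]}"
    return None


def next_code(code):
    """Add 1 to the index; if that is not valid, move to the next pmu."""
    pmu, idx = code
    candidate = (pmu, idx + 1)
    if code_to_name(candidate) is not None:
        return candidate
    for p in range(pmu + 1, len(PMUS)):
        if code_to_name((p, 0)) is not None:
            return (p, 0)
    return None  # no more events


def list_events():
    """Enumerate every event exactly once, aliases included."""
    names = []
    code = (0, 0)
    while code is not None:
        names.append(code_to_name(code))
        code = next_code(code)
    return names


for name in list_events():
    print(name)
```

Because the walk never converts a code to a name and back, an alias such as CYCLES is listed once and the enumeration moves on to the next entry.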

 

PAPI currently does these calls because the code is used both when adding events to event sets and when listing events.  These calls are needed when adding events but seem to be just adding complexity when listing events.  Maybe if the PAPI API calls to add an event to an event set and the one to list events did not share the same code, then the code to do each service would be easier to manage.

 

I will see if I can change the PAPI enumeration API call to not use the functions used by the add event API call.

 

I know this sounds like it should be real simple but yesterday taught me it is anything but simple. 

Gary

 

 

From: Stephane Eranian [mailto:eranian@googlemail.com]
Sent: Thursday, April 24, 2014 6:49 AM
To: Philip Mucci
Cc: Vince Weaver; Gary Mohr; Heike McCraw; <perfapi-devel@eecs.utk.edu>; perfmon2-devel
Subject: Re: [perfmon2] [Perfapi-devel] FW: Proposed enhancement to libpfm4.

 

Vince,

 

On Thu, Apr 24, 2014 at 3:38 PM, Philip Mucci <mucci@icl.utk.edu> wrote:

Hi Vince,

I'm just as guilty, as I did a lot of the pfm work in the past. 8-)

As for the Linux portion, perf is already Linux only, plus configure or the compiler sets the LINUX preprocessing define.

So that should be fine; the issue now would be to fix the pfm enumeration. Vince, do you have any specifics on the issues? I never saw this when I did the first pass at enumeration way back in the day.

Just like Phil, I am not sure I understand the enumeration problem you're alluding to.

Look at showevtinfo.c in libpfm4. It lists the events and does not end up in an infinite loop when there are aliases.

 

Thanks


Apologies for brevity and errors as this was sent from my mobile device.

> On Apr 23, 2014, at 22:55, Vince Weaver <vincent.weaver@maine.edu> wrote:
>
>> On Wed, 23 Apr 2014, Gary Mohr wrote:
>>
>> I have the perf_events component restructured to eliminate the call to
>> pfm_find_event (plus several other libpfm4 calls) and it looks like it
>> is working when adding events to an event set.  I figured out how to get
>> the fully qualified event string back from the encode function and now
>> use that to fill in the information needed by the papi event table.
>> There is a lot less code executed now when doing an add event and it is
>> much easier to follow what is happening.  This new code supports the
>> cpu=x mask (I get the value back from the encode function but still have
>> not made papi changes to use it).
>
> Believe it or not the initial PAPI libpfm4 implementation was like this,
> but I had to abandon it because it just would not work with enumeration.
>
> The problem is libpfm4 has aliased events.  So if you convert to the
> canonical name for an event, it might map to another event name.  So when
> you try to enumerate "next" it takes you to next for the alias not the
> event you were on.  Best case you just get like 9 events like you're
> seeing, worst case you get stuck in infinite loops.
>
> As for why PAPI was using the raw CPU OS type rather than extended,
> that's because PAPI is in theory cross platform.  The "extended
> perf_event" umasks are nice if you're running on Linux, but they're
> not available on other platforms.  I guess we could declare that PAPI6 is
> Linux/perf_event only, but until then we have to support the traditional
> ways of specifying user/kernel and CPU number even if libpfm4 isn't
> involved.
>
> Vince