Menu

#232 opreport: incorrect event info when using named unit mask

None
closed
None
5
2015-05-26
2013-01-02
No

There is a problem in opreport where incorrect event information is printed out if using a named unit mask. Wrong event info is printed whether opcontrol or operf is used to collect the profile. For example, on an Intel Sandybridge, I profiled with uops_retired, using both "0x2" and "total_cycles" for unit mask values. When using '0x2' for the unit mask, the header info printed by opreport is:

---------------------------------
CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
Counted uops_retired events (uops that actually retired.) with a unit mask of 0x02
(retire_slots number of retirement slots used non PEBS) count 2000000
---------------------------------

When using "total_cycles" for the unit mask, the header info printed by opreport is:

---------------------------------
CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
Counted uops_retired events (uops that actually retired.) with a unit mask of 0x00 (multiple flags) count 2000000
Note the incorrect unit mask info--^
---------------------------------

This incorrect unit mask info of '0x00' is happening because opreport determines what event(s) was(were) being profiled by inspecting the header file created in the samples directory. It gets the event code and numerical unit mask values and calls op_find_event(). There's currently no way to save a unit mask string in the samples header; if there were, we would then need a new version of op_find-event() that would take a um string instead of a um number.

Besides the downside of opreport not showing correct event information, there's a more serious problem. The samples are stored in sample files named something like:
uops_retired.2000000.0.all.all.all
^-- unit mask 0

So if you were to do the following setup of oprofile:
opcontrol --event=uops_retired:2000000:stall_cycles --event=uops_retired:2000000:total_cycles
and then profile some app, *all* samples end up getting stored in the single sample file mentioned above (uops_retired.2000000.0.all.all.all), and the opreport header indicates that only one event was profiled -- the uops_retired event with unit mask 0.

Discussion

  • Maynard Johnson

    Maynard Johnson - 2013-01-24

    I posted a patch to the oprofile-list today (subject "[PATCH] Fix opreport header info on unit mask when operf is run without a UM specified) that partially addresses this issue.

     
  • Maynard Johnson

    Maynard Johnson - 2013-12-09
    • assigned_to: Carl Love
    • Group: -->
     
  • Maynard Johnson

    Maynard Johnson - 2013-12-09
    • assigned_to: Carl Love --> Suravee Suthikulpanit
     
  • Maynard Johnson

    Maynard Johnson - 2013-12-09

    Suravee,
    Carl Love was going to work on this issue, but has been pulled away to work on a different project for now. Since this bug only affects Intel, it probably makes more sense for you (our x86* sub-maintainer) to handle it anyway. Can you find time to look at this please? Thanks.

     
  • William Cohen

    William Cohen - 2015-05-26

    The patch was commit to the oprofile git repository 2013-01-28. Did a check to verify that the current oprofile keeps those events separate. According to the following output it appears to be working:

    $ uname -a
    Linux comotion 3.10.0-229.4.2.el7.x86_64 #1 SMP Fri Apr 24 15:26:38 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

    $ operf --event=uops_retired:2000000:stall_cycles --event=uops_retired:2000000:total_cycles make
    ...
    $ opreport|more
    CPU: Intel Ivy Bridge microarchitecture, speed 3700 MHz (estimated)
    Counted uops_retired events (Retired uops.) with a unit mask of 0x1c00100 (mult
    iple flags) count 2000000
    Counted uops_retired events (Retired uops.) with a unit mask of 0x10c00100 (mul
    tiple flags) count 2000000
    uops_retired:2...|uops_retired:2...|
    samples| %| samples| %|


    45312 88.0443     75257 86.0296 cc1plus
    uops_retired:2...|uops_retired:2...|
      samples|      %|  samples|      %|
    ------------------------------------
        39087 86.2619     65140 86.5567 cc1plus
         3862  8.5231      6626  8.8045 libc-2.17.so
         2296  5.0671      3385  4.4979 kallsyms
           45  0.0993        57  0.0757 xfs
           15  0.0331        35  0.0465 libgmp.so.10.2.0
            3  0.0066         5  0.0066 kvm
            3  0.0066         8  0.0106 ld-2.17.so
            1  0.0022         0       0 libmpc.so.3.0.0
            0       0         1  0.0013 libata
     2774  5.3901      5741  6.5628 as
    
     
  • William Cohen

    William Cohen - 2015-05-26
    • status: open --> closed
     

Log in to post a comment.