## Re: [perfmon2] Multiplexing counters

 Re: [perfmon2] Multiplexing counters From: Leonardo Piga - 2012-04-28 17:40:20
```
Hi,

tr(i) is the value I would get in the "run" field if the counters
were reset after printing the previous sample (i-1).

Likewise, v(i) is the value I would get in the "raw" field if the
counters were reset after printing sample (i-1).

The same applies to te(i), but for the "ena" (enabled) field.

Pv(i), Ptr(i) and Pte(i) are the values the tool actually reports for
these fields, without resetting.

Thus, at the first sample (i=1) we have:
Pv(1)  = v(1)
Ptr(1) = tr(1)
Pte(1) = te(1)

At the second sample (i=2), since the counters are not reset, the
values accumulate:
Pv(2)  = v(1) + v(2)
Ptr(2) = tr(1) + tr(2)
Pte(2) = te(1) + te(2)

Here tr(2) is the value we would get in the "run" field if we had
reset the counter after printing sample 1. The same applies to v(2)
and te(2).

At the third sample (i=3):
Pv(3)  = v(1) + v(2) + v(3)
Ptr(3) = tr(1) + tr(2) + tr(3)
Pte(3) = te(1) + te(2) + te(3)

And so on for higher values of i.

We are interested in v(i), tr(i) and te(i). These per-interval
numbers, not the cumulative Pv(i), Ptr(i) and Pte(i) that the tool
currently uses, should be used to scale and estimate the actual value
of the counter at sample i.

Is it clearer now?

Leonardo

On Sat, Apr 28, 2012 at 2:12 PM, stephane eranian wrote:
> Something is not quite clear to me based on your paper.
> What do you mean by tr(i), tr(k)? Is that the value of time_enabled,
> time_running at sample t? If so, then I don't understand the sum.
> time_enabled, time_running represent total time since origin of
> measurement. At time t, you get t, at time t+1, you get t+1. But
> maybe I am not reading this right.
>
>> _______________________________________________
>> perfmon2-devel mailing list
>> perfmon2-devel@...
>> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
```
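
The per-interval scaling described in this message can be sketched as follows. This is a minimal illustration in Python, not the syst.c patch itself (the patch is not shown in the thread); the function name `interval_estimate` and the tuple layout are hypothetical. It takes two consecutive cumulative readings of (raw, ena, run), differences them to recover v(i), te(i) and tr(i), and scales the interval count by te(i)/tr(i).

```python
def interval_estimate(prev, cur):
    """Estimate the event count for one sampling interval.

    prev and cur are cumulative (raw, ena, run) readings, i.e.
    (Pv, Pte, Ptr) in the notation of the message above.
    Hypothetical helper for illustration; not part of libpfm.
    """
    v = cur[0] - prev[0]    # v(i)  = Pv(i)  - Pv(i-1)
    te = cur[1] - prev[1]   # te(i) = Pte(i) - Pte(i-1)
    tr = cur[2] - prev[2]   # tr(i) = Ptr(i) - Ptr(i-1)
    if tr == 0:
        # The event was never scheduled in this interval: nothing to scale.
        return 0
    return v * te // tr


# Cumulative readings taken from the two samples posted in this thread.
sample_n = (62161538, 1000618391, 471368743)
sample_n1 = (63447168, 2002214687, 1072611170)

estimate = interval_estimate(sample_n, sample_n1)
```

For the two posted samples the interval estimate comes out at roughly 2.1 million cycles. Because raw, ena and run each only grow, the per-interval differences are non-negative, so an estimate built this way cannot go negative.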

 [perfmon2] Multiplexing counters From: Leonardo Piga - 2012-04-27 03:52:14
```
Hello,

I am using libpfm 4.2 to measure about 15 performance counters on an
AMD Barcelona CPU. I am able to collect 5 performance counters without
multiplexing.

However, when multiplexing I am getting "negative" delta values using
the syst tool from perf_examples.

Here is an example:

Sample n
"core" : 0,
       "name" : "perf::PERF_COUNT_HW_CPU_CYCLES",
       "val"  : 131956093,
       "raw"  : 62161538,
       "ena"  : 1000618391,
       "run"  : 471368743,
       "ratio"  : 0.47,
       "delta"  : 131956093

Sample n+1
"core" : 0,
       "name" : "perf::PERF_COUNT_HW_CPU_CYCLES",
       "val"  : 118435137,
       "raw"  : 63447168,
       "ena"  : 2002214687,
       "run"  : 1072611170,
       "ratio"  : 0.54,
       "delta"  : 18446744073696030660

As you can see, Sample_n$val is greater than Sample_n+1$val, which is
why delta is so big (negative, actually). The raw value is growing,
though.

So I think that I am not doing the measurements in the best way.

My questions are:

1) How does libpfm do the multiplexing?

2) What is the best way to multiplex 15 performance counters in a time
window of 1 second? Is there any example available with the library to
do this?

3) What is the purpose of a group (the -g option of the syst tool)? Can
it help with my issue?

--

Leonardo
```
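
The numbers in this report can be reproduced with a few lines of arithmetic. The sketch below assumes that the tool scales the raw count as raw * ena / run and stores the delta in an unsigned 64-bit integer; both are inferred from the posted output rather than taken from the syst.c source, and the helper names are made up for illustration.

```python
U64_MASK = (1 << 64) - 1


def scaled(raw, ena, run):
    """Scale a raw count by the enabled/running time ratio (assumed formula)."""
    return raw * ena // run if run else 0


def delta_u64(cur, prev):
    """Subtract as a C uint64_t would: a negative result wraps modulo 2**64."""
    return (cur - prev) & U64_MASK


# Recompute "val" for both samples from their raw/ena/run fields.
val_n = scaled(62161538, 1000618391, 471368743)
val_n1 = scaled(63447168, 2002214687, 1072611170)

# Difference of the "val" fields exactly as printed in the report.
wrapped = delta_u64(118435137, 131956093)
```

With the printed values, `wrapped` is 18446744073696030660, which equals 2**64 - 13520956. The "delta" in sample n+1 is therefore a negative difference of 13520956 stored in an unsigned variable, and the recomputed `val_n` and `val_n1` land within rounding of the reported "val" fields.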
 Re: [perfmon2] Multiplexing counters From: Leonardo Piga - 2012-04-28 09:00:49
```
I figured out what the problem is.

I am not sure whether to consider it a bug in syst.c or whether I was
just making a wrong assumption about its output.

Anyway, I wrote a document explaining the problem and possible
solutions.

If it is actually a bug, I can send my patch, which reports the scaled
value for the last sample interval instead of the value for the whole
measurement.

The document can be found here:

http://lampiao.lsc.ic.unicamp.br/~piga/misc/libpfmSystInformation.pdf
```
 Re: [perfmon2] Multiplexing counters From: stephane eranian - 2012-04-28 17:12:32
```
Hi,

On Sat, Apr 28, 2012 at 11:00 AM, Leonardo Piga wrote:
> I figured out what the problem is.
>
> I am not sure whether to consider it a bug in syst.c or whether I was
> just making a wrong assumption about its output.
>
> Anyway, I wrote a document explaining the problem and possible
> solutions.
>
Something is not quite clear to me based on your paper.
What do you mean by tr(i), tr(k)? Is that the value of time_enabled,
time_running at sample t? If so, then I don't understand the sum.
time_enabled and time_running represent total time since the origin of
the measurement. At time t, you get t; at time t+1, you get t+1. But
maybe I am not reading this right.

> If it is actually a bug, I can send my patch, which reports the scaled
> value for the last sample interval instead of the value for the whole
> measurement.
>
> The document can be found here:
>
> http://lampiao.lsc.ic.unicamp.br/~piga/misc/libpfmSystInformation.pdf
```
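
The point about cumulative times can be made precise. Writing Pv, Pte and Ptr for the cumulative raw count, enabled time and running time (the notation used elsewhere in this thread), and assuming the tool scales the raw count by enabled time over running time, the cumulative estimate E(i) and the difference between consecutive samples are:

```latex
\begin{align*}
E(i) &= Pv(i)\,\frac{Pte(i)}{Ptr(i)}, \\
E(i) - E(i-1) &= Pv(i)\,\frac{Pte(i)}{Ptr(i)} - Pv(i-1)\,\frac{Pte(i-1)}{Ptr(i-1)}.
\end{align*}
```

Although Pv, Pte and Ptr each grow monotonically, E(i) need not, because the ratio Pte(i)/Ptr(i) can shrink faster than Pv(i) grows. In the posted samples the raw count grows by about 2% while ena/run falls from about 2.12 to about 1.87, so the product decreases and the difference is negative.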
 Re: [perfmon2] Multiplexing counters From: stephane eranian - 2012-05-10 09:33:52
```
Hi,

Please send me your patch and I'll look at it next week.

Thanks.
```
 Re: [perfmon2] Multiplexing counters From: Drongowski, Paul - 2012-04-30 13:29:38
```
Hello Leonardo --

Barcelona (Family 10h) has only four physical performance counters.
Thus, some other strategy such as multiplexing is required when
collecting 5 performance events during a run on Barcelona.

As an aside, for people using Family 15h: it has six core performance
counters and four Northbridge performance counters. Even though there
are six core counters, event assignment is restricted on Family 15h.
Thus, you will not always be able to measure six events in a single run
without multiplexing. The Family 15h BKDG has the details about
event-to-counter assignment and restrictions.

-- pj
```
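
As a rough illustration of why multiplexing is unavoidable with four counters, the sketch below estimates the fraction of time each event can be counted when more events are requested than counters exist. It assumes an idealised round-robin rotation with equal time slices and no event-to-counter constraints, which is a simplification of what the kernel actually does; the function name is hypothetical.

```python
def ideal_run_fraction(n_events, n_counters):
    """Fraction of enabled time each event is actually counting,
    under ideal round-robin multiplexing (an assumption, see above)."""
    if n_events <= 0 or n_counters <= 0:
        raise ValueError("event and counter counts must be positive")
    return min(1.0, n_counters / n_events)


# Barcelona (Family 10h): four counters, about 15 requested events.
fraction = ideal_run_fraction(15, 4)
```

Under these assumptions each event runs a little over a quarter of the time, so every reported value is an extrapolation from a partial observation. The "ratio" fields posted in the thread (0.47 and 0.54) are higher than this, so the schedule in that run evidently differed from the idealised model.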