perfctr-devel Mailing List for Linux Performance Counters Driver (Page 60)
Brought to you by:
mikpe
From: Mikael P. <mi...@cs...> - 2001-06-17 17:35:57

I've made a perfctr-1.6-update-2 release of my Linux/x86 performance-monitoring counters driver, for the benefit of PAPI users stuck with this version. It's available at the usual place: <http://www.csd.uu.se/~mikpe/linux/perfctr/>.

Version 1.6-update-2, 2001-06-17
- Backported the CONFIG_PERFCTR_DEBUG configuration option from perfctr-2.0-pre5; this is intended to help debug a problem at one particular site.
- New patches for kernels 2.4.3, 2.4.4, and 2.4.5.
- Added information about the perfctr-devel mailing list to README.

/ Mikael Pettersson
From: Mikael P. <mi...@cs...> - 2001-06-11 00:05:48

perfctr-2.0-pre5 is now available at the usual place: <http://www.csd.uu.se/~mikpe/linux/perfctr/>.

The reason I'm not calling this "2.0 final" is that I have two problem reports which I wanted to try to resolve first:
- One user has observed extreme variations in final counts when using perfex to monitor an application on an SMP box; the same application has deterministic counts on a UP box. (perfctr-2.0-pre4, 2.4 kernel, 4-way Xeon SMP, Xeon UP)
- Another user has reported kernel crashes when using perfctr together with "pvfs" and Myrinet "gm" drivers. Perfctr works fine if run without pvfs/gm, and pvfs/gm work fine if run without perfctr. It looks as if pvfs/gm clobbers the current process' "perfctr" pointer, although we haven't yet been able to confirm this or explain why it happens. (perfctr-1.6-update-1, 2.2 kernel, cluster of P6 UP boxes)

2.0-pre5 has a CONFIG_PERFCTR_DEBUG option to enable some internal consistency checks which I hope will help debug the above-mentioned problems.

I will make a perfctr-1.6-update-2 release later this week with the debug option added and new patches for recent 2.4 kernels.

Version 2.0-pre5, 2001-05-11
- Structure layout changes to reduce sampling overheads. The ABI changed slightly, but I hope this is the last such change for some time.
- Fixed two bugs related to the interaction of interrupt-mode perfctrs and the lazy EVNTSEL MSR update cache in the low-level driver. (Interrupt-mode support is still disabled in the high-level drivers, however.)
- Fixed a bug in examples/perfex where it forgot to initialise the pmc_map[] control field. This caused the driver to refuse attempts to use more than one counter. The current fix is for P6/K7 only; a general "fixup" procedure will be added to the user-space library later.
- Added a CONFIG_PERFCTR_DEBUG option to enable some internal consistency checking in the driver. This is a temporary measure intended to help debug two open problem reports.

/ Mikael Pettersson
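Editor's note: the changelog above mentions a lazy EVNTSEL MSR update cache without describing it. The following is only a minimal sketch of the general idea (skip a register write when the cached value already matches), not the driver's code. All names here (`fake_wrmsr`, `evntsel_cache`, `lazy_write_evntsel`, `invalidate_evntsel`, `NR_EVNTSELS`) are hypothetical, and the MSR write is simulated so the sketch can run in user space.

```c
#include <assert.h>
#include <stdint.h>

#define NR_EVNTSELS 4  /* hypothetical number of event-select registers */

/* Stand-in for the privileged WRMSR instruction: it only records the
   value and counts calls, so the sketch needs no kernel privileges. */
static uint32_t fake_msr[NR_EVNTSELS];
static unsigned msr_writes;

static void fake_wrmsr(unsigned idx, uint32_t value)
{
    fake_msr[idx] = value;
    msr_writes++;
}

/* Cache of the values last written to each EVNTSEL register. */
struct evntsel_cache {
    uint32_t value[NR_EVNTSELS];
    unsigned char valid[NR_EVNTSELS];
};

/* Write an EVNTSEL only if the cache says the register holds
   something else (or holds an unknown value). */
static void lazy_write_evntsel(struct evntsel_cache *c, unsigned idx,
                               uint32_t value)
{
    assert(idx < NR_EVNTSELS);
    if (c->valid[idx] && c->value[idx] == value)
        return;                 /* register already holds this value */
    fake_wrmsr(idx, value);
    c->value[idx] = value;
    c->valid[idx] = 1;
}

/* Forget a cached value, for use when the register may have been
   changed without going through the cache. */
static void invalidate_evntsel(struct evntsel_cache *c, unsigned idx)
{
    assert(idx < NR_EVNTSELS);
    c->valid[idx] = 0;
}
```

In a cache of this kind, any path that changes the register without updating or invalidating the cache leaves the two out of step. Whether that is what the two fixed bugs looked like is not stated in the message.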
From: Philip J. M. <mu...@cs...> - 2001-05-01 05:49:55

Hi Mikael,

I'm all in favor of making the TSC sampling an option. I'm assuming by 'sampling' you mean, reading the counter into the page shared by the kernel and the user process. I think it costs around 23 or 24 cycles for rdtsc().

My guess is that anyone doing performance measurements will want to get access to the 'actual' process time held by these counters. So if you access the device, it's fair game to start sampling the counters.

My 2.3 cents.

-Phil
From: Mikael P. <mi...@cs...> - 2001-05-01 00:11:52

perfctr-2.0-pre4 is now available at the usual place: http://www.csd.uu.se/~mikpe/linux/perfctr/.

The new API is in place and seems to work fine. The split of the old "status" field in three (tsc_on, nrctrs, and nrictrs) annoys me and I may change this particular detail slightly. Apart from this and the non-functioning signal-on-overflow facility, I consider this to be fairly close to perfctr-2.0 final.

A question: should TSC sampling be mandatory? How often do people monitor the performance counters while ignoring the TSC? Sampling the TSC costs cycles, but not sampling it also has costs (it forces my library to fall back to the system call when sampling).

Version 2.0-pre4, 2001-04-30
- Some module usage accounting changes which should make automatic module loading and unloading more robust in 2.2 kernels.
- Internal cleanups and a few minor bug fixes.
- Some API naming changes, and O_CREAT can now be used to control whether opening /proc/self/perfctr should create and attach a vperfctr or not.
- The user-space library has been updated for the new API. pmc_map[] is used to map from "virtual counter i" to an actual PMC index to be used by RDPMC -- the VIA Cyrix III / C3 is now able to sample in user-space even though it has no PMC(0). The layout of pmc_map[] is CPU-specific; see x86.c for details. Since TSC sampling is specified explicitly now, perfctr_cpu_nrctrs() has been changed to return the number of performance counters _excluding_ the TSC.
- The example programs have been updated for the new API, with the exception of signal.c which is still non-functional.
- The perfex.c example works better now that the API has a consistent one-evntsel-per-counter model even for Intel P5-like CPUs.
- The global.c example has been fixed to not cause a division by zero on WinChip CPUs lacking a working TSC.

/ Mikael Pettersson
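Editor's note: the pmc_map[] indirection described in the changelog above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: the hardware counters are simulated by an array, `fake_rdpmc` stands in for the RDPMC instruction, and `sample_counters` and `NR_PHYS_PMCS` are hypothetical names. The real layout of pmc_map[] is CPU-specific, as the message says.

```c
#include <assert.h>

#define NR_PHYS_PMCS 4   /* hypothetical size of the simulated PMC bank */

/* Simulated hardware counters; a real reader would execute RDPMC. */
static unsigned long long fake_pmc[NR_PHYS_PMCS];

static unsigned long long fake_rdpmc(unsigned phys_index)
{
    assert(phys_index < NR_PHYS_PMCS);
    return fake_pmc[phys_index];
}

/* Read virtual counters 0..nrctrs-1. Virtual counter i is read from
   the physical PMC named by pmc_map[i], so virtual counter 0 does not
   have to be physical PMC 0. */
static void sample_counters(const unsigned *pmc_map, unsigned nrctrs,
                            unsigned long long *out)
{
    for (unsigned i = 0; i < nrctrs; i++)
        out[i] = fake_rdpmc(pmc_map[i]);
}
```

For a CPU without a usable PMC(0), a one-entry map such as `{ 1 }` would make virtual counter 0 read physical PMC 1. The index 1 is an assumption chosen for illustration.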
From: Mikael P. <mi...@cs...> - 2001-04-17 00:29:30

perfctr-2.0-pre3 is now available at the usual place: http://www.csd.uu.se/~mikpe/linux/perfctr/.

This is a snapshot with a preliminary implementation of the new API in place. The user-space stuff has not been updated and will NOT work.

Version 2.0-pre3, 2001-04-17
- Preliminary implementation of the new data structures and API is in place. The user-space components have not yet been updated. Interrupt-mode virtual perfctrs have been disabled pending completion of necessary CPU driver support.
- Now uses "VIA_C3" as the family name for both the VIA C3 and the slightly older VIA Cyrix III processors. "VIA_CYRIX_III" was just too clumsy and confusing. (It's not a Cyrix at all.)
- Fixed etc/perfctr-events.tab to make Cyrix' event codes agree with reality rather than with the Cyrix manuals. The manuals ignore the fact that the 7-bit event codes are stored in two distinct bit fields in the CESR.

/ Mikael Pettersson
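Editor's note: the last changelog item says a 7-bit event code is stored in two distinct bit fields of the CESR. The sketch below shows how such a split encoding works in general. The bit positions (low six bits at bits 0..5, seventh bit at bit 10) are an assumption made for illustration and are not taken from the message; `pack_event`, `unpack_event` and the macros are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define EV_LOW_MASK   0x3Fu  /* low six bits of the event code */
#define EV_HIGH_SHIFT 10     /* ASSUMED position of the seventh bit */

/* Spread a 7-bit event code over the two fields. */
static uint32_t pack_event(unsigned code)
{
    assert(code < 128);
    return (code & EV_LOW_MASK) | (((code >> 6) & 1u) << EV_HIGH_SHIFT);
}

/* Recover the 7-bit event code from the two fields. */
static unsigned unpack_event(uint32_t field)
{
    return (field & EV_LOW_MASK) | (((field >> EV_HIGH_SHIFT) & 1u) << 6);
}
```

With a split like this, a table that writes the 7-bit code as one contiguous value gets every event number of 64 or above wrong, which is the kind of mismatch a fix to an event table would have to correct.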
From: Mikael P. <mi...@cs...> - 2001-04-07 22:16:03

perfctr-2.0-pre2 is now available at the usual place: http://www.csd.uu.se/~mikpe/linux/perfctr/.

This is primarily intended as a consistent snapshot before I put the new API in place.

Version 2.0-pre2, 2001-04-07
- Removed automatic inheritance of per-process virtual perfctrs across fork(). Unless wait4() is modified, it's difficult to communicate the final values back to the parent: the now abandoned code did this in a way which made it impossible to distinguish one child's final counts from another's. Inheritance can be implemented in user-space anyway, so the loss is not great. The interface between the driver and the rest of the kernel is now smaller and simpler than before.
- Dropped support for kernels older than 2.2.16.
- Preliminary support for the VIA C3 processor.

/ Mikael Pettersson
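Editor's note: the message says inheritance can be implemented in user-space but does not say how. One possible shape, sketched below, is for each child to send its final count to the parent over its own pipe, which keeps one child's counts distinguishable from another's. This is an assumption about the approach, not the perfex code. `fake_final_count` is a hypothetical stand-in for sampling real counters, and `run_child_and_collect` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for "sample my counters just before exit". */
static uint64_t fake_final_count(unsigned child_no)
{
    return 1000u * (child_no + 1);
}

/* Fork one child, let it report its final count over a private pipe,
   and hand that count to the caller. Returns 0 on success, -1 on error. */
static int run_child_and_collect(unsigned child_no, uint64_t *count)
{
    int fds[2];

    assert(count != 0);
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: a real program would do its work here, then sample. */
        uint64_t c = fake_final_count(child_no);
        close(fds[0]);
        ssize_t n = write(fds[1], &c, sizeof c);
        _exit(n == (ssize_t)sizeof c ? 0 : 1);
    }

    /* Parent: one pipe per child, so the counts never get mixed up. */
    close(fds[1]);
    ssize_t n = read(fds[0], count, sizeof *count);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (n != (ssize_t)sizeof *count)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The parent can then add the per-child values to its own counts or report them separately.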
From: Mikael P. <mi...@cs...> - 2001-03-28 22:21:20

I've made a perfctr-1.6-update-1 release of my Linux/x86 performance-monitoring counters driver, for the benefit of PAPI users stuck with perfctr-1.6. It's available at the usual place: http://www.csd.uu.se/~mikpe/linux/perfctr/.

Version 1.6-update-1, 2001-03-28
- Maintenance update for PAPI.
- Updated patches for kernels 2.2.18, 2.2.19, 2.4.1, and 2.4.2.
- Backported a few minor bug fixes from perfctr-2.0-pre1. [Including a fix for users of the Portland C compiler.]

/ Mikael Pettersson
From: Mikael P. <mi...@cs...> - 2001-03-28 15:29:36

Ok, the perfctr-2.0-pre series has begun. Here are some proposed changes I'm planning to implement:

1. To support asymmetric CPUs better (think: future Pentium 4 support), enlarge the number of available counters, and introduce a mapping from virtual counter to actual PMC. Something like:

    #define PERFCTR_MAX_COUNTERS 18 /* for P4 */

    struct perfctr_sum_counters {
        unsigned long long tsc; /* no longer a pseudo-PMC */
        unsigned long long pmc[18];
    };

    struct perfctr_desc {
        unsigned pmc;         /* which physical PMC to map to */
        unsigned evntsel;     /* one per counter, even on Pentium Classic */
        unsigned evntsel_aux; /* ESCR val if P4 */
        unsigned ireset;      /* init/reset val if ctr is interrupt mode */
    };

    struct perfctr_control {
        /* Public fields: */
        unsigned tsc;     /* boolean on or off */
        unsigned nrctrs;  /* # of PMCs */
        unsigned nrictrs; /* # of PMCs in interrupt mode, nrictrs <= nrctrs */
        struct perfctr_desc ctrs[18]; /* first nrctrs-nrictrs are in accumulate
                                         mode, last nrictrs are in interrupt mode */
        /* HW driver private fields last: we use this to "compile"
           parts of the public setup above into easier-to-access
           CPU-specific data here. For instance, precompute a single
           CESR value for P5, or precompute PMC -> ESCR MSR mapping
           for Pentium 4. */
        ...
    };

    struct vperfctr_state { /* what you see in an mmap():ed vperfctr */
        int status; /* on (+1), off (0), or dead (-1) */
        struct perfctr_sum_ctrs sum;
        struct perfctr_low_ctrs start;
        unsigned long pc; /* where interrupt was detected */
        int si_signo;     /* signal to send on interrupt */
        int si_code;      /* si_code for signal */
        struct perfctr_control control; /* last element due to varying size */
    };

    struct vperfctr_control_arg { /* for vperfctr's CONTROL ioctl */
        int si_signo;
        int si_code;
        unsigned nrctrs, nrictrs;
        struct perfctr_desc *ctrs;
    };

+ You can skip TSC or any PMC(i) if you don't want it. Useful for P4.
- There are several fixed-size arrays in the data structures. OTOH, making them variable-sized causes other problems so I'm not sure I want to go that way.
+ Once this change is in place, I don't think Pentium 4 support will require any further API changes -- good for compatibility.
- This still doesn't do buffering of overflow interrupts. Due to the variable-sized arrays I don't know where to put that buffer. Perhaps a second mmap():ed page?

2. Drop the automagic inheritance of perfctrs across fork(). There is a new example/perfex/perfex.c in perfctr-1.9.x/2.0-pre which shows how inheritance can be done in user-space. (Will be cleaned up and put in the library.)

The kernel-based inheritance has the advantage of letting you run an unmodified binary and it will tell you the # of events that occurred in it and its child processes. The user-space mechanism tells you the number of events that occurred in the top-most process, but nothing about any child processes -- UNLESS you modify the application's code or play LD_PRELOAD tricks.

Dropping the kernel-based mechanism would eliminate a major ugly wart from the interface between the driver and the rest of the kernel.

3. Phase out support for really old kernels, first < 2.2.16, then < 2.2.18. To maximise the chance of getting this into 2.4 or 2.5, I want to minimise the kludges present to accommodate old 2.2 kernels.

Comments?

/Mikael
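Editor's note: the comments in the proposed struct perfctr_control state two constraints: nrictrs <= nrctrs, and the last nrictrs entries of ctrs[] are the interrupt-mode ones. The sketch below shows how code could check and use them. The struct definitions are copied from the proposal minus the elided driver-private fields; the helpers `control_is_valid` and `is_interrupt_mode` are hypothetical and not part of the proposed API.

```c
#include <assert.h>

#define PERFCTR_MAX_COUNTERS 18

struct perfctr_desc {
    unsigned pmc;         /* which physical PMC to map to */
    unsigned evntsel;     /* one per counter */
    unsigned evntsel_aux; /* ESCR val if P4 */
    unsigned ireset;      /* init/reset val if ctr is interrupt mode */
};

/* Public fields only; the proposal's driver-private tail is omitted. */
struct perfctr_control {
    unsigned tsc;     /* boolean on or off */
    unsigned nrctrs;  /* # of PMCs */
    unsigned nrictrs; /* # of PMCs in interrupt mode */
    struct perfctr_desc ctrs[PERFCTR_MAX_COUNTERS];
};

/* Hypothetical helper: check the constraints stated in the proposal. */
static int control_is_valid(const struct perfctr_control *c)
{
    return c->nrctrs <= PERFCTR_MAX_COUNTERS && c->nrictrs <= c->nrctrs;
}

/* Hypothetical helper: the first nrctrs-nrictrs counters accumulate,
   the last nrictrs counters are in interrupt mode. */
static int is_interrupt_mode(const struct perfctr_control *c, unsigned i)
{
    assert(control_is_valid(c) && i < c->nrctrs);
    return i >= c->nrctrs - c->nrictrs;
}
```

With nrctrs = 3 and nrictrs = 1, counters 0 and 1 are accumulate-mode and counter 2 is interrupt-mode.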