From: John L. <mo...@co...> - 2001-06-03 01:33:47

On Sun, 3 Jun 2001, Philippe Elie wrote:

> > + { 5, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 }, },
>
> would be
>
> > + { 4, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0x0, 0x0 }, },
>
> but I prefer to keep the original code, the two version work but
> your code is more near the intel documentation.

Note that I've done very minimal testing of these bitmasks etc. so your
testing is much appreciated.

> I have rather tested that utm_bitmask are really bitmask (for segment
> related events and MMX insn related events) and by a careful reads of
> the intel documentation. Other bitmask allowed (MESI state) are more
> clearly documented in IA manual as real bit mask which can be or'ed.

OK

> I cannot make test for PIII KNI events, my poor PII have some problems
> with KNI insns :)

if you have time to write a simple test file I can check it on my
Celeron Coppermine. Unfortunately I don't have time myself to verify
this by hand.

> I send in a separate mail with a few comment.

great, thanks

> I have also patch for interpretation of results, in a few days probably,
> (mixing asm and/or source with the samples). This need adding some
> new output option to oprofpp

cool

> I have think about 16384 + but result are more strange.
> For example use of counter0, counter1 for count mmx ops
> with the same um show results like (real example from oprofpp) :
>
> counter0 256 counter1 49372.
>
> not really near 16384 + 256 ...

This is odd. It might be something to do with NMIs arriving during an
NMI routine ... don't know, this part of the h/w is black magic

In my tests with CPU_CLK_UNHALTED I was getting practically identical
results from both counters.

> but probably your cvs update will fix the problem.

it should fix the most obvious problem ;)

> I will retry and send more precise information, or at least the c
> source file which allow to reproduce the problem, if it persist.

thanks

> > I'm about to commit the change to CVS, thanks for the bug report.
>
> uhum, I must learn how to use cvs under windows (no linux modem's driver)

sourceforge also provide daily tarballs of the CVS tree I believe, have
a look.

> Philippe Elie, with a little apprehension for the patch : with non-fixed
> font under windows it's look like a very dirty patch...

The indentation looks a little screwed, I can easily fix that when
applying. The patch looks OK to me, I'll apply and test soon.

Thanks ! looking forward to further changes,
john

--
"Please crack down on the Chinaman's friends and Hitler's commander.
Mother is the best bet and don't let Satan draw you too fast. A boy has
never wept ... nor dashed a thousand kim. Did you hear me?"
	- Dutch Schultz

From: Philippe E. <ph...@cl...> - 2001-06-03 01:11:22

> I prefer diff -u (and also inline in the mail rather than attachment).

ok

> Why this :
>
> + { 7, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0x10, 0x20 }, },
>
> Why have you changed allowed from 6 to 7 in the second entry here ? That
> looks wrong to me. I'll look at the rest of the patch after you respond.

A stupid "last minute change" error. Note that :

> + { 5, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 }, },

would be

> + { 4, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0x0, 0x0 }, },

but I prefer to keep the original code, the two version work but
your code is more near the intel documentation.

> I would also prefer to leave out utm_error altogether - there is too small
> an amount of code here for such a coding error to happen I think.

I agree, utm_mandatory could be suppressed also and changed to
utm_exclusive, but I think that mandatory um would be silently added by
the profiler.

> If you're doing this checking, how about adding a check that a utm_exclusive
> um really is exclusive ? That would be nice ...

/* KNI PIII events */ was utm_bitmask, a typo error.

+ { 4, utm_exclusive, { 0x0, 0x1, 0x2, 0x3, 0x0, 0x0 }, },
+ { 2, utm_bitmask, { 0x0, 0x1, 0x0, 0x0, 0x0, 0x0 }, },

utm_exclusive are really exclusive, because value allowed in exclusive
mode entry cannot be or'ed without loss of information.

I have rather tested that utm_bitmask are really bitmask (for segment
related events and MMX insn related events) and by a careful reads of
the intel documentation. Other bitmask allowed (MESI state) are more
clearly documented in IA manual as real bit mask which can be or'ed.

I cannot make test for PIII KNI events, my poor PII have some problems
with KNI insns :)

I will post two C test-file in near future for segment related event and
mmx events but I have no automatic way (script makefile) to make the
real test. I use the tcl/tk for preparing the test and some other thing
to complete the test.

> This patch looks good otherwise though, thanks for the fix.
>
> > I have checked also that the documentation from intel that say
> > FS, FS (twice time stated) is a typo error it is really FS GS in fact.
>
> Do you mean you have actually tested this on a real box ? If so, can you
> put the change in your next patch ? ;)

It's tested (I used "checked" for "tested" in the previous mail)

> > I have a tcl/tk interface to start the profiler, and some patch
> > in preparation to allow interpretation of profiler result (mixing asm
> > or source file with profiling data). This needs some new output
> > format from oprofpp. The tcl/tk interface is ready to work I think.
>
> This sounds really cool, I'd love to see this !

I send in a separate mail with a few comment.

I have also patch for interpretation of results, in a few days probably,
(mixing asm and/or source with the samples). This need adding some
new output option to oprofpp

> > I do not really understand why but I get really strange counter 1
> > result from oprofpp it seems to be always :
> > 16384 * (real counter 1 result).
>
> I think you mean 16384 + (real counter 1 result) ? This was a really
> stupid bug - I had a function opd_get_count() and never actually
> used it !

I have think about 16384 + but result are more strange.
For example use of counter0, counter1 for count mmx ops
with the same um show results like (real example from oprofpp) :

counter0 256 counter1 49372.

not really near 16384 + 256 ...

but probably your cvs update will fix the problem.

I will retry and send more precise information, or at least the c
source file which allow to reproduce the problem, if it persist.

> I'm about to commit the change to CVS, thanks for the bug report.

uhum, I must learn how to use cvs under windows (no linux modem's driver)

> > I do not know if it is a good thing to speak about more than
> > one subjects in this mail list ? Must I repost more clearly or
> > just separate subject in future?
>
> no, this is fine ;) just put your patches inline if you can

Yes, but I post from windows and I hope that there is no problem of
carriage return/linefeed/tabulation translation, I follow the gcc
convention, all stuff after my signature is the patch. In case of
problem I will send a link to the diff file.

cat patch_name | patch -p1
make install

have work at home.

regards,
--
Philippe Elie, with a little apprehension for the patch : with non-fixed
font under windows it's look like a very dirty patch...

diff -u oprofile-0.0.3/op_events.c oprofile-0.0.3.phe/op_events.c
--- oprofile-0.0.3/op_events.c	Fri Apr 6 16:16:46 2001
+++ oprofile-0.0.3.phe/op_events.c	Sun Jun 3 00:36:09 2001
@@ -38,6 +38,15 @@
 #define u8 unsigned char
 #define uint unsigned int

+enum unit_mask_type {
+	/* useless but required by the hardaware */
+	utm_mandatory,
+	/* one of the value are correct */
+	utm_exclusive,
+	/* value can be combined by bitwise or to form new value */
+	utm_bitmask
+};
+
 struct op_event {
 	uint allowed;
 	u8 val; /* event number */
@@ -47,6 +56,7 @@

 struct op_unit_mask {
 	uint num; /* number of possible unit masks */
+	enum unit_mask_type unit_type_mask;
 	/* up to six allowed unit masks */
 	u8 um[6];
 };
@@ -59,19 +69,19 @@

 static struct op_unit_mask op_unit_masks[] = {
 	/* not used */
-	{ 1, { 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 }, },
+	{ 1, 0, { 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 }, },
 	/* MESI counters */
-	{ 5, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 }, },
+	{ 5, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 }, },
 	/* EBL self/any */
-	{ 2, { 0x0, 0x20, 0x0, 0x0, 0x0, 0x0 }, },
+	{ 2, utm_exclusive, { 0x0, 0x20, 0x0, 0x0, 0x0, 0x0 }, },
 	/* MMX PII events */
-	{ 1, { 0xf, 0x0, 0x0, 0x0, 0x0, 0x0 }, },
-	{ 6, { 0x1, 0x2, 0x4, 0x8, 0x10, 0x20 }, },
-	{ 2, { 0x0, 0x1, 0x0, 0x0, 0x0, 0x0 }, },
-	{ 5, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 }, },
+	{ 1, utm_mandatory, { 0xf, 0x0, 0x0, 0x0, 0x0, 0x0 }, },
+	{ 6, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0x10, 0x20 }, },
+	{ 2, utm_exclusive, { 0x0, 0x1, 0x0, 0x0, 0x0, 0x0 }, },
+	{ 5, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 }, },
 	/* KNI PIII events */
-	{ 4, { 0x0, 0x1, 0x2, 0x3, 0x0, 0x0 }, },
-	{ 2, { 0x0, 0x1, 0x0, 0x0, 0x0, 0x0 }, },
+	{ 4, utm_exclusive, { 0x0, 0x1, 0x2, 0x3, 0x0, 0x0 }, },
+	{ 2, utm_bitmask, { 0x0, 0x1, 0x0, 0x0, 0x0, 0x0 }, },
 };

 static struct op_event op_events[] = {
@@ -198,11 +208,27 @@
  */
 static int op_check_unit_mask(struct op_unit_mask *allow, u8 um)
 {
-	u8 i;
+	uint i, mask;
+
+	switch (allow->unit_type_mask) {
+	case utm_exclusive:
+	case utm_mandatory:
+		for (i=0; i < allow->num; i++)
+			if (allow->um[i]==um)
+				return 0;
+		break;
+
+	case utm_bitmask:
+		/* Must reject 0 bit mask because it can count nothing */
+		if (um != 0) {
+			mask = 0;
+			for (i=0; i < allow->num; i++)
+				mask |= allow->um[i];

-	for (i=0; i < allow->num; i++) {
-		if (allow->um[i]==um)
-			return 0;
+			if ((mask & um) == um)
+				return 0;
+		}
+		break;
 	}

 	return 1;
@@ -407,7 +433,8 @@
 	  "DS register",
 	  "FS register",
 	  /* IA manual says this is actually FS again - no mention in errata */
-	  "FS register",
+	  /* but test show that is really a typo error from IA manual */
+	  "GS register",
 	  "ES,DS,FS,GS registers",
 	  NULL, }, },
 	{ { "prefetch NTA", "prefetch T1",

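The rule the patch introduces can be tried outside the kernel module. The following is a minimal user-space sketch of the same check, written for illustration only: the names `check_unit_mask`, `mesi_mask` and `kni_mask` are made up here, and the two tables simply copy the "MESI counters" and first "KNI PIII events" entries from the patch.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char u8;

enum unit_mask_type {
	utm_mandatory,	/* single fixed value required by the hardware */
	utm_exclusive,	/* exactly one of the listed values */
	utm_bitmask	/* any non-zero OR of the listed values */
};

struct op_unit_mask {
	unsigned int num;	/* number of valid entries in um[] */
	enum unit_mask_type unit_type_mask;
	u8 um[6];
};

/* Returns 0 if um is acceptable, 1 otherwise (same convention as the patch). */
static int check_unit_mask(const struct op_unit_mask *allow, u8 um)
{
	unsigned int i, mask = 0;

	assert(allow != NULL && allow->num <= 6);

	switch (allow->unit_type_mask) {
	case utm_mandatory:
	case utm_exclusive:
		/* The value must match one listed entry exactly. */
		for (i = 0; i < allow->num; i++)
			if (allow->um[i] == um)
				return 0;
		break;
	case utm_bitmask:
		/* A zero mask counts nothing, so it is rejected. */
		if (um == 0)
			break;
		for (i = 0; i < allow->num; i++)
			mask |= allow->um[i];
		/* Every bit set in um must be an allowed bit. */
		if ((mask & um) == um)
			return 0;
		break;
	}
	return 1;
}

/* Hypothetical example tables, copied from two entries of the patch. */
static const struct op_unit_mask mesi_mask =
	{ 5, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 } };
static const struct op_unit_mask kni_mask =
	{ 4, utm_exclusive, { 0x0, 0x1, 0x2, 0x3, 0x0, 0x0 } };
```

With these tables, a combined MESI mask such as 0x3 is accepted because both bits are allowed, while an exclusive table only accepts the values it lists.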
From: John L. <mo...@co...> - 2001-06-01 23:01:06

On Thu, 31 May 2001, Philippe Elie wrote:

> Hi,

Hi Philippe !

> I suggest the attached patch, it corrects the fact that
> in some case the unit_mask is really a bit mask and
> all combination of unit mask are acceptable and in other
> case unit mask are exclusive value.
>
> It's a diff -u format. Have you other preference on patch
> format ?

I prefer diff -u (and also inline in the mail rather than attachment).

Why this :

- { 1, { 0xf, 0x0, 0x0, 0x0, 0x0, 0x0 }, },
- { 6, { 0x1, 0x2, 0x4, 0x8, 0x10, 0x20 }, },
- { 2, { 0x0, 0x1, 0x0, 0x0, 0x0, 0x0 }, },
- { 5, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 }, },
+ { 1, utm_mandatory, { 0xf, 0x0, 0x0, 0x0, 0x0, 0x0 }, },
+ { 7, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0x10, 0x20 }, },
+ { 2, utm_exclusive, { 0x0, 0x1, 0x0, 0x0, 0x0, 0x0 }, },
+ { 5, utm_bitmask, { 0x1, 0x2, 0x4, 0x8, 0xf, 0x0 }, },

Why have you changed allowed from 6 to 7 in the second entry here ? That
looks wrong to me. I'll look at the rest of the patch after you respond.

I would also prefer to leave out utm_error altogether - there is too small
an amount of code here for such a coding error to happen I think.

If you're doing this checking, how about adding a check that a utm_exclusive
um really is exclusive ? That would be nice ...

This patch looks good otherwise though, thanks for the fix.

> I have checked also that the documentation from intel that say
> FS, FS (twice time stated) is a typo error it is really FS GS in fact.

Do you mean you have actually tested this on a real box ? If so, can you
put the change in your next patch ? ;)

> I have a tcl/tk interface to start the profiler, and some patch
> in preparation to allow interpretation of profiler result (mixing asm
> or source file with profiling data). This needs some new output
> format from oprofpp. The tcl/tk interface is ready to work I think.

This sounds really cool, I'd love to see this !

> I do not really understand why but I get really strange counter 1
> result from oprofpp it seems to be always :
> 16384 * (real counter 1 result).

I think you mean 16384 + (real counter 1 result) ? This was a really
stupid bug - I had a function opd_get_count() and never actually
used it !

I'm about to commit the change to CVS, thanks for the bug report.

> I do not know if it is a good thing to speak about more than
> one subjects in this mail list ? Must I repost more clearly or
> just separate subject in future?

no, this is fine ;) just put your patches inline if you can

regards,
john

--
"Faced with the prospect of rereading this book, I would rather have my
brains ripped out by a plastic fork."
	- Charles Cooper on "Business at the Speed of Thought"

From: Philippe E. <ph...@cl...> - 2001-05-31 20:35:58

English is not my natural language...

Hi, first thanks for your nice work.

-----

I suggest the attached patch, it corrects the fact that
in some case the unit_mask is really a bit mask and
all combination of unit mask are acceptable and in other
case unit mask are exclusive value.

It's a diff -u format. Have you other preference on patch
format ?

-----

I have checked also that the documentation from intel that say
FS, FS (twice time stated) is a typo error it is really FS GS in fact.

-----

I have a tcl/tk interface to start the profiler, and some patch
in preparation to allow interpretation of profiler result (mixing asm
or source file with profiling data). This needs some new output
format from oprofpp. The tcl/tk interface is ready to work I think.

-----

I do not really understand why but I get really strange counter 1
result from oprofpp it seems to be always :
16384 * (real counter 1 result).

A features or a problem in packing / unpacking samples ?

-----

I do not know if it is a good thing to speak about more than
one subjects in this mail list ? Must I repost more clearly or
just separate subject in future?

--
Philippe Elie

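The reply to this report (dated 2001-06-01) reads the symptom as "16384 + (real counter 1 result)" and blames a missing call to opd_get_count(). One way such a constant offset can arise is sketched below. It assumes, purely for illustration, that the counter number is packed into bit 14 of the stored value, above a 14-bit count. This layout and the names `pack_sample`, `get_count` and `get_counter` are hypothetical and are not taken from the oprofile sources.

```c
#include <assert.h>

/* Hypothetical layout: low 14 bits hold the count, bit 14 the counter. */
#define COUNT_BITS 14u
#define COUNT_MASK ((1u << COUNT_BITS) - 1u)

/* Pack a counter number (0 or 1) and a sample count into one value. */
static unsigned int pack_sample(unsigned int counter, unsigned int count)
{
	assert(counter <= 1u && count <= COUNT_MASK);
	return (counter << COUNT_BITS) | count;
}

/* Strip the counter bit; the step that was reportedly left out. */
static unsigned int get_count(unsigned int packed)
{
	return packed & COUNT_MASK;
}

/* Recover which counter the sample belongs to. */
static unsigned int get_counter(unsigned int packed)
{
	return packed >> COUNT_BITS;
}
```

Under this assumption, printing the packed value directly leaves counter 0 results untouched and inflates every counter 1 result by exactly 16384, while masking first gives the real count for both.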
From: John L. <mo...@co...> - 2001-05-22 17:15:26

On Mon, 21 May 2001, Ray Bryant wrote:

> At some point in the past, I recall looking at the oprofile web page and
> finding there a discussion about using the Pentium Performance monitor
> to generate NMI interrupts. However, in perusing the oprofile code, I
> can find no obvious place where this is enabled (everything seems to be
> dependent on having an IO APIC). And the current web page or
> documentation makes no mention of this trick.
>
> Did I make this up or does oprofile support profiling via NMI on UP systems?

that's ALL it supports ;)

It's dependent on the /local/ APIC, not an IO-APIC - two very different
beasts. A local APIC is present in every P6-class CPU. The code sets up
the correct registers for delivering the performance counter overflow
interrupt in NMI mode. So profiling interrupt handlers is supported out
of the box.

The most relevant code is smp_apic_setup(). Perhaps that routine is
badly named ;)

john

--
"This is just the kind of crackpot scheme I've been looking to champion!!!"
	- P.M. Hartke on 6U campaign

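To make "delivering the performance counter overflow interrupt in NMI mode" a little more concrete: on P6-class CPUs the local APIC has a local vector table entry for the performance counters (at offset 0x340 in the APIC register page), and its delivery-mode field selects NMI. The sketch below only computes the register value; it does not touch hardware and is not the code from smp_apic_setup(). The macro and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Local APIC LVT performance-counter entry (field layout per Intel manuals). */
#define LVTPC_OFFSET		0x340u	/* offset inside the APIC register page */
#define LVT_DELIVERY_SHIFT	8u	/* delivery mode lives in bits 10:8 */
#define LVT_DELIVERY_NMI	0x4u	/* 100b = deliver as NMI */
#define LVT_MASKED		(1u << 16)	/* set = interrupt inhibited */

/* Build the LVTPC value for NMI delivery; masked must be 0 or 1. */
static uint32_t lvtpc_nmi_entry(int masked)
{
	uint32_t entry = LVT_DELIVERY_NMI << LVT_DELIVERY_SHIFT;

	assert(masked == 0 || masked == 1);
	if (masked)
		entry |= LVT_MASKED;
	return entry;
}
```

An unmasked entry works out to 0x400. Because the interrupt arrives as an NMI, it is not blocked while ordinary interrupts are disabled, which is why code inside interrupt handlers can be sampled.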
From: Ray B. <ra...@ti...> - 2001-05-21 23:57:31

At some point in the past, I recall looking at the oprofile web page and
finding there a discussion about using the Pentium Performance monitor
to generate NMI interrupts. However, in perusing the oprofile code, I
can find no obvious place where this is enabled (everything seems to be
dependent on having an IO APIC). And the current web page or
documentation makes no mention of this trick.

Did I make this up or does oprofile support profiling via NMI on UP systems?

Best Regards,

Ray Bryant
-----------------------------
Linux Performance Analyst
Times N Systems
1908 Kramer Ln., Bld.B, Ste.P
Austin, TX 78758
512-977-5366
-----------------------------

From: John L. <mo...@co...> - 2001-04-20 15:09:38

On Fri, 20 Apr 2001, Tobias Hunger wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi there!
>
> I have a problem running oprofile. It loads fine on my desktop (which is a
> SMP mashine, so it won't work), but not on my laptop. Both have PII CPUs.
> Here is what may laptop complains about doing a modprobe oprofile:
>
> /lib/modules/2.4.3/oprofile/oprofile.o: init_module: No such device
> Hint: insmod errors can be caused by incorrect module parameters, including
> invalid IO or IRQ parameters
> /lib/modules/2.4.3/oprofile/oprofile.o: insmod
> /lib/modules/2.4.3/oprofile/oprofile.o failed
> /lib/modules/2.4.3/oprofile/oprofile.o: insmod oprofile failed

Very possibly this is due to the laptop CPUs not having sufficient APIC
support. I think I mention this in the docs somewhere. Basically,
oprofile can't work because the h/w support is not in the Mobile P6
series. Sorry !

> Any ideas what's wrong? Please CC me in any reply as I can't subscribe to
> this list on SF :-(

youch ! I'll have to check this myself and file a support request if it
doesn't work ...

thanks
john

--
"Do you mean to tell me that "The Prince" is not the set textbook for
CS1072 Professional Issues ? What on earth do you learn in that course ?"
	- David Lester

From: Tobias H. <to...@be...> - 2001-04-19 23:05:51

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi there!

I have a problem running oprofile. It loads fine on my desktop (which is a
SMP mashine, so it won't work), but not on my laptop. Both have PII CPUs.
Here is what may laptop complains about doing a modprobe oprofile:

/lib/modules/2.4.3/oprofile/oprofile.o: init_module: No such device
Hint: insmod errors can be caused by incorrect module parameters, including
invalid IO or IRQ parameters
/lib/modules/2.4.3/oprofile/oprofile.o: insmod
/lib/modules/2.4.3/oprofile/oprofile.o failed
/lib/modules/2.4.3/oprofile/oprofile.o: insmod oprofile failed

Any ideas what's wrong? Please CC me in any reply as I can't subscribe to
this list on SF :-(

- --
Gruss,
Tobias

- -------------------------------------------------------------------
Tobias Hunger          The box said: 'Windows 95 or better'
to...@be...            So I installed Linux.
- -------------------------------------------------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE6329CVND+cGpk748RApTVAJ9OwA7E8tIB8Z1bQ5K5itk2mfyzmQCfcEhU
tGeA8EGmLBpCgIYMhQbh1u4=
=4sks
-----END PGP SIGNATURE-----

From: John L. <mo...@co...> - 2001-04-06 15:12:32

OProfile 0.0.3 is now available for download at sourceforge.

This release of oprofile should be stable, and fixes many bugs from the
previously released version. It is still in need of wider testing in
different environments however. Additionally, SMP support is still not
present.

Changes:

Many enhancements and bug-fixes have been put in place since oprofile
0.0.2. A partial list is here :

Bug fixes
---------

* works on Linus mainline 2.4.0 - 2.4.3, and the ac patch series.
* kernel thread zombie problem fixed
* fixes when binaries are not found
* turn off kernel/user counting properly
* silly bug fixed to allow setting both counters at once
* gprof output format added
* daemon is much more resilient now

Enhancements
------------

* ioctl replaced with sysctl
* oprofpp can be used as "oprofpp -l /path/to/binary"
* --verbose option for debugging
* much improved map collection - considerably faster
* safer defaults
* totally re-worked documentation
* some example results added
* sample files from previous runs are retained unless out of date
* option for compiling with kgcc
* move to new module location scheme
* friendlier op_start script
* compiled for i686 now instead of generic i386
* dump performs a /full/ dump
* thread wakes up less often
* binaries are md5sum checksummed to prevent broken profiles from being
  created. Also handle this silently when re-starting the profiler.
* manpage added

thanks
john

From: John L. <mo...@co...> - 2001-03-19 11:54:06

On Fri, 16 Mar 2001, Osiris Pedroso wrote:

> Now that I have the ability to gather this profile samples, I need to
> improve the reporting of it.
>
> What I was looking for was something like a hierarchical report of usage,
> one function name per line, with time taken in samples (totaled per
> function) and percentage (of total samples). It should also have the ability
> to report the time spent on functions that were called from a function (lets
> call them children functions). Something in a format usable in a spreadsheet
> program (tab separated maybe).
>
> All percents adding to a total 100%, all samples adding to total of recorded
> samples.
>
> Something like this:
>
> Call Count   Time (clks)   +Kids (clks)   %   +Kids%   Symbol
> ---------------------------------------------------------------------------------------
>
> So, before I start, I would like to know if there is anybody already working
> on such a project.
>
> I would like to have some pointers if the data (like total number of
> samples) already exists in some form or if it would have to be calculated
> from oprofpp output. It would probably make more sense to give this tool the
> ability to read the raw sample files. It would certainly be more efficient.
>
> Appreciate any ideas,
>
> Osiris

Before I release 0.0.3 I intend to get gprof-format output working from
oprofpp. I suppose this would give you some of what you ask for above.

I'm not too clear on what you're asking for above. Would it be OK to
have this output as a static call graph ? It is unfortunately impossible
to generate dynamic call graph data, because the profiler doesn't (and
cannot) generate information on function calls (remember oprofile
doesn't instrument the binaries so it can't tell when a call graph arc
is being travelled).

Instrumentation of call sites in the binary incurs some definite
performance drop. It might be an interesting experiment to use gcc's
instrumenting facilities (I think it's called "-finstrument-arcs" or
similar) to only track call sites (gprof will track call sites and also
do the signal-based profiling). It may turn out this is sufficiently
fast ...

There is no reason (except performance) not to use both gprof and
oprofile at once, by the way ... this can be handy if you need call
graph info for your binary but are OK with the shared library data
produced by oprofile

thanks
john

--
"For 93 million miles, there is nothing between the sun and my shadow
except me. I'm always getting in the way of something..."
	- Matthew Vanecek

From: Osiris P. <OPe...@wm...> - 2001-03-16 23:46:31

Now that I have the ability to gather this profile samples, I need to
improve the reporting of it.

What I was looking for was something like a hierarchical report of usage,
one function name per line, with time taken in samples (totaled per
function) and percentage (of total samples). It should also have the ability
to report the time spent on functions that were called from a function (lets
call them children functions). Something in a format usable in a spreadsheet
program (tab separated maybe).

All percents adding to a total 100%, all samples adding to total of recorded
samples.

Something like this:

Call Count   Time (clks)   +Kids (clks)   %   +Kids%   Symbol
---------------------------------------------------------------------------------------

So, before I start, I would like to know if there is anybody already working
on such a project.

I would like to have some pointers if the data (like total number of
samples) already exists in some form or if it would have to be calculated
from oprofpp output. It would probably make more sense to give this tool the
ability to read the raw sample files. It would certainly be more efficient.

Appreciate any ideas,

Osiris

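The flat part of the report described above (samples per function, percentage of the total, tab-separated for a spreadsheet) can be sketched in a few lines of C. This is only an illustration under stated assumptions: it takes per-symbol sample counts that are already in memory, and it does not read oprofile's sample files or parse oprofpp output. The struct and function names (`sym_count`, `total_samples`, `percent_of`, `print_report`) are invented for the example, and the "+Kids" columns are left out because they need call-graph data.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the report: a symbol and the samples attributed to it. */
struct sym_count {
	const char *name;
	unsigned long samples;
};

/* Sum the samples of all rows. */
static unsigned long total_samples(const struct sym_count *syms, size_t n)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += syms[i].samples;
	return total;
}

/* Percentage of total; defined as 0 when no samples were recorded. */
static double percent_of(unsigned long part, unsigned long total)
{
	return total ? 100.0 * (double)part / (double)total : 0.0;
}

/* Write a tab-separated table with a closing TOTAL row. */
static void print_report(FILE *out, const struct sym_count *syms, size_t n)
{
	unsigned long total = total_samples(syms, n);
	size_t i;

	assert(out != NULL);
	fprintf(out, "Symbol\tSamples\tPercent\n");
	for (i = 0; i < n; i++)
		fprintf(out, "%s\t%lu\t%.2f\n", syms[i].name,
			syms[i].samples, percent_of(syms[i].samples, total));
	fprintf(out, "TOTAL\t%lu\t%.2f\n", total, percent_of(total, total));
}
```

Since each percentage is computed against the same total, the rows add up to 100% apart from rounding in the printed two decimals.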
From: Osiris P. <OPe...@wm...> - 2001-03-16 17:00:43

Appendix A - Example of how to run OProfile

My machine is a Dell PIII 700MHz, 128MB memory running Linux 2.4.2 with
OProfile-0.0.3.

After downloading OProfile I made sure I had the libraries libpopt.a,
libbfd.a, libiberty.a and libdl.a, which were located at:

    /usr/lib/libpopt.a
    /usr/lib/libbfd.a
    /usr/lib/libiberty.a
    /usr/lib/libdl.a

I then configured OProfile with the following command:

    cd /usr/src/linux/oprofile-0.0.3
    ./configure --prefix=/usr --with-linux=/usr/src/linuxX86-2.4.2 --with-extra-libs=/usr/lib

followed by commands to build the profiler:

    make
    make install

I then proceeded to start the daemon with the following command:

    /usr/src/linux/oprofile-0.0.3/dae/op_start --kernel-only=1 --use-cpu=2 \
        --ctr0-event=CPU_CLK_UNHALTED --ctr0-count=70000 --ctr0-user=0 \
        --ctr1-event=HW_INT_RX --ctr1-count=70000 --ctr1-kernel=1 --ctr1-user=0 \
        --map-file=/usr/src/linuxX86-2.4.2/System.map \
        --vmlinux=/usr/src/linuxX86-2.4.2/vmlinux --ignore-myself --verbose

I ran a program to calculate prime numbers

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t i, j, l;

        printf("Prime numbers:\n");
        /* start at 2: 0 and 1 are not prime */
        for (i = 2; i < UINT32_MAX; i++) {
            // printf("Checking if %u is prime\n", (unsigned)i);
            l = i >> 1;    /* a divisor, if any, is at most i / 2 */
            for (j = 2; j <= l; j++) {
                // printf("Dividing by %u\n", (unsigned)j);
                if ((i % j) == 0) {
                    // printf("%u is not a prime, next!\n", (unsigned)i);
                    goto not_prime;
                }
            }
            printf("%u\n", (unsigned)i);
    not_prime:
            ;    /* a label must be followed by a statement */
        }
        return 0;
    }

I then stopped the daemon and generated results for this run

    # force dump of profiler results
    echo 1 >/proc/sys/dev/oprofile/dump
    sleep 1

    # stop the daemon
    /usr/src/linux/oprofile-0.0.3/dae/op_stop

    # generate report of functions used
    /usr/src/linux/oprofile-0.0.3/pp/oprofpp -l -c 1 \
        -f /var/opd/samples/}usr}src}linuxX86-2.4.2}vmlinux -V > /tmp/vmlinux.txt

As a parting comment, I strongly suggest saving your commands to
scripts, since you might get tired and confused about which options go
where.

Also, when profiling things, I always prefer that my steps be
reproducible. My slow typing could be enough to skew the data if the
scenario being profiled is short enough.

From: Osiris P. <OPe...@wm...> - 2001-03-16 16:09:23

I added the command

    echo 1 >/proc/sys/dev/oprofile/dump

before stopping the daemon and I now got data!

Thanks,

Osiris

From: Osiris P. <OPe...@wm...> - 2001-03-16 16:02:22

On Thu, 15 Mar 2001, Osiris Pedroso wrote:

> Hi,
>
> I got the CVS version, built it (had a few problems with config along the
> way, but worked around them). I am currently running it under 2.4.2 with
> the following command on a Dell 700Mhz PIII:
>
> op_start --use-cpu=2 --kernel-only=1 --ctr0-event=INST_RETIRED
> --ctr0-count=70000 --ctr0-user=0 --ctr1-event=HW_INT_RX --ctr1-count=70000
> --ctr1-kernel=1 --ctr1-user=0 --map-file=/usr/src/linuxX86-2.4.2/System.map
> --vmlinux=/usr/src/linuxX86-2.4.2/vmlinux --ignore-myself --verbose
>
> The Linux 2.4.2 version was built with "-g -ggdb" replacing
> "-fomit-frame-pointer" (the way I normally build to get symbols and gdb
> support). This image is over 40Mb in size.

Sorry, which image ? You do not need to build the kernel with debugging
symbols in order to profile it (a special case due to the availability
of /proc/ksyms)

> Notice though, that although the files show some large number of bytes, the
> number of blocks is pretty much minimum.

this is normal. The files will fill up as more samples are collected
across the binary image.

> I then proceed to run
>
> /usr/src/linux/oprofile-0.0.3/pp/oprofpp -l -c 1 -f
> /var/opd/samples/}usr}src}linuxX86-2.4.2}vmlinux -V > /tmp/vmlinux.txt
>
> This will print all symbols from vmlinux, but all symbols show "(0 samples)"
> by them.

This is because oprofile is being efficient :)

Perhaps the documentation is somewhat lax, but you need to tell oprofile
to dump sample data. It will do this periodically anyway, but you can
force a dump with :

    echo 1 >/proc/sys/dev/oprofile/dump

Now you should find a *lot* of debugging data in /var/opd/oprofiled.log
(since you turned on --verbose)

I assume you are using the CVS version now.

john

p.s. you might like to ask on the oprofile-list instead (subscribe at
sourceforge) so this stuff is recorded for posterity (And for others
trying to run oprofile)

--
"The path you specified contains too many directories. Delete one or
more directories or clear the Include Subdirectories checkbox."
	- Microsoft Office

From: Osiris P. <OPe...@wm...> - 2001-03-16 15:58:42

Hi,

I got the CVS version, built it (had a few problems with config along the
way, but worked around them). I am currently running it under 2.4.2 with
the following command on a Dell 700Mhz PIII:

    op_start --use-cpu=2 --kernel-only=1 --ctr0-event=INST_RETIRED
    --ctr0-count=70000 --ctr0-user=0 --ctr1-event=HW_INT_RX --ctr1-count=70000
    --ctr1-kernel=1 --ctr1-user=0 --map-file=/usr/src/linuxX86-2.4.2/System.map
    --vmlinux=/usr/src/linuxX86-2.4.2/vmlinux --ignore-myself --verbose

The Linux 2.4.2 version was built with "-g -ggdb" replacing
"-fomit-frame-pointer" (the way I normally build to get symbols and gdb
support). This image is over 40Mb in size.

Now, this Linux was actually booted from a floppy image, built with
"make bzdisk" which I believe is a stripped version of the 40Mb above.

Since I booted from floppy, I could not specify "no-hlt" as an option
(no lilo in the floppy).

During the few minutes I ran the profiler, I did some web surfing, just
to gather some data.

Sample files are created, as you see below:

    [root@oplinux samples]# ls -lsa
    total 226
       4 drwx------   2 root root    4096 Mar 15 07:59 .
       1 drwxr-xr-x   3 root root    1024 Mar 15 07:59 ..
       3 -rw-r--r--   1 root root 2534794 Mar 15 07:59 }bin}bash
       2 -rw-r--r--   1 root root  163626 Mar 15 07:59 }bin}login
       3 -rw-r--r--   1 root root 2725314 Mar 15 07:59 }lib}ld-2.1.3.so

Notice though, that although the files show some large number of bytes,
the number of blocks is pretty much minimum.

I then proceed to run

    /usr/src/linux/oprofile-0.0.3/pp/oprofpp -l -c 1 -f
    /var/opd/samples/}usr}src}linuxX86-2.4.2}vmlinux -V > /tmp/vmlinux.txt

This will print all symbols from vmlinux, but all symbols show
"(0 samples)" by them.

I also tried some different options like this

    [root@oplinux mini]# /usr/src/linux/oprofile-0.0.3/pp/oprofpp -s do_anonymous_page -f /var/opd/samples/}usr}src}linuxX86-2.4.2}vmlinux
    Counter 0 counted INST_RETIRED events (number of instructions retired) with a unit mask of 0x00 (Not set)
    Counter 1 counted HW_INT_RX events (number of hardware interrupts received) with a unit mask of 0x00 (Not set)
    Samples for symbol "do_anonymous_page" in image /usr/src/linuxX86-2.4.2/vmlinux

Grepping for do_anonymous_page in the "-l" results yields

    [root@oplinux mini]# grep anonymous_page /tmp/vmlinux.txt
    Symbol do_anonymous_page, value 0x1fe90
    do_anonymous_page (0 samples)

So, I seem to be getting close to it, but no cigar yet.

From: John L. <mo...@co...> - 2001-03-14 13:44:30

Hi,

I have just removed the 0.0.2 oprofile tarball from the downloads page.
I've done this basically because this old version is very very buggy
(less stable than current CVS). For now the CVS tarball is the best
version available.

I've been busy recently, but when I get a moment to fix gprof-format
output, then I will release 0.0.3

Note that SMP will remain broken for some time yet

thanks
john

--
"There once was a hacker from Haifa
Who wrote generator of haiku.
But an error he made,
And the program instead
Generates bad limericks. Gosh, how come ?"
	- AC

From: John L. <mo...@co...> - 2001-01-10 13:57:47

test

--
"It is well to remember, my son, that the entire population of the
universe, with one trifling exception, is composed of others."
	- John Andrew Holmes