From: Jeremy F. <je...@di...> - 2002-10-02 20:59:49
|
I've noticed that on my laptop the rdtsc calibration often fails with
"impossible MHz". I think this is because the TSC only advances when
there's something actually happening, as part of the power management.
I've attached a patch (against HEAD) to make it spin rather than sleep
for 20ms as part of the calibration. This makes the MHz estimate
accurate and solves the panics, but it does imply that the TSC is not a
reliable timebase for other time measurements, which is a larger problem
to solve.
J
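The change described above (spin rather than sleep during the 20ms calibration window) can be sketched roughly as follows. This is only a toy model, not the attached patch: Python cannot read the TSC, so `read_counter` is a hypothetical stand-in (here `time.perf_counter_ns`, which counts nanoseconds), and `spin_for` and `estimate_mhz` are illustrative names rather than Valgrind functions.

```python
import time


def spin_for(duration_s):
    """Busy-wait for duration_s seconds instead of sleeping.

    Sleeping lets power management idle the CPU, which (per the message
    above) may stop the TSC from advancing; spinning keeps the CPU busy.
    """
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        pass


def estimate_mhz(read_counter, duration_s=0.020):
    """Estimate the counter's frequency in MHz across a busy-wait interval."""
    t0 = time.perf_counter()
    c0 = read_counter()
    spin_for(duration_s)
    c1 = read_counter()
    t1 = time.perf_counter()
    return (c1 - c0) / (t1 - t0) / 1e6


if __name__ == "__main__":
    # Assumption: perf_counter_ns stands in for rdtsc in this sketch.
    print(f"{estimate_mhz(time.perf_counter_ns):.0f} MHz")
```

If the counter stopped advancing while the process slept, the same arithmetic would yield a far too small tick delta, which is consistent with the "impossible MHz" failure reported above.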
|
|
From: Jeremy F. <je...@go...> - 2002-10-02 20:53:04
|
On Wed, 2002-10-02 at 11:33, Josef Weidendorfer wrote:
> What do you think about making data structure cost centers, and
> relating them to the functions? Even much more information available ;-)
You mean storing which code touches what memory as part of the profile?
An excellent idea.
> More serious: With C++, you have constructors, and that's a nice way
> to name malloced areas. Together with some debug info, it should be easy
> to give out a list of all C++ classes, and read/write access numbers for each
> offset (or with annotated class definition from source).
> If the constructor is defined in a shared lib (as for all QT/KDE classes), you
> don't even need debug info for this: The object start address is always the
> first arg to the constructor, only question: how to detect the object size?
Well, it seems to me that the best name for an allocated block is some
portion of the stack backtrace leading to its allocation. If you want
to parse the mangled names, you can easily tell what the class is, and
group all class instances together. The object size should be easy - it
is the size of the allocated memory, surely?
> > I looked at the screenshots and decided it is very pretty, but I haven't
> > actually tried it out yet.
> >
> > I've actually done a first cut of a gprof skin now, which generates
> > correctly formed gprof gmon.out files. Unfortunately gprof itself is
> > too broken to deal with them (it wants a single histogram array for the
> > whole address space; I'm teaching it to work with a sparse array).
> Cool.
> Sorry, I couldn't follow the discussion.
> Can gmon.out hold other events than sample counts?
> Do you log calls, too?
Yes, it can record a histogram (in any units/event types you like, but
the standard tools only generate time histograms), entry counts for each
basic block and BB to BB control flow counts. gprof can display output
either on a function-by-function basis or at the basic block level
(including annotating source).
> I'm not quite sure I understand the benefit of creating gmon.out files.
> Are there other frontends for this format than gprof?
> (There's a KProf, but that "only" shows the info from gprof).
> I want to add a gmon.out reader for KCachegrind some day for
> quick browsing and TreeMap generation for gprof-profiled apps.
I'm doing this work to instrument a piece of software which lots of
developers are working on, most of whom are familiar with gprof.
I also think most developers would welcome a friendly UI like
kcachegrind, so I'm very keen to try it out soon - I just want to get
the basics working first.
> I think the cachegrind.out format is quite nice:
> Although I added a lot, I still can read the original cachegrind.out
> files without problem.
> Nick: Can you add some versioning to this format to distinguish
> some format variants?
> (I added a line "version: xxx").
The gprof format could have been nice, but it's somewhat broken. They
extended it to be a tagged format so you can add extra sections - but
forgot to include a length with each tag, so you can't parse the file
unless you understand all the tag types. A lost opportunity there.
> > I'm also going to extend the core slightly; I'd like to add some way of
> > extracting more information about the segments described in the SegInfo
> > list. I'd like to be able to walk the list so I can include a table of
> > mapped shared objects and what address range they cover.
> A problem here could be the dynamic behaviour of mappings...
Yes, but for now the code I'm instrumenting loads a lot of shared
libraries, but doesn't really unload or reload on the fly.
J
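The point about tagged formats needing a length per tag can be illustrated with a small sketch. The record layout below (1-byte tag, 4-byte little-endian length, then payload) and the tag numbers are hypothetical, chosen only for illustration; this is not the gmon.out layout. With a length field, a reader can skip sections it does not understand.

```python
import io
import struct

# Hypothetical record header: 1-byte tag, 4-byte little-endian payload length.
HEADER = struct.Struct("<BI")


def write_record(out, tag, payload):
    """Write one tagged, length-prefixed record."""
    out.write(HEADER.pack(tag, len(payload)))
    out.write(payload)


def read_records(inp, known_tags):
    """Yield (tag, payload) for known tags; skip unknown ones using the length."""
    while True:
        header = inp.read(HEADER.size)
        if not header:
            return
        if len(header) < HEADER.size:
            raise ValueError("truncated record header")
        tag, length = HEADER.unpack(header)
        payload = inp.read(length)
        if len(payload) < length:
            raise ValueError("truncated record payload")
        if tag in known_tags:
            yield tag, payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_record(buf, 1, b"histogram")
    write_record(buf, 99, b"section added by a newer writer")
    write_record(buf, 2, b"arcs")
    buf.seek(0)
    for tag, payload in read_records(buf, known_tags={1, 2}):
        print(tag, payload)
```

Without the length field, the reader would have to know the internal layout of tag 99 to find where tag 2 begins, which is the problem described above.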
|
|
From: Jeremy F. <je...@go...> - 2002-10-02 20:41:31
|
On Wed, 2002-10-02 at 12:25, Nicholas Nethercote wrote:
> On 2 Oct 2002, Jeremy Fitzhardinge wrote:
> > At present I'm using a single global, which means that I'll be creating
> > spurious edges when there's context switches between threads. The
> > obvious place to store the information is in the baseBlock, and have it
> > copied to/from the thread state on context switch. I didn't see a
> > mechanism for allocating variable space in the baseBlock, nor a way of
> > conveniently addressing baseBlock offsets directly. Should I add it?
> > Or some other way of storing per-thread information?
> Cachegrind stores variable-sized basic-block information. It is pretty
> low-level and dirty: it allocates a flat array in which cost centres of
> different sizes are all packed in together, with different cost centre
> types distinguished by a tag. The basic blocks' arrays are stored in a
> hash table.
Yes, I've got that. I have a hash which keeps per-basic-block
information. But what I also want is a hash which keeps a count of
control flow edges between basic blocks. That is, the key of the hash
is not orig_eip, but the tuple (from_bb, to_bb). The way I maintain
this is by inserting an assignment to a global variable "prev_bb" (ie,
code to do prev_bb = cur_eip) just before each JMP instruction
(conditional or otherwise). Then, at the start of each basic block, I
update the edge count structure by looking up (and possibly creating)
the tuple (prev_bb, cur_eip).
The trouble with this scheme is that if the dispatch loop decides that
it is time to switch threads, prev_bb will have been set by the previous
thread, and therefore the control flow graph will have spurious edges
which represent context switches. While this isn't completely
undesirable, it isn't what I want to measure at the moment.
To solve this, prev_bb needs to be a per-thread value rather than a
global one. It seems to me that a clean way of solving this is to
introduce a mechanism which is analogous to VG_(register_*_helper) which
allows a skin to allocate space in the baseBlock, with a change to the
scheduler to save and restore the values on context switch and some way
to generate uInstr code to load and store them.
J
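The difference between a global prev_bb and a per-thread one can be seen in a toy model of the scheme described in the message above. This is a sketch in Python, not skin code: `EdgeProfiler`, `replay` and the addresses in `TRACE` are hypothetical, and the "scheduler" is just an interleaved list of (thread id, basic block address) pairs.

```python
from collections import Counter


class EdgeProfiler:
    """Toy model of counting (from_bb, to_bb) control flow edges.

    per_thread=False models a single global prev_bb;
    per_thread=True models prev_bb being saved and restored per thread.
    """

    def __init__(self, per_thread):
        self.per_thread = per_thread
        self.edges = Counter()
        self._prev = {}  # thread id (or None for the global) -> prev_bb

    def _slot(self, tid):
        return tid if self.per_thread else None

    def enter_bb(self, tid, cur_eip):
        """Run at the start of each basic block: count (prev_bb, cur_eip)."""
        prev_bb = self._prev.get(self._slot(tid))
        if prev_bb is not None:
            self.edges[(prev_bb, cur_eip)] += 1

    def before_jump(self, tid, cur_eip):
        """Run just before each JMP: prev_bb = cur_eip."""
        self._prev[self._slot(tid)] = cur_eip


def replay(trace, per_thread):
    """Feed a sequence of (thread id, basic block) executions to a profiler."""
    profiler = EdgeProfiler(per_thread)
    for tid, bb in trace:
        profiler.enter_bb(tid, bb)
        profiler.before_jump(tid, bb)
    return profiler.edges


# Two threads interleaved by the scheduler: thread 1 runs 0x100 then 0x200,
# thread 2 runs 0x900 then 0xA00.
TRACE = [(1, 0x100), (2, 0x900), (1, 0x200), (2, 0xA00)]

if __name__ == "__main__":
    for mode in (False, True):
        edges = replay(TRACE, per_thread=mode)
        print(mode, sorted((hex(a), hex(b)) for a, b in edges))
```

With the global variable, every recorded edge in this trace crosses a context switch; with per-thread storage only the two real edges remain.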
|
|
From: Nicholas N. <nj...@ca...> - 2002-10-02 19:25:22
|
On 2 Oct 2002, Jeremy Fitzhardinge wrote:
> At present I'm using a single global, which means that I'll be creating
> spurious edges when there's context switches between threads. The
> obvious place to store the information is in the baseBlock, and have it
> copied to/from the thread state on context switch. I didn't see a
> mechanism for allocating variable space in the baseBlock, nor a way of
> conveniently addressing baseBlock offsets directly. Should I add it?
> Or some other way of storing per-thread information?
Cachegrind stores variable-sized basic-block information. It is pretty
low-level and dirty: it allocates a flat array in which cost centres of
different sizes are all packed in together, with different cost centre
types distinguished by a tag. The basic blocks' arrays are stored in a
hash table.
Josef's patch uses the same basic mechanisms, but does more complicated
stuff with the hash tables.
So there's not really any built-in mechanism, but you can certainly
allocate yourself some space for each basic block in SK_(instrument). As
for addressing baseBlock offsets directly, I'm not sure what you mean --
the orig_addr is passed in to SK_(instrument); is that not enough? I'm
also not sure how threads ("per-thread information") relate to this.
N
|
|
From: Josef W. <Jos...@gm...> - 2002-10-02 18:32:36
|
Hi,
just want to say hello to the Valgrind Developers mailing list...
On Wednesday 02 October 2002 18:04, Jeremy Fitzhardinge wrote:
> On Wed, 2002-10-02 at 04:31, Nicholas Nethercote wrote:
> As for gprof stuff, have you seen Josef Wiedendorfer's Cachegrind patch
> and KCachegrind visualisation tool?
> (www.weidendorfers.de/kcachegrind/) It contains loads of that sort of
> thing, more than my brain can handle in one sitting :)
What do you think about making data structure cost centers, and
relating them to the functions? Even much more information available ;-)
More serious: With C++, you have constructors, and that's a nice way
to name malloced areas. Together with some debug info, it should be easy
to give out a list of all C++ classes, and read/write access numbers for each
offset (or with annotated class definition from source).
If the constructor is defined in a shared lib (as for all QT/KDE classes), you
don't even need debug info for this: The object start address is always the
first arg to the constructor, only question: how to detect the object size?
> I looked at the screenshots and decided it is very pretty, but I haven't
> actually tried it out yet.
>
> I've actually done a first cut of a gprof skin now, which generates
> correctly formed gprof gmon.out files. Unfortunately gprof itself is
> too broken to deal with them (it wants a single histogram array for the
> whole address space; I'm teaching it to work with a sparse array).
Cool.
Sorry, I couldn't follow the discussion.
Can gmon.out hold other events than sample counts?
Do you log calls, too?
I'm not quite sure I understand the benefit of creating gmon.out files.
Are there other frontends for this format than gprof?
(There's a KProf, but that "only" shows the info from gprof).
I want to add a gmon.out reader for KCachegrind some day for
quick browsing and TreeMap generation for gprof-profiled apps.
I think the cachegrind.out format is quite nice:
Although I added a lot, I still can read the original cachegrind.out
files without problem.
Nick: Can you add some versioning to this format to distinguish
some format variants?
(I added a line "version: xxx").
> I'm also going to extend the core slightly; I'd like to add some way of
> extracting more information about the segments described in the SegInfo
> list. I'd like to be able to walk the list so I can include a table of
> mapped shared objects and what address range they cover.
A problem here could be the dynamic behaviour of mappings...
>
> J
>
J :-)
|
|
From: Jeremy F. <je...@go...> - 2002-10-02 16:24:33
|
On Wed, 2002-10-02 at 04:23, Nicholas Nethercote wrote:
> Best way I can think of doing it, which only requires skin changes rather
> than core changes, is this: using the `extended_UCode' need, add a new
> UInstr PRE_JCC, which gets inserted by SK_(instrument) before conditional
> JMPs, evaluates the condition, and calls a C function (or whatever) if
> it's true. This would duplicate the condition evaluation but that
> shouldn't matter since they're trivial (just checking an EFLAGS bit I
> think).
> It's a bit nasty that something as simple as this requires a new UInstr...
Well, I've actually come up with a simpler approach. Since what I want
is to get the (from, to) pair for a BB graph edge, I'm simply updating a
global (bb_from) with %EIP before each jump, and then create/update the
edge at the entry to each BB (bb_from, %EIP).
At present I'm using a single global, which means that I'll be creating
spurious edges when there's context switches between threads. The
obvious place to store the information is in the baseBlock, and have it
copied to/from the thread state on context switch. I didn't see a
mechanism for allocating variable space in the baseBlock, nor a way of
conveniently addressing baseBlock offsets directly. Should I add it?
Or some other way of storing per-thread information?
J
|
|
From: Jeremy F. <je...@go...> - 2002-10-02 16:05:05
|
On Wed, 2002-10-02 at 04:31, Nicholas Nethercote wrote:
> As for gprof stuff, have you seen Josef Wiedendorfer's Cachegrind patch
> and KCachegrind visualisation tool? (www.weidendorfers.de/kcachegrind/)
> It contains loads of that sort of thing, more than my brain can handle in
> one sitting :)
I looked at the screenshots and decided it is very pretty, but I haven't
actually tried it out yet.
I've actually done a first cut of a gprof skin now, which generates
correctly formed gprof gmon.out files. Unfortunately gprof itself is
too broken to deal with them (it wants a single histogram array for the
whole address space; I'm teaching it to work with a sparse array).
I'm also going to extend the core slightly; I'd like to add some way of
extracting more information about the segments described in the SegInfo
list. I'd like to be able to walk the list so I can include a table of
mapped shared objects and what address range they cover.
J
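The sparse-histogram idea mentioned above can be sketched as follows. This is a minimal illustration, not how gprof or the skin stores its data: `SparseHistogram`, `BUCKET_SIZE` and the example addresses are hypothetical. Only buckets that were actually hit are stored, and adjacent buckets are grouped into runs so that an executable and distant shared objects do not force one array spanning the whole address space.

```python
from collections import defaultdict

BUCKET_SIZE = 4  # hypothetical: bytes of address space per histogram bucket


class SparseHistogram:
    """Histogram over addresses that stores only non-empty buckets."""

    def __init__(self, bucket_size=BUCKET_SIZE):
        self.bucket_size = bucket_size
        self.counts = defaultdict(int)  # bucket index -> count

    def record(self, addr, n=1):
        """Add n events at address addr."""
        self.counts[addr // self.bucket_size] += n

    def ranges(self):
        """Return sorted (start_addr, [counts...]) runs of adjacent buckets."""
        runs = []
        for bucket in sorted(self.counts):
            if runs and bucket == runs[-1][0] + len(runs[-1][1]):
                # Bucket directly follows the current run: extend it.
                runs[-1][1].append(self.counts[bucket])
            else:
                runs.append((bucket, [self.counts[bucket]]))
        return [(start * self.bucket_size, counts) for start, counts in runs]


if __name__ == "__main__":
    hist = SparseHistogram()
    hist.record(0x08048000)     # e.g. code in the main executable
    hist.record(0x08048004, 2)
    hist.record(0x40000000)     # e.g. code in a shared object far away
    for start, counts in hist.ranges():
        print(hex(start), counts)
```

Each run could then be written out as its own small dense array covering just that address range.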
|
|
From: Nicholas N. <nj...@ca...> - 2002-10-02 11:23:31
|
On 30 Sep 2002, Jeremy Fitzhardinge wrote:
> I'm writing a skin to generate gprof-like output, so I need to see all
> the edges in the control flow graph. In particular, I'd like to insert
> some instrumentation code which is run IFF a conditional branch is
> taken.
>
> I see a few options:
>     * something to properly represent uInstr sequences with
>       conditionals within the ucode for one real instruction (ie,
>       some way of representing jumps to real addresses rather than
>       simulated addresses). Sounds messy.
>     * Intercept the jump target address and generate a completely
>       new piece of code at some place within the simulated address
>       space. Ugly.
>     * Introduce a new exceptional value for ebp when it is passed
>       back into the dispatcher to trigger a call into the skin.
>       Would need some way to attach some kind of argument values for
>       the call (encode in %edx?). Seems like the least nasty.
>
> Any opinions?
Best way I can think of doing it, which only requires skin changes rather
than core changes, is this: using the `extended_UCode' need, add a new
UInstr PRE_JCC, which gets inserted by SK_(instrument) before conditional
JMPs, evaluates the condition, and calls a C function (or whatever) if
it's true. This would duplicate the condition evaluation but that
shouldn't matter since they're trivial (just checking an EFLAGS bit I
think).
It's a bit nasty that something as simple as this requires a new UInstr...
Oh, and apologies for the delay in replying.
N
|