From: Jeremy F. <je...@go...> - 2002-10-10 19:27:23
|
At the moment, not many skins have client callbacks (only memcheck, I
think). However, if other skins follow memcheck's example of starting
callbacks at VG_USERREQ__FINAL_DUMMY_CLIENT_REQUEST + 1, they will all
end up with overlapping numbers.
This means that if a particular program under study has callbacks for
different skins, they will end up doing the wrong thing if run with the
wrong skin. It seems to me that there needs to be a systematic way for
skins to distinguish their callback numbers from each other. Maybe
encode some kind of skin ID into the top 16 bits of the request number?
J
|
|
From: Jeremy F. <je...@go...> - 2002-10-10 19:20:27
|
My skin needs to register baseblock helpers after parsing its command
line options (so that it knows what it needs to do). Unfortunately the
baseblock is currently being set up before calling SK_(post_clo_init).
This patch rearranges things.
J
|
|
From: Josef W. <Jos...@gm...> - 2002-10-10 10:31:32
|
On Thursday 10 October 2002 03:15, Jeremy Fitzhardinge wrote:
> On Mon, 2002-10-07 at 03:38, Josef Weidendorfer wrote:
> > So my question: Do you think this exact logging is useful?
>
> It seems to me that you're confusing two levels of abstraction: the
> general call-graph summary of program execution vs. an exact trace of
> program execution.

I'm still not convinced that I'm mixing these levels ;-) I simply have a more exact call graph than gprof as input: in addition, I already get the weights for the call arcs (directed edges). The gprof algorithm calculates estimates for these by taking accumulated self costs and distributing them to the graph edges according to call counts (using a virtual value, "cost per call"). Why should I throw away the more exact weights and recalculate an estimate?

Cycles need special handling, and this was the source of my problem, as I didn't handle them specially. This is the postprocessing I have to do in KCachegrind: cycles are in fact superfunctions. I have to handle the real functions inside them like the BBs of normal functions, and I'm fine again. That is, I throw away call arcs inside of cycles.

A nice side effect of logging call costs already in Cachegrind: I let the user create trace parts, i.e. he can decide when to make a dump (I know of at least one successful use of this feature from a user report...). That way I get call arcs with call count 0 (these are "active" calls). The gprof algorithm would calculate a weight of 0 for such an arc, which is obviously wrong...

> If I understand what you're saying correctly, you
> say that for each invocation of function A, you record the exact counter
> deltas for that invocation, and presumably the caller. You can
> therefore say that every time B calls A, the sum of the counter deltas
> is X. However, by only recording the immediate caller, you're already

X is the weight of the call arc from B to A in the general call graph.
> throwing information away, and I suspect this lack of information is
> causing the trouble.

If I visualize the children of a call A=>B, I use the cumulative cost of this call for the area size, and scale down the inner visualisation of the whole of A, as I don't have the call costs from B when called from A. This of course is a pure estimate...

If I were to log call costs depending not only on the immediate caller but also on the caller of the caller, I still don't think I'd be mixing abstraction levels: for each function, I simply have several call-graph nodes, depending on the caller. I don't have the self costs of these nodes, but they can be calculated from the edge weights.

> I think you may be better off by either just adopting the gprof
> algorithm (certainly for the purposes of visualising what's going on),
> and/or implementing a complete program trace for more detailed
> post-processing. You may find the paper "Encoding Program Executions" a
> useful source of ideas (http://citeseer.nj.nec.com/reiss01encoding.html).

Thanks. That could be a project for the future... You would need ways to keep the amount of traced data low, and that's another topic. And it has nothing to do with KCachegrind, as I can't see a use for a TreeMap there.

> > Note: All the hassle with recursion detection in my calltree patch would
> > not be needed if I didn't log exact call costs. But on the other side: I
> > already have everything in place now in my patch.
>
> How much effort do you put into detecting recursion? I still think
> you'd be better off just recording everything and leaving all analysis to
> post-processing.

For each active function (i.e. currently called somewhere on the stack), I have a CallEntry struct, put into a hash table with the function's start address as key. This way I can look up very quickly, on each call that happens, whether this call is a recursive one.
In a CallEntry, I have a recursion count for the function, and thus can remove the CallEntry when the recursion count reaches zero on function return.

The problem with this concept is returns from functions: I can't be sure these are done by executing a RET instruction. A program can do a longjmp (e.g. for C++ exception handling) and modify the stack pointer in arbitrary ways, thus returning implicitly from active functions.

So the above algorithm is a little fragile, and I need to keep my own call stack (an array of elements, each consisting of a pointer to a CallEntry struct and the real ESP at call time). At each CALL/RETURN instruction I keep the real stack and my stack in sync by looking at the stored/real ESP values and popping (returning from functions) as needed.

> > Regarding profiling performance: Even without a need for recursion
> > detection I think I will need hash table lookups...
>
> Is that the expensive part of doing the trace?

Don't know ;-) I'm adding around 20% to pure cachegrind.

> TreeMap doesn't need to be exact, it just needs to be an accurate guide
> for a programmer to tell where the time is going.

Yes. But it can make a difference if weights are distributed the wrong way among call arcs, and the TreeMap shows call weights, not self costs.

> > I'm wondering if I should abstract from the term "function" in
> > KCachegrind. It makes a lot of sense to handle BBs the same way (Jeremy:
> > I wondered why you do profiling at a BB granularity!): jumps between BBs
> > are in fact calls with implicit return on function return: if there are
> > loops in the function, we have lots of "cycles" (i.e. recursive BB calls)
> > for the BBs of a function. The "cumulative cost" of a BB is the cost until
> > the end of the function.
> > We can even extend this abstraction, not only
> > down (from functions to BBs), but also up to C++ classes and ELF objects:
> > it makes sense to see cumulative costs of all methods of a class, or
> > calls among classes; the same holds for ELF objects.
> > What do you think about this?
>
> Well, I've been putting some effort into my skin to try to distinguish
> function calls from other kinds of jumps, simply because recording

What's the difficulty?

> everything at the BB level generates too much information to be
> reasonably used, and despite the claims in the documentation, gprof
> generates almost useless output as a result (if you record the call and
> return of A calling B as separate edges, gprof shows it as A calls B who
> calls A).

If you handle BBs as separate nodes in a call graph of BBs, you get a LOT of cycles, and not really much beyond the coarser function call graph, I suppose?

> So, I'm now just recording calls, and also only instrumenting basic
> blocks in the text section (so that I don't get spurious edges calling a
> shared library, because it calls into the PLT, which then jumps into the
> function in the target library).

I already solved this by trapping jumps to "_dl_runtime_resolve" (the function first called from a PLT stub) and attributing all the cost of the PLT entry to the real function. After relocation, you have a jump from the PLT entry to the shared-library function: I have an alias hash which holds the relations between these stubs and shared-library functions...

> I think recording to the BB level makes a lot more sense with better
> tools to interpret the output, so it definitely makes sense for you to
> consider the idea.

A TreeMap visualisation of the BBs of a function could be cool... You would see all the loops happening, and the loop bodies as nested BBs. But you always need a correct mapping of BBs to source lines: so perhaps label the areas for BBs with the source code lines?!
> But don't get too elaborate. The idea of a profiling tool is to
> generate a comprehensible abstraction of program execution, not a
> complete encoding of what happened.

I'm sure not looking at exact tracing of program execution ;-)

Josef
|