|
From: Van S. <Van...@jp...> - 2014-08-16 01:19:34
|
It seems that cachegrind could be extended to do complete precise profiling. This would be more accurate than the traditional statistical profiling. Is it feasible? I don't care if it's slower than watching paint dry. I wouldn't be doing it every day. |
|
From: Philippe W. <phi...@sk...> - 2014-08-16 20:49:37
|
On Fri, 2014-08-15 at 18:19 -0700, Van Snyder wrote:
> It seems that cachegrind could be extended to do complete precise
> profiling. This would be more accurate than the traditional statistical
> profiling. Is it feasible?

cachegrind (and callgrind) are not statistical profilers. Among other things, they can count each instruction executed.

Philippe |
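The distinction drawn here can be illustrated with a toy sketch. It is not Valgrind code: the trace, the function names in it, and the sampling interval are all made up for illustration. An exact profiler increments a counter for every executed instruction, while a statistical profiler only looks at where the program is at jittered intervals.

```python
import random
from collections import Counter


def exact_profile(trace):
    """Count every executed instruction, as an instrumenting tool would."""
    return Counter(trace)


def sampled_profile(trace, interval, rng):
    """Inspect the trace only at randomly jittered points, as a sampling profiler would."""
    counts = Counter()
    i = rng.randrange(interval)
    while i < len(trace):
        counts[trace[i]] += 1
        # Step is between 1 and 2*interval-1, so it averages out to `interval`.
        i += 1 + rng.randrange(2 * interval - 1)
    return counts


# Hypothetical trace: each entry names the function owning one executed instruction.
trace = ["hot_loop"] * 9000 + ["setup"] * 990 + ["rare_path"] * 10

exact = exact_profile(trace)
sampled = sampled_profile(trace, interval=500, rng=random.Random(0))
```

The exact profile attributes all 10 instructions to `rare_path`; the sampled one sees only a few dozen points in total and may miss short-lived code entirely.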
|
From: Julian S. <js...@ac...> - 2014-08-17 07:47:56
|
On 08/16/2014 03:19 AM, Van Snyder wrote:
> It seems that cachegrind could be extended to do complete precise
> profiling.

What do you mean by "complete precise profiling"? Can you clarify?

J |
|
From: Philippe W. <phi...@sk...> - 2014-08-17 11:06:15
|
On Sun, 2014-08-17 at 02:47 -0700, Van Snyder wrote:
> On Sun, 2014-08-17 at 09:48 +0200, Julian Seward wrote:
> > On 08/16/2014 03:19 AM, Van Snyder wrote:
> > > It seems that cachegrind could be extended to do complete precise
> > > profiling.
> >
> > What do you mean by "complete precise profiling"? Can you clarify?
>
> Trace execution of every instruction and account for its execution time,
> including the analyses that cachegrind does to model caches.
>
> Produce two reports, one sorted in decreasing order of the time spent on
> each line of source, and another sorted in decreasing order of the time
> spent in each basic block, at least basic blocks as they are discovered
> by tracing execution.
>
> I used such a tool to productive advantage thirty years ago. It didn't
> matter that the cost was eighty times the cost of running the program on
> its own.
>
> Maybe cachegrind already does this, but the description isn't clear.

See the cachegrind user manual, which says e.g.
"... I cache reads (Ir, which equals the number of instructions ..."
http://www.valgrind.org/docs/manual/cg-manual.html#cg-manual.overview
For callgrind, see e.g. the description of --cache-sim=<yes|no>

What is not clear is how to compute exact execution time from the various stats (Ir, I1mr, ILmr, Dr, D1mr, ...), or, perhaps more usefully than exact execution time, the correct relative execution time of each program piece.

Philippe |
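One rough way to turn such counters into a relative cost is a weighted sum, sketched below. This is only an illustration under loud assumptions: the miss penalties of 10 and 100 cycles are placeholders rather than measured values, the `Counts` type and the "parse"/"solve" figures are hypothetical, and a weighted sum ignores pipelining, out-of-order execution, and everything else a real machine does. It gives at best an estimate of relative cost, not an exact execution time.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    ir: int       # instructions executed (Ir)
    l1_miss: int  # first-level misses: I1mr + D1mr + D1mw
    ll_miss: int  # last-level misses: ILmr + DLmr + DLmw


def estimated_cycles(c, l1_penalty=10, ll_penalty=100):
    """Weighted sum of event counts; the penalties are assumed, not measured."""
    return c.ir + l1_penalty * c.l1_miss + ll_penalty * c.ll_miss


def relative_cost(pieces):
    """Share of the total estimate attributed to each program piece."""
    total = sum(estimated_cycles(c) for c in pieces.values())
    return {name: estimated_cycles(c) / total for name, c in pieces.items()}


# Hypothetical per-function totals, of the kind read off a profile summary.
pieces = {
    "parse": Counts(ir=1_000_000, l1_miss=2_000, ll_miss=100),
    "solve": Counts(ir=3_000_000, l1_miss=50_000, ll_miss=5_000),
}
shares = relative_cost(pieces)
```

With these made-up numbers, "solve" executes three times as many instructions as "parse" but accounts for nearly four times the estimated cost, because its cache misses weigh in.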
|
From: Van S. <van...@jp...> - 2014-08-17 09:47:27
|
On Sun, 2014-08-17 at 09:48 +0200, Julian Seward wrote:
> On 08/16/2014 03:19 AM, Van Snyder wrote:
> > It seems that cachegrind could be extended to do complete precise
> > profiling.
>
> What do you mean by "complete precise profiling"? Can you clarify?

Trace execution of every instruction and account for its execution time, including the analyses that cachegrind does to model caches.

Produce two reports, one sorted in decreasing order of the time spent on each line of source, and another sorted in decreasing order of the time spent in each basic block, at least basic blocks as they are discovered by tracing execution.

I used such a tool to productive advantage thirty years ago. It didn't matter that the cost was eighty times the cost of running the program on its own.

Maybe cachegrind already does this, but the description isn't clear.

Statistical profilers such as gprof put random numbers into a timer, and analyze where the interrupts occur. This isn't usually as accurate. |
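The two reports described above amount to sorting accumulated costs in decreasing order. A minimal sketch, assuming the per-location costs have already been collected; the `file:line` keys and the numbers are hypothetical. (cg_annotate, shipped with Valgrind, produces per-function and per-line event counts from cachegrind output.)

```python
def report(costs, top=None):
    """Return (location, cost, percent) rows in decreasing order of cost."""
    total = sum(costs.values())
    if total == 0:
        return []
    rows = sorted(costs.items(), key=lambda kv: kv[1], reverse=True)
    return [(loc, cost, 100.0 * cost / total) for loc, cost in rows[:top]]


# Hypothetical cost per source line, e.g. instruction counts.
line_costs = {
    "solver.f90:120": 700,
    "solver.f90:121": 250,
    "main.f90:14": 50,
}

for loc, cost, pct in report(line_costs):
    print(f"{cost:>8}  {pct:5.1f}%  {loc}")
```

The same function works unchanged for basic blocks if the dictionary is keyed by block address instead of source line.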
|
From: Howard C. <hy...@sy...> - 2014-08-17 15:16:26
|
Philippe Waroquiers wrote:
> What is not clear is how to compute exact execution time from the
> various stats (Ir, I1mr, Ilmr, Dr, D1mr, ...).
> (or maybe rather than exact execution time, correct relative execution
> time of each program piece).

You would need a cycle-accurate machine model, which valgrind is not.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/ |
|
From: Francis G. <fra...@gm...> - 2014-08-17 23:02:11
|
On 17 August 2014 at 11:21, "Howard Chu" <hy...@sy...> wrote:
> Philippe Waroquiers wrote:
> > What is not clear is how to compute exact execution time from the
> > various stats (Ir, I1mr, Ilmr, Dr, D1mr, ...).
> > (or maybe rather than exact execution time, correct relative execution
> > time of each program piece).
>
> You would need a cycle-accurate machine model, which valgrind is not.

You may be interested in marss86, a cycle-accurate x86 simulator.

Cheers,

Francis |