Re: [Valgrind-developers] suggestion for cachegrind/callgrind: finer-grained profiling


> What about the following approach: you expect that some parameter settings
> of myfunc() influence its runtime. Callgrind would allow you to embed
> information for bins of this parameter setting into the function name.
> E.g. for myfunc(int a), the profile results would show you two functions:
>
>  "myfunc:a>=5" : 90 calls, 1 billion instructions executed
>  "myfunc:a<5"  :  5 calls, 2 million instructions executed

Thanks, that's very helpful, and is actually another extension I
wanted to suggest :)
But I want to bin invocations by their actual execution cost (in Ir, Dr,
etc.), which is known only after the function has run, and then print the
arguments of a few representative invocations falling into a given _cost_ bin.

Here's a simpler request that would let me manually simulate what I want to do:

Is it possible to create a client request that returns the current
value of a particular statistic (Ir, Dr, ...) gathered by
cachegrind/callgrind, just as I can get the "processor time spent so far"
using standard function calls?
(I know I can dump this to a file, but that's too slow.  Can I get
this with a simple client request/API function call?)

I could then do with that statistic what I now do with runtime (which
isn't too reliable): both make histograms and print the args for
representative invocations falling into a given bin.
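
A minimal sketch of the manual simulation described above, assuming such a counter-reading call existed. `read_cost()` is a hypothetical stand-in (no such Valgrind client request is known to exist); here it just reads a variable that the caller advances. `CostHistogram` and `InvocationProbe` are likewise illustrative names. Invocations are binned by floor(log2(cost)), and each bin keeps its call count, total cost, and the first few invocation numbers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for the requested client request: returns the value
// of one event counter (e.g. Ir) accumulated so far. Not a real Valgrind call.
static std::uint64_t g_fake_counter = 0;
static std::uint64_t read_cost() { return g_fake_counter; }

struct CostBin {
    std::uint64_t calls = 0;
    std::uint64_t total_cost = 0;
    std::vector<std::uint64_t> sample_ids;  // first few invocation numbers
};

class CostHistogram {
public:
    explicit CostHistogram(std::size_t max_samples = 3)
        : max_samples_(max_samples) {}

    // Bin index is floor(log2(cost)); costs 0 and 1 both land in bin 0.
    static unsigned bin_of(std::uint64_t cost) {
        unsigned b = 0;
        while (cost > 1) { cost >>= 1; ++b; }
        return b;
    }

    void record(std::uint64_t invocation_id, std::uint64_t cost) {
        CostBin& bin = bins_[bin_of(cost)];
        ++bin.calls;
        bin.total_cost += cost;
        if (bin.sample_ids.size() < max_samples_)
            bin.sample_ids.push_back(invocation_id);
    }

    const std::map<unsigned, CostBin>& bins() const { return bins_; }

private:
    std::size_t max_samples_;
    std::map<unsigned, CostBin> bins_;
};

// Place one of these at the top of the profiled function: it numbers the
// invocation and records the counter delta when the function returns.
class InvocationProbe {
public:
    InvocationProbe(CostHistogram& hist, std::uint64_t& call_counter)
        : hist_(hist), id_(++call_counter), start_(read_cost()) {}
    ~InvocationProbe() { hist_.record(id_, read_cost() - start_); }

private:
    CostHistogram& hist_;
    std::uint64_t id_;
    std::uint64_t start_;
};
```

Inside myFunc() this would be a function-local `static std::uint64_t calls;` plus one `InvocationProbe probe(hist, calls);` line. Because the probe measures a delta, the cost is inclusive of callees.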

Thanks a lot,

ilya

On Tue, Sep 20, 2011 at 8:54 AM, Josef Weidendorfer
<Jos...@gm...> wrote:
> On Monday 19 September 2011, ilya shlyakhter wrote:
>> I've been doing this by manually timing myFunc(), building a
>> histogram of its runtimes, and finding which bins dominate the
>> histogram; then looking at the inputs of some representative
>> invocations that fall in these runtime bins, by: putting a static
>> counter inside the function so you know when you're on its N'th
>> invocation; printing the counter value for the first few invocations
>> whose runtime falls into the bins of interest; then adding code at the
>> start of the function, "if counter == one of the printed counter
>> values, print the function arguments".
>> Building this into the profiler would let me 1) avoid doing the above
>> for each function I want to optimize, and 2) do this
>> not just for runtime but for all other statistics gathered by
>> cachegrind/callgrind.
>>
>> Does this explanation help, or should I write some more detailed examples?
>
> I think I got the idea of this 2-stage procedure.
>
> What about the following approach: you expect that some parameter settings
> of myfunc() influence its runtime. Callgrind would allow you to embed
> information for bins of this parameter setting into the function name.
> E.g. for myfunc(int a), the profile results would show you two functions:
>
>  "myfunc:a>=5" : 90 calls, 1 billion instructions executed
>  "myfunc:a<5"  :  5 calls, 2 million instructions executed
>
> I think in the end you should be able to reach the same goal you get with
> your 2-stage approach.
>
> We did such an extension in the past, see
>        http://www.lrr.in.tum.de/~kuestner/proper09.pdf
>
> Attached is a patch for that (not sure it applies against current SVN or
> 3.6.1), with the following short doc:
>
>  Format is
>   --separate-par=<fn_pattern>':'<intpar_num>[':'<intval>(','<intval>)*]
>  If no value is given, separate by every different value
>  Otherwise, this allows for multiple buckets:
>   x < intval1, intval1 <= x < intval2, ... , intvalX <= x
>  with x being the <intpar_num>-th int parameter of functions
>  matching fn_pattern.
>
>  Multiple seppar requests for same function are allowed
>
> It never made it into a release, as the usage was too low-level and
> the feature is platform-dependent (only works with 32bit x86 for now!):
> you have to know the stack layout. But it proved to be quite useful.
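
To make the bucket rule concrete, here is a small sketch of how a parameter value could be mapped to a bucket and a function-name suffix. It is not taken from the patch: `bucket_index` and `bucket_label` are hypothetical helpers, the label format for middle buckets and for the "every different value" case is invented for illustration, and the half-open bucket boundaries are an assumption chosen to match the "myfunc:a>=5" / "myfunc:a<5" example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// For ascending thresholds v[0..n-1], returns
//   0 when x < v[0], i when v[i-1] <= x < v[i], n when v[n-1] <= x.
std::size_t bucket_index(const std::vector<int>& thresholds, int x) {
    // upper_bound finds the first threshold strictly greater than x.
    return static_cast<std::size_t>(
        std::upper_bound(thresholds.begin(), thresholds.end(), x) -
        thresholds.begin());
}

// Builds a name such as "myfunc:a>=5". With no thresholds, every distinct
// value gets its own name (label format is illustrative only).
std::string bucket_label(const std::string& fn, const std::string& par,
                         const std::vector<int>& thresholds, int x) {
    if (thresholds.empty())
        return fn + ":" + par + "=" + std::to_string(x);
    const std::size_t i = bucket_index(thresholds, x);
    if (i == 0)
        return fn + ":" + par + "<" + std::to_string(thresholds.front());
    if (i == thresholds.size())
        return fn + ":" + par + ">=" + std::to_string(thresholds.back());
    return fn + ":" + std::to_string(thresholds[i - 1]) + "<=" + par + "<" +
           std::to_string(thresholds[i]);
}
```

With a single threshold of 5, a call with a = 7 is accounted under "myfunc:a>=5" and a call with a = 3 under "myfunc:a<5".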
>
>> >If writing millions of files is too expensive,
>> it would be too expensive in my case.
>>
>> >easier to think about a way to pass measurements to a script
>> >in a more light-weight way (e.g. via pipes). You should
>> >be able to implement your suggestions in your own script then.
>>
>> thanks, that might work for making histograms (esp. if I limit
>> data-gathering to that one function).
>> but it wouldn't work for printing the arguments of some representative
>> function invocations falling into a given range of runtimes.
>
> If one sends the measurement data for every function invocation via
> pipe to an external process, one can also send a string identifying
> the parameters with every invocation.
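
One possible shape for such a per-invocation record, sketched under assumptions: the tab-separated line format, the `InvocationRecord` struct, and both helper functions are hypothetical and not part of callgrind. The writer side would push one line per call into the pipe; the external process reads lines back and can build histograms or pick representative invocations from the attached parameter string.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

struct InvocationRecord {
    std::string function;
    std::uint64_t invocation_id;
    std::uint64_t cost;
    std::string params;  // free-form; must not contain tabs or newlines
};

// Emits one line: function <TAB> id <TAB> cost <TAB> params
void write_record(std::ostream& out, const InvocationRecord& r) {
    out << r.function << '\t' << r.invocation_id << '\t' << r.cost << '\t'
        << r.params << '\n';
}

// Reads one line written by write_record. Returns false at end of input or
// when a field is missing; std::stoull throws on non-numeric id/cost fields.
bool read_record(std::istream& in, InvocationRecord& r) {
    std::string line;
    if (!std::getline(in, line)) return false;
    std::istringstream fields(line);
    std::string id, cost;
    if (!std::getline(fields, r.function, '\t')) return false;
    if (!std::getline(fields, id, '\t')) return false;
    if (!std::getline(fields, cost, '\t')) return false;
    r.params.clear();
    std::getline(fields, r.params);  // rest of the line; may be empty
    r.invocation_id = std::stoull(id);
    r.cost = std::stoull(cost);
    return true;
}
```

Any `std::ostream` works for the writer, so the same code can target a file or a stream wrapped around a pipe.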
>
> Josef
>
>>
>> ilya
>>
>> On Mon, Sep 19, 2011 at 4:24 PM, Josef Weidendorfer
>> <Jos...@gm...> wrote:
>> > Hi Ilya,
>> >
>> > On Monday 19 September 2011, ilya shlyakhter wrote:
>> >> how hard would it be to implement the following extension to
>> >> cachegrind/callgrind:
>> >> ...
>> >
>> > this list (which I do not understand completely) suggests that
>> > you have some concrete use case in mind. What do you want to
>> > accomplish in the end?
>> >
>> >> right now, when reporting the cost of a function, the cost of all
>> >> invocations is aggregated together.
>> >> (callgrind separates the invocations by caller, but that's as
>> >> fine-grained as it goes).
>> >
>> > Hmm.. you can also dump counters to a file anytime you want,
>> > e.g. with "--dump-after=<func>". You then can post-process
>> > the files, and do your own statistics on it.
>> > For how to parse the files in Perl, see "callgrind_annotate".
>> >
>> > If writing millions of files is too expensive, it probably is
>> > easier to think about a way to pass measurements to a script
>> > in a more light-weight way (e.g. via pipes). You should
>> > be able to implement your suggestions in your own script then.
>> >
>> >> the proposed extension would do the following:
>> >>  - for each function (or for requested functions), for a chosen
>> >> statistic (Ir, DLmr, etc), produce a _histogram_ of invocation costs.
>> >>    in each bin, keep the # of invocations and their total cost.
>> >
>> >> an additional extension would let the user print their own debug
>> >> information for representative
>> >> invocations falling into a given bin.  this would work only for
>> >> completely deterministic user programs.
>> >>   - there would be a valgrind client request, VALGRIND_SHOULD_PRINT,
>> >> which the user would
>> >>     add to their function to tell them whether to print debug
>> >> information for this invocation:
>> >>
>> >>  void myFunc( ComplexStructure *arg ) {
>> >>     if( VALGRIND_SHOULD_PRINT )
>> >>          arg->print();
>> >>     ...
>> >>  }
>> >>   - there would be an option to cachegrind/callgrind to record and
>> >> save in an output file, for a given function and
>> >>   a given range of costs, the "invocation ids" of some small number
>> >> of invocations of that function in that cost range.
>> >>   "invocation id" of a function is simply the order number of its
>> >> invocation (e.g. the 134701st invocation of this function).
>> >>
>> >>   - there would be another option to cachegrind/callgrind to read the
>> >> file recorded above and to have VALGRIND_SHOULD_PRINT return 1
>> >>   for the recorded invocations, and 0 for all others.
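
The replay half of this proposal can be approximated in user code today. The sketch below is only an illustration: `ReplaySelector` is a hypothetical helper, not a Valgrind facility, and it assumes the recorded invocation ids are available as whitespace-separated numbers. It also relies on the program being deterministic, so that the Nth invocation is the same call in both runs.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <set>
#include <sstream>

class ReplaySelector {
public:
    // Reads the invocation ids recorded by an earlier profiling run.
    explicit ReplaySelector(std::istream& ids) {
        std::uint64_t id;
        while (ids >> id) selected_.insert(id);
    }

    // Call exactly once per invocation of the profiled function. Returns
    // true when the current invocation number was recorded earlier.
    bool should_print() { return selected_.count(++invocation_) != 0; }

    std::uint64_t invocation() const { return invocation_; }

private:
    std::set<std::uint64_t> selected_;
    std::uint64_t invocation_ = 0;
};
```

In myFunc() this would replace the proposed macro with something like `if (selector.should_print()) arg->print();`, where `selector` is one long-lived object per profiled function.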
>> >
>> > I don't understand the benefit. How would you use that feature?
>> >
>> >> this would allow finer-grained profiling than is currently possible.
>> >> how hard would this be to add?
>> >
>> > If you come up with a patch, you still need to prove that it is worth
>> > merging and maintaining. It seems easier to me to try to come up
>> > with a general solution which allows users to implement their own
>> > post-processing/statistics, as mentioned above.
>> >
>> > E.g. for adding histograms, you not only need to change cachegrind/callgrind,
>> > but also extend the format and parsers, such as {cg,callgrind}_annotate,
>> > and the KCachegrind GUI.
>> >
>> > That said, it would be cool to have histograms.
>> >
>> > Josef
>> >
>> >>
>> >> thanks,
>> >>
>> >> ilya
>> >>
>> >> _______________________________________________
>> >> Valgrind-developers mailing list
>> >> Val...@li...
>> >> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
>> >>
>> >
>> >
>> >
>>
>>
>
>