#35 odd results with callgrind and KCachegrind

open
nobody
5
2008-05-08
2008-05-08
Brandon Ehle
No

While inspecting a profile, I noticed an area in the call map that didn't appear to be correct. I tracked this down to the way that callgrind appears to handle unique call stacks.

Following up with the reproduction case.

Discussion

  • Brandon Ehle
    2008-05-08

    Test program

     
    Attachments
  • Brandon Ehle
    2008-05-08


    File Added: callgrind.out.17563

     
  • Brandon Ehle
    2008-05-08

    Callgrind out from the test program

     
    Attachments
  • Logged In: YES
    user_id=621915
    Originator: NO

    I see no obvious bug in the attached result file.
    What did you expect to be different?

    Remember that in a call graph every function appears only
    once, and you get a superposition of all call paths. Callgrind
    collects the exact cost of functions and call arcs between
    functions. This is different from path profiling, where cost is
    attributed to call chains. However, Callgrind can do this too,
    e.g. when run with "--separate-callers=2"; then a node in the
    graph is actually a call chain.
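The effect of --separate-callers can be sketched in a few lines of Python (my illustration, not Callgrind's actual implementation): costs are keyed by the callee plus up to N caller frames, so one function becomes one node per distinct caller chain.

```python
from collections import defaultdict

def aggregate(events, depth):
    """events: (call_stack, cost) pairs. Key each cost by the last
    depth+1 frames, conceptually like --separate-callers=depth."""
    costs = defaultdict(int)
    for stack, cost in events:
        key = tuple(stack[-(depth + 1):])  # callee plus up to `depth` callers
        costs[key] += cost
    return dict(costs)

events = [
    (("main", "a", "work"), 10),
    (("main", "b", "work"), 30),
]

print(aggregate(events, 0))  # {('work',): 40} -- plain function profile
print(aggregate(events, 1))  # one node per caller chain of 'work'
```

With depth 0 both calls collapse into a single "work" node; with depth 1 the costs reached via "a" and via "b" stay separate.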

     
  • Logged In: NO

    I was more or less expecting --separate-callers=infinity to be the default, but with the samples then combined in the per-function view.

    The area-based profile is completely incorrect if you have complex call trees without supplying --separate-callers. It's incorrect enough that it might even be better to disable the area-based view entirely when separate callers are not supplied in the profile.

    The call graph view could probably go either way and make sense (which is why I was suggesting a toggle for it). The same could be said for the per-function statistics (callers / callees / top cost / etc...).

     
  • Logged In: YES
    user_id=621915
    Originator: NO

    > I was more or less expecting --separate-callers=infinity to be the
    > default, but then combine the samples in the per-function view.

    The exponentially growing number of different call paths forbids
    such behavior in general. Callgrind observes every path executed,
    not just some samples. E.g. just starting up Firefox already
    produces a few million call paths with --separate-callers=20. And
    as every event counter for every instruction is counted separately,
    you can imagine that this quickly runs out of memory on 32-bit
    architectures.

    So there is some need for adaptive data collection. The problem
    is that space for counters has to be allocated before one knows
    if a given counter is important for later analysis or not.
    My idea is to refine the context when it can be seen that some
    function/piece of a call path is executed often in the run of
    a program.
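The blow-up is easy to see with a toy calculation (illustrative numbers only, not measured from Callgrind):

```python
def num_chains(branching, depth):
    """Distinct call chains of length `depth` when every function
    can be reached from `branching` different callers."""
    return branching ** depth

# Even a modest fan-in of 3 explodes within 20 stack levels:
for depth in (5, 10, 20):
    print(depth, num_chains(3, depth))
# At depth 20 that is already ~3.5 billion potential chains, each
# needing its own set of per-instruction event counters.
```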

    > The area-based profile is completely incorrect if you have
    > complex call trees without supplying --separate-callers. It's
    > incorrect enough that it might even be better to completely
    > disable the area-based view when separate callers are not supplied
    > in the profile.

    It very much depends on the problem at hand. IMHO, instead of
    disabling the view outright, it is often quite useful for a
    quick overview, even if the data comes from heuristics.
    The same is true of cycle detection, which is why there is a
    button to switch it off: a lot of code can show spurious false
    cycles, making the call graph visualization totally useless.

    GProf has a similar problem: on the one hand, it only shows
    a butterfly statistic, and cycle detection cannot be switched
    off, so you would assume that it is quite strict about always
    showing correct data.
    But on the other hand, its data is based on time sampling, and
    inclusive costs are heuristically derived from call counts,
    which can produce totally wrong numbers. In addition, if a shared
    library is not instrumented, you do not even get a warning about
    the bogus measurement.
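The call-count heuristic mentioned above can be demonstrated with a small sketch (hypothetical numbers, not gprof's code): a callee's inclusive time is divided among its callers in proportion to call counts, which silently assumes every call costs the same.

```python
def split_by_call_counts(callee_total, call_counts):
    """gprof-style heuristic: attribute a callee's total time to its
    callers proportionally to how often each caller called it."""
    total_calls = sum(call_counts.values())
    return {caller: callee_total * n / total_calls
            for caller, n in call_counts.items()}

# 'work' takes 100ms overall; 'a' makes 99 cheap calls, 'b' makes one
# expensive call. Suppose the true split is 10ms vs 90ms:
print(split_by_call_counts(100.0, {"a": 99, "b": 1}))
# {'a': 99.0, 'b': 1.0} -- far from the true 10/90 split
```

Callgrind avoids this particular error because it measures the cost of each call arc directly instead of deriving it from counts.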

    So, in an ideal world, the user of a performance analysis tool or
    visualization is aware of the strengths and weaknesses of a given
    measurement method, and knows which values are exact measurements
    and which are derived from heuristics. IMHO a tool should never
    prohibit ways of usage through "intelligent" assumptions.

    > The call graph view could probably go either way and make sense
    > (which is why I was suggesting a toggle for it).
    > The same could be said for the per-function statistics (callers /
    > callees / top cost / etc...).

    The data about direct callers/callees (and inclusive cost) is always
    correct as Callgrind does measure it directly.

    Perhaps it would make sense to show hints about what values are
    direct measurements, and what are based on heuristics...