Thread: [Valgrind-users] The syntax of jump and jcnd in callgrind profile files.

From: Dan L. <da...@su...> - 2014-08-22 18:13:02

Hi,

I'm hoping to use the callgrind profile file format for an interpreter
I'm working on so I can visualise various parameters using
KCachegrind.

So far this has looked very promising but I've run into an issue with
jumps (jump= and jcnd=).

The documentation [1] doesn't define these very clearly. The grammar
for ``JumpSpecification`` doesn't seem to be specified properly.

Later on the docs note the following.

```

* jump=count target position [Callgrind]

Unconditional jump, executed count times, to the given target position.

* jcnd=exe.count jumpcount target position [Callgrind]

Conditional jump, executed exe.count times with jumpcount jumps to the
given target position.
```

It's not defined what ``target`` is but after looking at what
callgrind produces it seems that it is some sort of offset (e.g. +5).
What I see produced by callgrind doesn't really seem to match the
above.

After playing around with a simple profile file and KCachegrind it
seemed that to get what I want (KCachegrind showing jump arrows for my
source code without emitting parser warnings) I had to do this:

```
positions: instr line
events: Instructions

fl=s.bpl
fn=main

# Specify cost for jump
14 14 9
# Unconditional jump happens 5 times, +1 is a dummy offset, to line 19
jump=5 +1 19
14 14

# Specify cost for conditional jump
21 21 9
# Conditional jump to line 25 happens 1 out of 2 times (+1 is dummy offset)
jcnd=1/2 +1 25
21 21

# Conditional jump to line 27 happens 1 out of 2 times (+1 is dummy offset)
jcnd=1/2 +1 27
21 21
```

So it seems the jump syntax is **actually** something like...

```
jump=<count> <target> <target_line>
<position>
```

Where:
<target> is ???
<target_line> is the line that we will jump to
<position> is the location of the jump instruction. Note it's on the
next line after jump=


and for jcnd the syntax is **actually** something like...

```
jcnd=<followed_count>/<total_count> <target> <target_line>
<position>
```

Where:
<followed_count> - Number of times <target> was followed from this jump
<total_count> - Number of times this jump was executed
<target> is ???
<target_line> - line that this branch from the jump instruction jumps to
<position> is location of the jump instruction. Note it's on the next
line after jcnd=

Is what I've guessed about the syntax correct?


[1] http://valgrind.org/docs/manual/cl-format.html

Thanks,
-- 
Dan Liew
PhD Student - Imperial College London

Re: [Valgrind-users] The syntax of jump and jcnd in callgrind profile files.

From: Josef W. <Jos...@gm...> - 2014-08-22 19:01:53

On 22.08.2014 at 19:48, Dan Liew wrote:
> The documentation [1] doesn't define these very clearly. The grammar
> for ``JumpSpecification`` doesn't seem to be specified properly.

Oops, indeed.
Thanks for the notification.


> * jump=count target position [Callgrind]
> 
> Unconditional jump, executed count times, to the given target position.
> 
> * jcnd=exe.count jumpcount target position [Callgrind]
> 
> Conditional jump, executed exe.count times with jumpcount jumps to the
> given target position.
> ```
> 
> It's not defined what ``target`` is but after looking at what
> callgrind produces it seems that is some sort of offset (e.g. +5).
> What I see produced by callgrind doesn't really seem to match the
> above.

The "target position" is a "SubPositionList" in the grammar, the
same as in the beginning of a CostLine. The "positions:" header line
defines which subpositions there are, and in your example there
are two: "instr line". The first number is an address for the machine
code, and the second number is the line in a source file.
Subpositions can be specified relative to the subposition
given directly before, by prefixing them with "-" or "+".

> After playing around with a simple profile file and KCacheGrind it
> seemed that to get what I want (KCachegrind to show jump arrows for my
> source code without emitting parser warnings) that I had to do this
> 
> ```
> positions: instr line

Why not just "positions: line" ?
The instruction address you specify below is always the same as the line
number, which does not make much sense. The instruction address is needed
for KCachegrind to show annotated machine code, using "objdump <binary>",
but your example does not specify a binary file using "ob=" anyway.

> events: Instructions
> 
> fl=s.bpl
> fn=main
> 
> # Specify cost for jump
> 14 14 9
> # Unconditional jump happens 5 times, +1 is a dummy offset, to line 19
> jump=5 +1 19

"+1" is not a dummy offset, but is the instruction address of the jump
target, which is relative to the previous position specified in your
example in the line above, i.e. the first two numbers of "14 14 9".
Thus, your line is the same as "jump=5 15 19" or "jump=5 +1 +5".
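
The relative-subposition rule described above can be checked with a small
sketch. This is only an illustration, not code from callgrind or
KCachegrind: the function name `resolve_position` is made up here, and it
handles nothing beyond plain decimal numbers and the "+"/"-" prefixes
mentioned in this thread.

```python
def resolve_position(tokens, previous):
    """Resolve subposition tokens against the previous position.

    tokens   -- list of strings, one per subposition, e.g. ["+1", "19"]
    previous -- list of ints, the position given directly before
    """
    resolved = []
    for token, prev in zip(tokens, previous):
        if token[0] in "+-":
            # Relative subposition: signed offset from the previous value.
            resolved.append(prev + int(token))
        else:
            # Absolute subposition (decimal only in this sketch).
            resolved.append(int(token))
    return resolved


# Previous position taken from the cost line "14 14 9" (instr, line).
previous = [14, 14]
for spec in ("+1 19", "15 19", "+1 +5"):
    print(spec, "->", resolve_position(spec.split(), previous))
```

All three spellings resolve to the same target, instruction address 15 and
source line 19.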

> 14 14

Yes. Similar to calls, "jump="/"jcnd=" lines need to be followed by a line
specifying the source position of the jump.

By the way, if your jump crosses a source file or a function,
you may specify the source file of the jump target with "jfi=" and
the target function name with "jfn=" before the "jump="/"jcnd=" line.
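
For a tool that writes its own profile files, the layout described so far
(optional "jfi="/"jfn=" lines, then the "jump="/"jcnd=" line, then a line
with the source position of the jump) could be generated roughly as
follows. This is a sketch under assumptions: `format_jump` and its
parameters are hypothetical names, the order of "jfi=" before "jfn=" is an
arbitrary choice made here, and "helper" is a made-up function name.

```python
def format_jump(count, target, source, followed=None,
                target_file=None, target_function=None):
    """Return the lines describing one jump.

    count    -- how often the jump instruction was executed
    followed -- for a conditional jump, how often it was taken;
                None means an unconditional jump
    target   -- jump target position, a sequence of subposition values
    source   -- position of the jump itself, same shape as target
    """
    lines = []
    # Only needed when the jump leaves the current file or function.
    if target_file is not None:
        lines.append(f"jfi={target_file}")
    if target_function is not None:
        lines.append(f"jfn={target_function}")
    target_text = " ".join(str(p) for p in target)
    if followed is None:
        lines.append(f"jump={count} {target_text}")
    else:
        lines.append(f"jcnd={followed}/{count} {target_text}")
    # The source position goes on its own line after jump=/jcnd=.
    lines.append(" ".join(str(p) for p in source))
    return lines


# Unconditional jump with "positions: instr line".
print("\n".join(format_jump(5, (15, 19), (14, 14))))
# Conditional jump into another function with "positions: line".
print("\n".join(format_jump(2, (25,), (21,), followed=1,
                            target_function="helper")))
```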

> jcnd=<followed_count>/<total_count> <target> <target_line>
> <position>

Better:

=============
jcnd=<followed_count>/<total_count> <jump target position>
<jump source position>
==============

where a <position> is one or two numbers, depending on the "positions:"
header line. If source line numbers are enough for you, just do

==============
positions: line

....

# jump 1 of 8 times, from line 10 to 20
jcnd=1/8 20
10

...
===============
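
Read the other way round, a two-line record like the one above can be
taken apart as in the sketch below. It assumes "positions: line" and
absolute line numbers only; `parse_jcnd` is a hypothetical helper, not
part of any existing tool.

```python
def parse_jcnd(spec_line, source_line):
    """Parse a jcnd= record written with "positions: line".

    spec_line   -- e.g. "jcnd=1/8 20"
    source_line -- the following line, e.g. "10"
    """
    key, _, rest = spec_line.partition("=")
    if key != "jcnd":
        raise ValueError(f"not a jcnd line: {spec_line!r}")
    counts, target = rest.split()
    followed, executed = (int(n) for n in counts.split("/"))
    return {
        "from_line": int(source_line),
        "to_line": int(target),
        "followed": followed,   # times the jump was taken
        "executed": executed,   # times the jump instruction ran
    }


record = parse_jcnd("jcnd=1/8 20", "10")
print(f"line {record['from_line']} -> line {record['to_line']}: "
      f"taken {record['followed']} of {record['executed']} times")
```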


Cheers,
Josef




Re: [Valgrind-users] The syntax of jump and jcnd in callgrind profile files.

From: Dan L. <da...@su...> - 2014-08-23 12:57:24

Thanks for the prompt response :)

<snip>

>> It's not defined what ``target`` is but after looking at what
>> callgrind produces it seems that is some sort of offset (e.g. +5).
>> What I see produced by callgrind doesn't really seem to match the
>> above.
>
> The "target position" is a "SubPositionList" in the grammar, the
> same as in the beginning of a CostLine. The "positions:" header line
> defines which subpositions there are, and in your example there
> are two: "instr line". The first number is an address for the machine
> code, and the second number is the line in a source file.
> Subpositions can be specified relative to the subposition
> given directly before, by prefixing them with "-" or "+".

Thanks. That makes things significantly clearer.

>> After playing around with a simple profile file and KCacheGrind it
>> seemed that to get what I want (KCachegrind to show jump arrows for my
>> source code without emitting parser warnings) that I had to do this
>>
>> ```
>> positions: instr line
>
> Why not just "positions: line" ?

The reason was that when I tried using just "positions: line", the
jump= and jcnd= lines stopped being displayed in KCachegrind. Now that
you've explained that "target position" is a "SubPositionList" in the
grammar I've managed to use jump= and jcnd= with only lines and
it works fine :)

> The instruction address you specify below is always the same as the line
> number, which does not make much sense. The instruction address is needed
> for KCachegrind to show annotated machine code, using "objdump <binary>",
> but your example does not specify a binary file using "ob=" anyway.

On a slightly related note, is it possible to specify an assembly file
rather than an object file? For example, if I had a C program (say
foo.c) and its corresponding LLVM IR (foo.ll), I would like to display
both the C source code and the foo.ll file. When I tried using
obj=foo.ll, however, KCachegrind tried using objdump on that file and
gave up because foo.ll is just text and not a binary.

>> events: Instructions
>>
>> fl=s.bpl
>> fn=main
>>
>> # Specify cost for jump
>> 14 14 9
>> # Unconditional jump happens 5 times, +1 is a dummy offset, to line 19
>> jump=5 +1 19
>
> "+1" is not a dummy offset, but is the instruction address of the jump
> target, which is relative to the previous position specified in your
> example in the line above, i.e. the first two numbers of "14 14 9".
> Thus, your line is the same as "jump=5 15 19" or "jump=5 +1 +5".

Thanks for explaining that. My comment was actually trying to say I
was considering +1 to be a dummy offset because I didn't care what it
was set to.

>> 14 14
>
> Yes. Similar to calls, "jump="/"jcnd=" lines need to be followed by a line
> specifying the source position of the jump.

That probably ought to be documented. It's not obvious that it is
needed. I'm also surprised that the position of the jump isn't
specified on the same line.

> By the way, if your jump crosses a source file or a function,
> you may specify the source file of the jump target with "jfi=" and
> the target function name with "jfn=" before the "jump="/"jcnd=" line.

Thanks. Like ``cfn=``, does that change the function that subsequent
cost lines are in?

I don't see these in the documentation either so it would be nice if
they were documented.

-- 
Dan Liew
PhD Student - Imperial College London

Re: [Valgrind-users] The syntax of jump and jcnd in callgrind profile files.

From: Josef W. <Jos...@gm...> - 2014-08-24 16:20:30

On 23.08.2014 at 14:56, Dan Liew wrote:
>> The instruction address you specify below is always the same as the line
>> number, which does not make much sense. The instruction address is needed
>> for KCachegrind to show annotated machine code, using "objdump <binary>",
>> but your example does not specify a binary file using "ob=" anyway.
> 
> On a slightly related note is it possible to specify an assembly file
> rather than an object file? For example If I had a C program (say
> foo.c) and its corresponding LLVM IR (foo.ll). I would like to display
> both the C source code and the foo.ll file. When I tried using
> obj=foo.ll however KCachegrind tried using objdump on that file and
> gave up because foo.ll is just text and not a binary.

Currently there is no such option, and the "instr" subposition was meant
to be used with objdump.

Perhaps it would be interesting to have a KCachegrind config option
(or specified in the header of a callgrind file)
which specifies what to do with a specific subposition type, instead of
running it through "objdump". Not sure yet what would cover most use
cases. For you, just the file and line number would be enough?

>> ...
> That probably ought to be documented.

Sure. The jumps were one of the last additions to the file format, and
it just never was documented correctly.

> It's not obvious that it is
> needed. I'm also surprised that the position of the jump isn't
> specified on the same line.

I wanted that to be similar to the calls= spec, and was not sure
whether some cost attributes might be useful for jumps as well
(that cost part is currently empty).
But details do not matter; documentation is important.

>> By the way, if your jump crosses a source file or a function,
>> you may specify the source file of the jump target with "jfi=" and
>> the target function name with "jfn=" before the "jump="/"jcnd=" line.
> 
> Thanks. Like ``cfn=`` does that change the function that subsequent
> cost lines are in?
> 
> I don't see these in the documentation either so it would be nice if
> they were documented.

See above. Currently the documentation is in the (KCachegrind) source :(

Josef



Re: [Valgrind-users] The syntax of jump and jcnd in callgrind profile files.

From: Dan L. <da...@su...> - 2014-08-25 08:39:54

On 24 August 2014 17:20, Josef Weidendorfer <Jos...@gm...> wrote:
> On 23.08.2014 at 14:56, Dan Liew wrote:
>>> The instruction address you specify below is always the same as the line
>>> number, which does not make much sense. The instruction address is needed
>>> for KCachegrind to show annotated machine code, using "objdump <binary>",
>>> but your example does not specify a binary file using "ob=" anyway.
>>
>> On a slightly related note is it possible to specify an assembly file
>> rather than an object file? For example If I had a C program (say
>> foo.c) and its corresponding LLVM IR (foo.ll). I would like to display
>> both the C source code and the foo.ll file. When I tried using
>> obj=foo.ll however KCachegrind tried using objdump on that file and
>> gave up because foo.ll is just text and not a binary.
>
> Currently there is no such option, and the "instr" subposition was meant
> to be used with objdump.
>
> Perhaps it would be interesting to have a KCachegrind config option
> (or specified in the header of a callgrind file)
> which specifies what to do with a specific subposition type, instead of
> running it through "objdump". Not sure yet what would cover most use
> cases. For you, just the file and line number would be enough?

Sorry. I've probably confused things slightly. For the main project
I'm currently working on, yes, using only the line number is fine because
my interpreter only works at the source level. With your help I now
have jump information being shown correctly in KCachegrind so I will
definitely be outputting profiling information in the callgrind format
:)

The reason I was asking was because I was looking at another project
(which was what inspired me to play with KCachegrind) called KLEE[1].
This tool executes LLVM IR that comes from C programs, so there is an
original source (e.g. foo.c) and the corresponding LLVM IR (e.g.
foo.ll). Both of these are text files and I observed that the
callgrind files that KLEE was emitting set ``ob=`` to the LLVM IR file
and ``fl=`` to the original C source file. As I mentioned before this
doesn't work as the author of KLEE intended because KCachegrind tries
to treat foo.ll as an object file rather than as a text file.

In this case I like to think of LLVM IR as being similar to assembly
so using the instr subposition with absolute line numbers feels fairly
natural. KCachegrind could be taught to try and infer if ``ob=`` is
really an object file or if it is text by looking for magic numbers
(like the file command does) and then doing the appropriate thing. I
think it would be nicer if the Callgrind file could optionally specify
what to do with the instruction subposition type as you suggested.
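
As a rough illustration of the magic-number idea, the sketch below checks
only for the ELF signature (the four bytes 0x7F 'E' 'L' 'F' at the start
of the file). It is not what KCachegrind does today, other object formats
have different signatures, and `looks_like_elf` is a hypothetical name.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def looks_like_elf(path):
    """Return True if the file at path starts with the ELF signature."""
    with open(path, "rb") as handle:
        return handle.read(len(ELF_MAGIC)) == ELF_MAGIC


# Demonstration with a throwaway text file standing in for foo.ll.
with tempfile.TemporaryDirectory() as directory:
    text_path = os.path.join(directory, "foo.ll")
    with open(text_path, "w") as handle:
        handle.write("define i32 @main() {\n  ret i32 0\n}\n")
    # LLVM IR text has no ELF header, so a viewer could show it as text.
    print(looks_like_elf(text_path))
```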

Just allowing obj to be plain text might be enough flexibility because
this allows someone who wants to use their own disassembler (e.g.
llvm-objdump) or no disassembler at all to run whatever tools they
want on their object file to produce a text file before profiling.
Then the instruction subposition in the callgrind file
can just refer to lines in this text file. You could have an option
like

# Objects are plain text
obj_is_plain_text=1

in the callgrind file (which is by default 0) to specify that all
object files are plain text files.

You could go further and specify the tool to use for object files in
the callgrind file but then you run into the problem that KCachegrind
might not know how to parse the output so I think just allowing the
object file to be plain text might be the simplest thing to do.

There certainly isn't a rush for a feature like this but I think it
would be a nice addition.

Thanks,
Dan

[1] http://klee.llvm.org