|
From: Chris E. <chr...@gm...> - 2007-01-26 07:23:40
|
Hello,

I'm attempting to write a tool that performs simple branch prediction. The problem I'm running into is figuring out which of the Exit IR statements represent branches in the original user code; I hope this is not digging too deep into the guest program, because I know that's not really the point of VEX. There seem to be many different kinds of Exit statements ("jumpkinds"), but I don't understand from the documentation in libvex_ir.h which of them, if any, refer exclusively to branches in the guest code.

In lackey each Exit statement is surrounded by helper calls that count total branches and total taken branches, but I'm afraid of counting branches that don't pertain to the guest program. Or should I not be concerned?

If I were to consider every last IRStmt of a superblock which is also an Exit, assuming of course I set Valgrind to not chase across blocks and to translate the smallest possible basic blocks (taking the performance hit along with it), would that be a sensible way of looking at branches that represent (or are at least indicative of) original guest branches?

Thanks for the help,

- Chris Eberz
|
|
From: Tom H. <to...@co...> - 2007-01-26 08:50:27
|
In message <337...@ma...>
Chris Eberz <chr...@gm...> wrote:
> I'm attempting to write a tool that performs simple branch prediction. The
> problem I'm running in to is figuring which of the Exit IR statements
> represent branches in the original user code; I hope this is not digging too
> deep into the guest program because I know that that's not really the point
> of VEX. There seem to be many different kinds of Exit statements,
> "jumpkinds," but I don't understand from the documentation in libvex_ir.h
> which refer exclusively (if that is the case for any) to branches in the
> guest code.
I think you will find they are Ijk_Boring actually - the important
question is what the next address is. In other words where VEX will
be told to resume execution, and whether it follows the end of the
previous block.
Obviously if you're interested in function calls as well then you
would need to look at Ijk_Call and Ijk_Return as well, and if you
want to know about system calls then the Ijk_Sys_* ones.
I don't know whether you could, in principle, get something like
an Ijk_ClientReq or Ijk_Yield that was effectively also a branch by
virtue of having a next execution address that didn't follow on from
the previous block or not. Julian will probably know.
> In lackey each Exit statement is surrounded by helper calls that count total
> branches and total taken branches, but I'm afraid of counting branches that
> don't pertain to the guest program, or should I not be concerned?
All branches taken by VEX will pertain to the guest program - it won't
suddenly decide to jump to non-guest code as it knows nothing about it.
The only way to get host code executed is either to generate IR that
does a host call (which will be within a block) or for VEX to encounter
a client request in which case it will stop with Ijk_ClientReq and the
surrounding program will have to execute the client request and then
restart VEX at the indicated continuation address.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
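Tom's rule of thumb can be written down as a small stand-alone C model. This is only a sketch: it does not include any VEX headers, and the `JumpKind` enum, `counts_as_branch` and `leaves_fallthrough_path` are hypothetical names invented for illustration that merely mirror the jump kinds mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VEX jump kinds discussed in this thread.
   It is NOT the real IRJumpKind type from libvex_ir.h. */
typedef enum {
    JK_BORING, JK_CALL, JK_RETURN, JK_SYSCALL, JK_CLIENT_REQ, JK_YIELD
} JumpKind;

typedef unsigned long GuestAddr;

/* Plain branches are the "boring" exits. Calls and returns are counted
   only when the caller asks for them; everything else is ignored. */
static bool counts_as_branch(JumpKind jk, bool include_calls)
{
    switch (jk) {
    case JK_BORING:
        return true;
    case JK_CALL:
    case JK_RETURN:
        return include_calls;
    case JK_SYSCALL:
    case JK_CLIENT_REQ:
    case JK_YIELD:
        return false;
    }
    assert(!"unknown jump kind");
    return false;
}

/* The "important question": does execution resume somewhere other than
   the address that directly follows the end of the previous block? */
static bool leaves_fallthrough_path(GuestAddr fallthrough, GuestAddr next)
{
    return next != fallthrough;
}
```

Whether Ijk_ClientReq or Ijk_Yield can also act as a branch is left open above, so the model simply treats them as non-branches.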
|
|
From: Nicholas N. <nj...@cs...> - 2007-01-27 02:03:02
|
On Fri, 26 Jan 2007, Tom Hughes wrote:

>> I'm attempting to write a tool that performs simple branch prediction. The
>> problem I'm running in to is figuring which of the Exit IR statements
>> represent branches in the original user code; I hope this is not digging too
>> deep into the guest program because I know that that's not really the point
>> of VEX. There seem to be many different kinds of Exit statements,
>> "jumpkinds," but I don't understand from the documentation in libvex_ir.h
>> which refer exclusively (if that is the case for any) to branches in the
>> guest code.
>
> I think you will find they are Ijk_Boring actually - the important
> question is what the next address is. In other words where VEX will
> be told to resume execution, and whether it follows the end of the
> previous block.
>
> Obviously if you're interested in function calls as well then you
> would need to look at Ijk_Call and Ijk_Return as well, and if you
> want to know about system calls then the Ijk_Sys_* ones.

In case it's not clear: there are two ways to exit from a block. Every block has an exit at the end. Some blocks also have one or more conditional exits in the middle -- these are the Ist_Exit statements.

Nick
|
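The two kinds of exit can be illustrated with a toy model of a block. The sketch below is plain C with made-up types (`StmtTag`, `Stmt`, `count_block_exits` are all hypothetical and are not the real IRStmt/IRSB structures); it only shows that a block has one exit per conditional exit statement plus the one at its end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statement tags: just enough to tell exits from the rest. */
typedef enum { ST_IMARK, ST_EXIT, ST_OTHER } StmtTag;

typedef struct {
    StmtTag tag;
} Stmt;

/* Count the ways out of a block: one for every conditional exit statement
   in the middle, plus the exit that every block has at its end. */
static size_t count_block_exits(const Stmt *stmts, size_t n_stmts)
{
    assert(stmts != NULL || n_stmts == 0);

    size_t exits = 1;                 /* the exit at the end of the block */
    for (size_t i = 0; i < n_stmts; i++) {
        if (stmts[i].tag == ST_EXIT)
            exits++;                  /* a conditional exit in the middle */
    }
    return exits;
}
```

A block with no exit statements at all still reports one exit, which matches the description above.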
|
From: Josef W. <Jos...@gm...> - 2007-01-26 14:14:27
|
On Friday 26 January 2007 08:23, Chris Eberz wrote:
> Hello,
>
> I'm attempting to write a tool that performs simple branch prediction. The
> problem I'm running in to is figuring which of the Exit IR statements
> represent branches in the original user code; I hope this is not digging too
> deep into the guest program because I know that that's not really the point
> of VEX. There seem to be many different kinds of Exit statements,
> "jumpkinds," but I don't understand from the documentation in libvex_ir.h
> which refer exclusively (if that is the case for any) to branches in the
> guest code.

With "branches" you mean control flow changes? These would be boring/call/return exits. The distinction between these jump types is unfortunately not very reliable with a PPC guest, as VEX can only guess the type (there are no explicit CALL/RETURN instructions in the PPC ISA).

You can check for such changes by checking the sequence of instruction addresses, even at basic-/superblock boundaries, to get rid of the "false" branches you worry about. This way, you can even allow VEX to chain basic blocks, and you still see the control flow change inside of a superblock. Unfortunately, chasing gets rid of the jump kind information, but I am not sure you need it.

The most naive way would be to call a helper at every IMark, which of course is very, very slow. Better is to call a helper at the start of every superblock, but then you have to know the previously executed block and which exit was taken from it in order to reconstruct the sequence of instruction addresses. This is already a little tricky.

Currently, VEX does not allow inserting instructions which are executed only after a conditional exit in the middle of a block has been taken. But this would be the best place to note the exit number, and the guest address of the next block would already be known. So you have to do something before every exit, which is pure overhead if a conditional branch falls through (similar to lackey).

Callgrind does it the following way: at instrumentation time, it collects information for all (conditional) exits in a block, and inserts write instructions which put the "exit number" in a global variable while passing the exits in the block. This way, after a block has run, this variable holds the exit taken and you can reconstruct the instruction stream.

An interesting case is x86 REP/string instructions, as there VEX can give you two exits for one guest instruction. You will then see a control flow change, i.e. a branch, where jump target and jump source have the same guest address.

Josef
|
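The exit-number idea can be simulated in a few lines of stand-alone C. This is a sketch of the scheme as described in the message above, not Callgrind's actual code: `BlockInfo`, `last_exit`, `run_block` and `resolve_target` are hypothetical names, and the "block" is just a table of exit targets with a flag per exit saying whether it fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EXITS 8

typedef unsigned long GuestAddr;

/* Information collected at instrumentation time (hypothetical layout). */
typedef struct {
    GuestAddr exit_dst[MAX_EXITS]; /* targets of the conditional exits, in order */
    size_t    n_cond_exits;        /* number of conditional exits in the block   */
    GuestAddr next;                /* target of the exit at the end of the block */
} BlockInfo;

/* The global variable that the inserted writes update. */
static size_t last_exit;

/* Simulate one run of an instrumented block. taken[i] says whether
   conditional exit i fires on this run. */
static void run_block(const BlockInfo *b, const bool *taken)
{
    assert(b->n_cond_exits <= MAX_EXITS);

    for (size_t i = 0; i < b->n_cond_exits; i++) {
        last_exit = i;             /* inserted write: runs even if exit i falls through */
        if (taken[i])
            return;                /* control leaves the block through exit i */
    }
    last_exit = b->n_cond_exits;   /* reached the exit at the end of the block */
}

/* After the block has run, map the recorded exit number to a guest address. */
static GuestAddr resolve_target(const BlockInfo *b)
{
    return last_exit < b->n_cond_exits ? b->exit_dst[last_exit] : b->next;
}
```

The write before each exit is the overhead mentioned above: it is executed whether or not the exit is taken, and only the last value written before leaving the block matters.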
|
From: Chris E. <chr...@gm...> - 2007-02-03 02:37:27
|
Thanks, you're right, it looks like almost everything is an Ijk_Boring. I ran a small C program and it showed 1,549 Ijk_Boring, 2 Ijk_EmWarn, and 28 Ijk_MapFail. Well, actually I spent a couple of days thinking that either Valgrind was out of its mind or I was just completely lost, when what was really happening was that my little C program called "test" wasn't being run, but instead something else that gets run when you pass the string "test" as the executable on the valgrind command line.

I wonder though why none of the jumpkinds were call or return. Lackey confirms that my foo() function is being called as many times as I call it in the source code (I'm using no optimizations: gcc -O0), but none of the jumpkinds of either an Ist_Exit or an IRSB are Ijk_Call. I thought that the calls to foo() may have been hidden by BB chasing, but I've set --vex-guest-chase-thresh=0 and I still see no Ijk_Calls. Since a function call in this case is unconditional, is it wrong for me to expect to see it as an Ist_Stmt in the middle of an IRSB, as I think Nicholas said in this thread that unconditional jumps become the implicit jumps within the IRSB data structure that point to the next block?

Also, just an aside, is it right to assume that I can determine forward and backward branching in the guest code by whether or not the jumps in IR are forward or backwards (comparing the IR statement address to the jump target I guess)?

Thanks,

- Chris

On 1/26/07, Tom Hughes <to...@co...> wrote:
> In message <337...@ma...>
> Chris Eberz <chr...@gm...> wrote:
>
> > I'm attempting to write a tool that performs simple branch prediction. The
> > problem I'm running in to is figuring which of the Exit IR statements
> > represent branches in the original user code; I hope this is not digging too
> > deep into the guest program because I know that that's not really the point
> > of VEX. There seem to be many different kinds of Exit statements,
> > "jumpkinds," but I don't understand from the documentation in libvex_ir.h
> > which refer exclusively (if that is the case for any) to branches in the
> > guest code.
>
> I think you will find they are Ijk_Boring actually - the important
> question is what the next address is. In other words where VEX will
> be told to resume execution, and whether it follows the end of the
> previous block.
>
> Obviously if you're interested in function calls as well then you
> would need to look at Ijk_Call and Ijk_Return as well, and if you
> want to know about system calls then the Ijk_Sys_* ones.
>
> I don't know whether you could, in principle, get something like
> an Ijk_ClientReq or Ijk_Yield that was effectively also a branch by
> virtue of having a next execution address that didn't follow on from
> the previous block or not. Julian will probably know.
>
> > In lackey each Exit statement is surrounded by helper calls that count total
> > branches and total taken branches, but I'm afraid of counting branches that
> > don't pertain to the guest program, or should I not be concerned?
>
> All branches taken by VEX will pertain to the guest program - it won't
> suddenly decide to jump to non-guest code as it knows nothing about it.
>
> The only way to get host code executed is either to generate IR that
> does a host call (which will be within a block) or for VEX to encounter
> a client request in which case it will stop with Ijk_ClientReq and the
> surrounding program will have to execute the client request and then
> restart VEX at the indicated continuation address.
>
> Tom
>
> --
> Tom Hughes (to...@co...)
> http://www.compton.nu/

I see. I think my concern with the "undesirable branches" was because I looked at many of the different
|
|
From: Josef W. <Jos...@gm...> - 2007-02-04 01:55:54
|
On Saturday 03 February 2007, Chris Eberz wrote:
> I wonder though why none of the jumpkinds were call or return.

Which architecture is this on?

> Lackey
> confirms that my foo() function is being called as many times as call it in
> the source code (I'm using no optimizations: gcc -O0), but none of jumpkinds
> of either an Ist_Exit or an IRSB are Ijk_Call. I thought that the calls to
> foo() may have been hidden by BB chasing,

Yes.

> but I've
> --vex-guest-chase-thresh=0 and I still see no Ijk_Calls. Since a function
> call in this case is unconditional,

Yes, again.

> is it wrong for me to expect to see it
> as an Ist_Stmt in the middle of an IRSB,

Why? If there is no chasing, a call should always conclude a BB.

> as I think Nicholas said in this
> thread that unconditional jumps become the implicit jumps within the IRSB
> data structure that point to the next block?
>
> Also, just an aside, is it right to assume that I can determine forward and
> backward branching in the guest code by whether or not the jumps in IR are
> forward or backwards (comparing the IR statement address to the jump target
> I guess)?

I am not sure what you mean by "IR statement address", as IR is only intermediate. Why can you not just look at the address of the guest instructions for the jump (using data from IMark)?

Josef
|
|
From: Chris E. <chr...@gm...> - 2007-02-04 22:03:35
|
On 2/3/07, Josef Weidendorfer <Jos...@gm...> wrote:
> On Saturday 03 February 2007, Chris Eberz wrote:
> > I wonder though why none of the jumpkinds were call or return.
>
> Which architecture is this on?

x86

> > Lackey
> > confirms that my foo() function is being called as many times as call it in
> > the source code (I'm using no optimizations: gcc -O0), but none of jumpkinds
> > of either an Ist_Exit or an IRSB are Ijk_Call. I thought that the calls to
> > foo() may have been hidden by BB chasing,
>
> Yes.
>
> > but I've
> > --vex-guest-chase-thresh=0 and I still see no Ijk_Calls. Since a function
> > call in this case is unconditional,
>
> Yes, again.
>
> > is it wrong for me to expect to see it
> > as an Ist_Stmt in the middle of an IRSB,
>
> Why? If there is no chasing, a call should always conclude a BB.

Which leads me to think that maybe I'm using the command line arguments incorrectly. If --vex-guest-chase-thresh=0 is set, then shouldn't a jumpkind of Ijk_Call appear at least once in the program, either in an IRSB's jumpkind field or in an Ist_Stmt.Ist.Exit's jumpkind field?

> > as I think Nicholas said in this
> > thread that unconditional jumps become the implicit jumps within the IRSB
> > data structure that point to the next block?
> >
> > Also, just an aside, is it right to assume that I can determine forward and
> > backward branching in the guest code by whether or not the jumps in IR are
> > forward or backwards (comparing the IR statement address to the jump target
> > I guess)?
>
> I am not sure what you mean with "IR statement address", as IR is only
> intermediate. Why can you not just look at the address of the guest
> instructions for the jump (using data from IMark)?
>
> Josef

Sorry, I don't think I phrased the question correctly. Is the address of a guest instruction, an Ist_IMark, in the same address space as the address in Stmt.Ist.Exit.dst? During instrumentation, I want the helper functions being called to know whether a jump is being made forwards or backwards in code, and I don't want to just naively compare the address from the Ist_IMark to the destination address of the Ist_Exit (or the destination in the IR expression, IRSB.next) if they have nothing to do with each other.

Thanks,

- Chris
|
|
From: Julian S. <js...@ac...> - 2007-02-04 22:25:06
|
> Sorry, I don't think I phrased the question correctly. Is the address of a
> guest instruction, an Ist_IMark, in the same address space as the address
> in Stmt.Ist.Exit.dst?

Yes, so ..

> During instrumentation, I want the helper functions
> being called to know whether a jump is being made forwards or backwards in
> code, and I don't want to just naively compare the address from the
> Ist_IMark to the destination address of the Ist_Exit (or the destination in
> the IR expression, IRSB.next) if they have nothing to do with each other.

.. the naive comparison should correctly tell you the branch direction.

Note you may get the branch sense inverted by IR optimisation. Consider this:

   0x1234: jz 0x5678

That might come out as

   IRStmt_IMark(0x1234, 5)
   IRStmt_Exit( ... compute Z flag ..., 0x5678)
   IRStmt_IMark(0x1239, ...)
   // translation of fallthru insn

But an equally valid translation is as follows, and this _may_ occur. I don't know offhand if any shipping V will do this, but the IR can certainly express it, and some experimental variants do do it:

   IRStmt_IMark(0x1234, 5)
   IRStmt_Exit( ... compute NZ flag ..., 0x1239)
   IRStmt_IMark(0x5678)
   // translation of branch target insn

J
|
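Both points, the naive direction comparison and the possibly inverted sense, can be expressed as two small helpers. The code below is a stand-alone sketch with hypothetical names (`branch_direction`, `exit_is_inverted`); it assumes the tool already has the address and length from the preceding IMark and the destination of the Exit, and it treats "the exit goes to the fall-through address" as the sign of an inverted branch, which is one plausible check rather than a documented rule.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long GuestAddr;

typedef enum { DIR_BACKWARD, DIR_FORWARD } BranchDir;

/* Naive comparison: the instruction address and the jump target are both
   guest addresses, so comparing them gives the direction. A target equal
   to the instruction's own address (as with REP-style self loops) is
   reported as backward. */
static BranchDir branch_direction(GuestAddr insn_addr, GuestAddr target)
{
    return target > insn_addr ? DIR_FORWARD : DIR_BACKWARD;
}

/* Assumed heuristic: if the Exit's destination is the address right after
   the branch instruction, the Exit represents the fall-through path, so
   the sense of the guest branch has been inverted in the IR. */
static bool exit_is_inverted(GuestAddr insn_addr, unsigned insn_len,
                             GuestAddr exit_dst)
{
    assert(insn_len > 0);
    return exit_dst == insn_addr + insn_len;
}
```

With the numbers from the example, the first translation (Exit to 0x5678) is not inverted, while the second (Exit to 0x1239, which is 0x1234 + 5) is.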
|
From: Josef W. <Jos...@gm...> - 2007-02-04 23:33:15
|
On Sunday 04 February 2007, Julian Seward wrote:
> Note you may get the branch sense inverted by IR optimisation. Consider
> this:
>
>    0x1234: jz 0x5678
>
> [...]
>
>    IRStmt_IMark(0x1234, 5)
>    IRStmt_Exit( ... compute NZ flag ..., 0x1239)
>    IRStmt_IMark(0x5678)
>    // translation of branch target insn

This actually happens in reality sometimes. In callgrind, I need to check this for branch tracing, and quite early I saw instances of it appearing.

Josef
|
|
From: Julian S. <js...@ac...> - 2007-02-04 23:41:40
|
On Sunday 04 February 2007 23:32, Josef Weidendorfer wrote:
> On Sunday 04 February 2007, Julian Seward wrote:
> > Note you may get the branch sense inverted by IR optimisation. Consider
> > this:
> >
> >    0x1234: jz 0x5678
> >
> > [...]
> >
> >    IRStmt_IMark(0x1234, 5)
> >    IRStmt_Exit( ... compute NZ flag ..., 0x1239)
> >    IRStmt_IMark(0x5678)
> >    // translation of branch target insn
>
> This actually happens in reality sometimes.
> In callgrind, I need to check this for branch tracing, and quite early
> I saw instances of it appearing.

Yes. Thinking about it more, iropt will unroll small single basic block loops, and in doing so will invert the sense of at least one of the branches in the unrolled loop.

J
|