From: David B. <dav...@gm...> - 2012-06-15 13:35:24
Does anyone know of a tool that allows two copies of a binary to be run at the same time (in lockstep, instruction by instruction) and monitors the memory image of the binaries for discrepancies, or divergences in control flow?

Yet again, I'm in the situation where code is producing subtly different science results with different optimisation settings, and I need to track down where the divergence starts.

Valgrind contains most of the raw capabilities to implement this (I think), but it's way beyond my hacking capabilities.

Any thoughts, comments or tool suggestions gladly received.

Cheers,
Brock
From: John R. <jr...@bi...> - 2012-06-15 14:02:18
On 06/15/2012 06:35 AM, David Brockley wrote:
> ... (in lockstep, instruction by instruction) ...
> ... subtly different results with different optimisation settings ...

This requires at least as much intelligence as 'diff'. It is necessary to detect and ignore different instruction sequences which nevertheless produce "isomorphic" results, including different use of temporary storage locations, different register spilling, etc.

And if the code is not numerically stable with regard to calculations done in floating point arithmetic, then _all_ bets are off! Even when stable, one setting might have errors contained to the least-significant 10 bits, while another setting might contain them only in the lowest 13 bits.

Suggestion: get the subroutine call graph, "cut off" some of the "utility" subroutines that are leaf nodes (or close to leaf nodes), then "do the diff" at entry+exit of the remaining routines. This has some hope of being able to apply "divide and conquer" instead of "plod along", especially if applied in a somewhat top-down manner.

It might be possible to "script" gdbserver, or write a helper program which talks to two instances of gdbserver, so as to automate the work.
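
A rough sketch of the "diff at entry+exit" idea, assuming each build has already been made to emit a trace of (routine, event, value) records, for instance from breakpoints driven by a helper program. The record layout, the names ulp_distance and first_divergence, and the max_ulps tolerance are all hypothetical, chosen only for illustration; nothing here is an existing tool or a gdbserver interface. The tolerance is one possible way to ignore differences confined to the low-order bits while still flagging larger ones.

```python
import struct


def ulp_distance(a: float, b: float) -> int:
    """Count the representable doubles between a and b (0 means bit-identical).

    Assumes neither value is NaN; a real tool would need to handle that case.
    """
    def ordered(x: float) -> int:
        # Reinterpret the IEEE-754 double as a signed 64-bit integer, then
        # remap negatives so integer order matches floating-point order.
        (bits,) = struct.unpack("<q", struct.pack("<d", x))
        return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)

    return abs(ordered(a) - ordered(b))


def first_divergence(trace_a, trace_b, max_ulps=0):
    """Return (index, kind) for the first record where two traces disagree.

    Each trace is a list of (routine, event, value) tuples, where event is
    "entry" or "exit" (hypothetical layout). Returns None if the traces agree
    to within max_ulps everywhere.
    """
    for index, (rec_a, rec_b) in enumerate(zip(trace_a, trace_b)):
        name_a, event_a, value_a = rec_a
        name_b, event_b, value_b = rec_b
        if (name_a, event_a) != (name_b, event_b):
            # Different routine or event at the same step: control flow split.
            return index, "control-flow"
        if ulp_distance(value_a, value_b) > max_ulps:
            return index, "value"
    if len(trace_a) != len(trace_b):
        # One run made more calls than the other.
        return min(len(trace_a), len(trace_b)), "length"
    return None


# Two made-up traces: the builds differ only in how a sum was associated.
trace_o0 = [
    ("main", "entry", 0.0),
    ("integrate", "entry", 1.0),
    ("integrate", "exit", (0.1 + 0.2) + 0.3),
    ("main", "exit", 2.0),
]
trace_o3 = [
    ("main", "entry", 0.0),
    ("integrate", "entry", 1.0),
    ("integrate", "exit", 0.1 + (0.2 + 0.3)),
    ("main", "exit", 2.0),
]

strict = first_divergence(trace_o0, trace_o3)               # bit-exact comparison
tolerant = first_divergence(trace_o0, trace_o3, max_ulps=4)  # ignore low-bit noise
print("strict:", strict)
print("tolerant:", tolerant)
```

Applied top-down, the first routine whose exit record diverges while its entry record agreed is the next subtree to instrument more finely.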