From: Greg P. <gp...@ap...> - 2009-08-13 17:53:40
On Aug 13, 2009, at 9:40 AM, Philippe Waroquiers wrote:

>> Doing a gdb stub properly will require significant rethinking and
>> replumbing of the scheduler. You'd need mechanisms for controlling
>> the scheduler from inside the tool and/or error manager (stop now
>> because an error occurred) and from the attached debugger (stop
>> now because the user said so). Single-stepping and breakpoints are
>> even more fun - you'd probably want a way to tell the scheduler to
>> recompile only one instruction at a time, if that doesn't confuse
>> the tools too much.
>
> In the stub I wrote, the interface called by the core of valgrind
> (called from the error manager) is:
>
>    /* If connection not yet opened, listen on a host:port and accept
>       an incoming connection.
>       If/when connection opened, reads gdb remote protocol packets
>       and executes the requested commands */
>    extern void VG_(gdbserver) ( ThreadId tid );

Agreed, something like that is the best way to go.

> When an error was encountered, this gdbserver was called; and was
> reporting to gdb that a "break" was encountered. It was then possible
> to use gdb to examine memory (print variables and similar).
> A continue command in gdb was just causing gdbserver to return to
> valgrind error mgr, which was then continuing to run as usual.

This works okay, though in my system it would get ugly if I tried to
implement single-step after an error, instead of continue.

> The impact of this on valgrind looked relatively easy/low risk, and
> was more or less working (after a limited dev. effort).
>
> For the rest, I did not experiment with anything but this was what I
> was thinking:
>  * for setting breaks and similar:
>     * have a list of program counter in valgrind for which a break
>       is desired (modified by gdbserver when gdb asks to put or
>       delete a break)
>     * each time a break is either inserted or deleted, the
>       translation of the block containing this program counter is
>       discarded
>     * at translation time of a block, if the block range of program
>       counter contains a break in the list of break, give the IR
>       statements of the block to the tool; but add a new IR
>       instruction "break_needed_here" just before the IR needed
>       for the instruction at the given program counter:
>        * the tool has nothing to do with this IR, except give it back
>          to valgrind core in the modified IR
>        * then the valgrind core has to translate this IR in a call to
>          VG_(gdbserver) which would similarly report to gdb that the
>          debugged process has encountered a break.

An alternative to putting the break directly in the IR is to stop all
translations at the instruction before it. Then control passes back to
the scheduler, which can perform the "is there a breakpoint here" check.

> To the contrary, how the gdb could "interrupt" a running valgrind is
> unclear.
> I know that in callgrind, the problem was already looked at (to have
> callgrind_control and similar).
> To my knowledge, the solution was to poll at regular interval a
> control file.
> So, we could imagine to have the core polling at regular interval by
> calling VG_(gdbserver) (with gdbserver having an argument to say:
> "do not block, just see if gdb has sent a "interrupt" command)
> IIRC, the callgrind schema is relatively simple but has a (big)
> disadvantage that when the valgrind client is blocked in a system
> call (e.g. a blocking read), then no way to "control" it.
>
> I think that other solutions have been looked at (i.e. have the
> valgrind core that is using a signal or a "system" thread).
> I think this is giving various problems (interaction between valgrind
> and the client process) but I do not know much more. Any pointer to a
> description of problems with this approach ?

Interrupt is harder. In the mechanism I used, the debug stub basically
zeroed the "execute N more blocks before returning to the scheduler"
counter. Then when the scheduler regained control it handed off to the
debug stub to do its thing. The synchronization here gets messy (for
example, what if a real error occurs after the break request but before
it stops in the scheduler). And I'm pretty sure my bugs were in here
somewhere. Implementing without user-break would simplify things, but I
think user-break is important.

> (I guess some useful things can be used from the gb stub trial from
> MacOS port e.g. for all what concerns the access to registers and so
> on. E.g. I did not looked at floating points and similar, and if IIRC,
> the MacOS stub was having the code for x86 and x86_64 for all that).
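
As a rough illustration of the scheduler-side scheme discussed above
(breakpoint check done by the scheduler, plus a user break that zeroes
the block counter), here is a toy model in C. It is only a sketch and
not Valgrind code: Sched, add_break, request_interrupt, run, QUANTUM
and the ctrl_c_at test aid are all made-up names, each "block" is
assumed to be a single address, and translations are assumed to end
just before any breakpoint address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BREAKS 16
#define QUANTUM    100            /* blocks per scheduler visit (hypothetical) */
#define NO_ADDR    ((Addr)-1)

typedef unsigned long Addr;
typedef enum { STOP_BREAK, STOP_INTERRUPT, STOP_EXIT } StopReason;

typedef struct {
    Addr   breaks[MAX_BREAKS];    /* breakpoint list maintained by the stub */
    size_t n_breaks;
    int    blocks_left;           /* "execute N more blocks" counter */
    bool   interrupt_pending;     /* set when the debugger asks to stop */
    Addr   ctrl_c_at;             /* test aid: fake a user break at this pc */
} Sched;

static bool has_break(const Sched *s, Addr pc)
{
    for (size_t i = 0; i < s->n_breaks; i++)
        if (s->breaks[i] == pc)
            return true;
    return false;
}

bool add_break(Sched *s, Addr pc)
{
    if (has_break(s, pc))
        return true;
    if (s->n_breaks == MAX_BREAKS)
        return false;
    s->breaks[s->n_breaks++] = pc;
    /* A real core would also discard the cached translation containing
       pc here, so the retranslated block ends just before it. */
    return true;
}

/* Called by the stub when the debugger sends an interrupt: zero the
   counter so the dispatcher falls back to the scheduler promptly. */
void request_interrupt(Sched *s)
{
    s->interrupt_pending = true;
    s->blocks_left = 0;
}

/* Run addresses *pc .. end-1 until something stops us. Pass
   resuming=true only when continuing from a STOP_BREAK at *pc. */
StopReason run(Sched *s, Addr *pc, Addr end, bool resuming)
{
    assert(s != NULL && pc != NULL);
    for (;;) {
        /* Scheduler has control: decide whether to hand off to the stub. */
        if (s->interrupt_pending) {
            s->interrupt_pending = false;
            return STOP_INTERRUPT;
        }
        if (*pc >= end)
            return STOP_EXIT;
        if (!resuming && has_break(s, *pc))
            return STOP_BREAK;
        resuming = false;

        /* Dispatcher: run blocks until the counter expires or a block
           boundary forced by a breakpoint is reached. */
        s->blocks_left = QUANTUM;
        while (s->blocks_left > 0 && *pc < end) {
            (*pc)++;                       /* "execute" one block */
            s->blocks_left--;
            if (*pc == s->ctrl_c_at) {     /* simulated async request */
                s->ctrl_c_at = NO_ADDR;
                request_interrupt(s);
            }
            if (has_break(s, *pc))
                break;
        }
    }
}
```

In this model a pending interrupt is checked before the breakpoint
list, so a user break that lands exactly on a breakpoint is reported
as an interrupt first. The "real error arrives between the break
request and the scheduler stop" case is not modelled at all; it would
need its own rule about which stop reason wins.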