From: Nicholas N. <nj...@ca...> - 2004-07-02 16:22:21
On Fri, 2 Jul 2004, Jeremy Fitzhardinge wrote:

> On Fri, 2004-07-02 at 13:55 +0100, Nicholas Nethercote wrote:
>> A better way of identifying instructions is with a (obj_file, obj_offset)
>> pair. That will AFAICT uniquely identify any static instruction.
>
> Yeah, that's not a bad idea. There's a few complications though.
>
> Do you actually mean byte offset into the object file? That's a bit
> coarse, since you'd then have to work hard to map that back into a
> segment and get symtab information. It also makes it hard to work out
> the mapped address if you just have a file offset+baseaddr. If you
> recorded (file, segment, offset), you could address these.

Er, I'm not sure what I mean, I've forgotten the difference between object files and segments.

> And obviously you also need the virtual address of the instruction for
> the common case of the .so has not been unloaded. If you store file as
> (path, base_address), you know where the object was mapped, and you can
> work everything out from that.

Ok, then let's pretend that's what I meant.

--

Actually, even this still isn't really what we want. The whole point of this is to get back to file/function/line info. If we recorded that straight away in error messages instead of instruction addresses, the debug/symbol info would be nice and fresh and there wouldn't be any problems with unloading, and no need to fiddle around with object/segment offset and maybe reload symbols later.

Of course, the reason we currently don't do this is because it would vastly increase the size of each ExeContext. Unless we worked out a clever scheme whereby it didn't. I'm thinking something where every unique code location recorded in an ExeContext is stored in a way that it's never duplicated, and the same for all file/function names.

I'm currently looking at rejigging Cachegrind's data structures. I think I can solve the missing-info-for-unloaded-code and also significantly simplify its code.
The key is a tri-level data structure that looks like:

    table(filename, table(fn_name, table(line_num, CC)))

where CC is the cost-centre stored for each instruction. Instruction addresses are translated immediately to file/fn/line info, so there's no staleness. Also, all the relevant filenames and fn_names are stored only once, which is nice. (This structure is also necessary due to the way Cachegrind dumps its info at the end.)

N
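
For illustration only, the tri-level table described above could be sketched roughly as follows. This is not Cachegrind's actual code: the `CC` fields, the `debug_info` mapping (standing in for a debug-info lookup), and the `new_table`, `record`, and `dump` helpers are all hypothetical names invented for this sketch, and the dump line format is illustrative.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CC:
    """Hypothetical cost-centre; the two counters are illustrative only."""
    instrs: int = 0
    misses: int = 0


def new_table():
    # table(filename, table(fn_name, table(line_num, CC)))
    return defaultdict(lambda: defaultdict(lambda: defaultdict(CC)))


def record(table, debug_info, addr, misses=0):
    """Translate addr to file/fn/line right away and update its CC.

    debug_info is an assumed stand-in for a symbol lookup: a dict
    mapping an instruction address to (filename, fn_name, line_num).
    """
    filename, fn_name, line_num = debug_info[addr]
    cc = table[filename][fn_name][line_num]
    cc.instrs += 1
    cc.misses += misses
    return cc


def dump(table):
    """Walk the table file by file, then function by function."""
    out = []
    for filename in sorted(table):
        out.append(f"fl={filename}")
        for fn_name in sorted(table[filename]):
            out.append(f"fn={fn_name}")
            for line_num in sorted(table[filename][fn_name]):
                cc = table[filename][fn_name][line_num]
                out.append(f"{line_num} {cc.instrs} {cc.misses}")
    return out


if __name__ == "__main__":
    info = {
        0x1000: ("a.c", "main", 10),
        0x1004: ("a.c", "main", 10),
        0x2000: ("b.c", "helper", 3),
    }
    table = new_table()
    record(table, info, 0x1000, misses=1)
    record(table, info, 0x1004)
    record(table, info, 0x2000)
    info.clear()  # simulate the object being unloaded
    print("\n".join(dump(table)))
```

Because the lookup happens inside `record`, the table still holds complete file/fn/line info after the debug info has gone away, and each filename and function name appears only once, as a key in its level of the table.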