From: Jeremy F. <je...@go...> - 2004-07-02 17:06:56
On Fri, 2004-07-02 at 17:21 +0100, Nicholas Nethercote wrote:
> On Fri, 2 Jul 2004, Jeremy Fitzhardinge wrote:
>
> > On Fri, 2004-07-02 at 13:55 +0100, Nicholas Nethercote wrote:
> >> A better way of identifying instructions is with a (obj_file, obj_offset)
> >> pair. That will AFAICT uniquely identify any static instruction.
> >
> > Yeah, that's not a bad idea. There's a few complications though.
> >
> > Do you actually mean byte offset into the object file? That's a bit
> > coarse, since you'd then have to work hard to map that back into a
> > segment and get symtab information. It also makes it hard to work out
> > the mapped address if you just have a file offset+baseaddr. If you
> > recorded (file, segment, offset), you could address these.
>
> Er, I'm not sure what I mean, I've forgotten the difference between object
> files and segments.

Well, an ELF file looks (very approximately) like this:

0: ELF Header
   PHeader -> A
   PHeader -> C
   PHeader -> B
A: Segment
B: Segment
C: Segment

Each Segment is a chunk mmapped out of the file, from a particular file
offset. The PHeaders contain the mapping between virtual address and file
offset. There's also a lot of other things in the ELF file, so the net
result is that the file offset doesn't have much to do with the mapped
virtual address.

> I'm currently looking at rejigging Cachegrind's data structures. I think
> I can solve the missing-info-for-unloaded-code and also significantly
> simplify its code. The key is a tri-level data structure that looks like:
>
> table(filename, table(fn_name, table(line_num, CC)))
>
> where CC is the cost-centre stored for each instruction. Instruction
> addresses are translated immediately to file/fn/line info, so there's no
> staleness. Also, all the relevant filenames and fn_names are stored only
> once, which is nice. (This structure is also necessary due to the way
> Cachegrind dumps its info at the end.)
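
For concreteness, here is a rough Python sketch of the tri-level table quoted above. It is not Cachegrind's code (which is C); the `CC` fields, the `record` helper, the `debug_info` dict standing in for a symbol lookup, and the `"???"` bucket for addresses without debug info are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CC:
    """Hypothetical cost centre: two counters, purely for illustration."""
    accesses: int = 0
    misses: int = 0


def make_table():
    # filename -> fn_name -> line_num -> CC, each level created on demand
    return defaultdict(lambda: defaultdict(lambda: defaultdict(CC)))


def record(table, debug_info, addr, accesses, misses):
    """Translate addr to (file, fn, line) straight away and add to its CC.

    debug_info is a hypothetical dict standing in for a debug-info lookup.
    Addresses it does not know about land in an assumed ("???", "???", 0)
    bucket.
    """
    filename, fn_name, line = debug_info.get(addr, ("???", "???", 0))
    cc = table[filename][fn_name][line]
    cc.accesses += accesses
    cc.misses += misses
    return cc


# Made-up addresses and locations.
debug_info = {
    0x8048400: ("foo.c", "main", 10),
    0x8048403: ("foo.c", "main", 10),   # second instruction on the same line
    0x8048410: ("foo.c", "helper", 22),
}

table = make_table()
record(table, debug_info, 0x8048400, 1, 0)
record(table, debug_info, 0x8048403, 1, 1)
record(table, debug_info, 0x8048410, 2, 0)
record(table, debug_info, 0xDEADBEEF, 5, 2)  # no debug info for this one
```

In this sketch the two instructions at `foo.c:10` end up sharing one `CC`, and the address with no debug info is lumped into the `"???"` bucket, so the per-instruction detail is gone once the address has been translated.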

The trouble with using file/line info is that, obviously, a lot of code
doesn't have that info, and also that there can be more than one CC per
code line.

	J
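
To illustrate the ELF layout described earlier, here is a small Python sketch of how program headers relate file offsets to mapped addresses. The field names `p_offset`, `p_vaddr` and `p_filesz` are the real ELF program header fields; everything else (the `PHeader` tuple, its `name` label, the helper functions, the `base` parameter and all the numbers) is made up for the example. It ignores page alignment and `p_memsz`, so treat it as a simplification.

```python
from typing import NamedTuple, Optional, Tuple


class PHeader(NamedTuple):
    name: str      # label from the diagram (A, B, C); not an ELF field
    p_offset: int  # where the segment starts in the file
    p_vaddr: int   # where the segment is mapped in memory
    p_filesz: int  # how many bytes of the file the segment covers


# Made-up values. The headers are listed in the order A, C, B, as in the
# diagram, which is not the order of the segments in the file.
PHEADERS = [
    PHeader("A", 0x0000, 0x08048000, 0x1000),
    PHeader("C", 0x3000, 0x0804C000, 0x0800),
    PHeader("B", 0x1000, 0x0804A000, 0x2000),
]


def offset_to_vaddr(offset: int, base: int = 0) -> Optional[int]:
    """Map a file offset to a virtual address, or None if unmapped.

    base is an assumed load bias (0 for a fixed-address executable).
    """
    for ph in PHEADERS:
        if ph.p_offset <= offset < ph.p_offset + ph.p_filesz:
            return base + ph.p_vaddr + (offset - ph.p_offset)
    return None


def vaddr_to_segment_offset(vaddr: int, base: int = 0) -> Optional[Tuple[str, int]]:
    """Map a virtual address to (segment label, offset within segment)."""
    for ph in PHEADERS:
        start = base + ph.p_vaddr
        if start <= vaddr < start + ph.p_filesz:
            return ph.name, vaddr - start
    return None
```

With these numbers, file offset `0x10` maps to `0x08048010` but file offset `0x1010` maps to `0x0804A010`, so the difference between address and file offset is not the same for every segment. That is why a bare file offset plus one base address is not enough, whereas a (file, segment, offset) triple can be turned back into an address using only that segment's header.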