From: Nathan T. <er...@cs...> - 2005-03-25 01:01:16
Ok, I've crawled through the OProfile code, inserted some extra printfs,
and think that I understand what is happening. Much of what I posted
previously is right, but my retraction pointed out the main error.

Here's my basic question: OProfile stores samples as file offsets (and
the offsets are different depending on whether the image is in kernel
or user space). I've figured out how to recover kernel sample offsets
(explained below), but it appears the user sample offsets are not very
useful because the point from which offsets are recorded is not saved.
Is this correct? Is there a way to retrieve the info?

My revised assessment of the issue follows. The corrected parts are
unquoted.

> As briefly explained earlier in this thread, I am wrestling with
> recovering sample offset values from OProfile. As a clarification
> to that earlier post, I offer the following.
>
>
> What we want: the uninterpreted OProfile data. This would be
> similar to "opreport -d", but we want all of the data, not just
> that associated with symbols. In other words, for each load
> module, we want all samples and counts for each event.
>
>
> Here is a diagram of how a binary image (linked app or DSO) is
> represented in memory along with some interesting address points to
> clarify exactly how sample offsets are represented.
>
>   A  B    C                D
>   |--|----|----------------|------------------------
>           txt              sample(offset,count)
>
>   A = address of beginning of image (may or may not equal B)
>   B = load address
>   C = address of first text section (may or may not equal B)
>   D = VMA, virtual memory address
>
> Goal: Ability to recover fully resolved VMA (for linked binaries)
>   or unrelocated VMA (for DSOs) for use with binutils.
>     resolved VMA: D
>     relative VMA: D - B
>
>
> Now I will summarize how offsets are represented in OProfile and
> hpcrun (HPCToolkit's process based profiler). Notice that I argue that
> some of OProfile's code documentation is in error.
>
> OProfile:
    sample offset: (cf. profile.h:112)
      D - C for kernel images
      D - B for user images
    text offset (cf. op_bfd::get_start_offset()):
      C - A
    sample 'VMA' (cf. profile_t::const_iterator::vma()):
                 (cf. profile_t::add_sample_file)
      D - A for kernel images (adds text offset)
      D - B for user images (does not add text offset)

    ==> These 'VMAs' are not very useful!
    ==> It is possible to:
        1. ignore text offset
        2. find text vma which is load address for kernel images
      This gives:
        (D - C) + C = vma for kernel images
        (D - B) = relative vma for DSOs
        (D - B) = junk for linked apps
        ^^^^^^^^^^^^^^^^^^^^
> hpcrun:
>   sample offset: D - B
>   load address (B) is saved
>
>   ==> both needed values are readily available
>
>
> OProfile is able to find the VMAs, but it circuitously forces the
> data to be associated with the symbol table. This is a summary of
> the algorithm for "opreport -d"
> (cf. profile_container::add_samples)
>
>   Given an image and its corresponding samples
>   for each symbol sym in image's symbol table
>     base_vma := bfd_vma(sym) // resolved or relative
>     (sym_beg, sym_end) := symbol_range(sym) // file offsets
>     for each sample s associated with the range (sym_beg, sym_end)
        // how does this vma calculation make sense? the offsets
        // can mean *different* things
        vma := base_vma + (s.vma -            // D - A/B, as shown above
                           file_offset(sym))  // D - A, by def and test
        print(vma, s.count)

So, the algorithm may figure out VMAs, but it
  0. seems to compare two possibly different offsets
  1. ignores OProfile's useless 'VMA' (D - A/B)
  2. requires binutils to obtain a useful VMA
  3. requires samples to be associated with a symbol
  4. consequently it ignores samples that are e.g. before the first
     symbol
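
To make the offset arithmetic and the per-symbol attribution concrete,
here is a minimal Python sketch. It is only an illustration of the
summary above: the names (Image, stored_offset, recover_vma,
split_by_symbol) and every address in it are made up, and none of it is
OProfile's actual API or data layout.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Made-up addresses for one image, lettered as in the diagram."""
    A: int        # beginning of image
    B: int        # load address
    C: int        # first text section
    kernel: bool  # kernel image or user image


def stored_offset(img: Image, D: int) -> int:
    """Offset as the summary says it is recorded: D - C (kernel), D - B (user)."""
    return D - (img.C if img.kernel else img.B)


def recover_vma(offset: int, base: int) -> int:
    """Add back the point the offset was measured from (C or B)."""
    return offset + base


def split_by_symbol(samples, symbols):
    """Attribute {offset: count} samples to (name, beg, end) symbol ranges.

    Returns (per-symbol counts, samples that fall in no symbol range).
    """
    counts, dropped = {}, {}
    for off, n in samples.items():
        for name, beg, end in symbols:
            if beg <= off < end:
                counts[name] = counts.get(name, 0) + n
                break
        else:
            # No symbol range contains this offset.
            dropped[off] = n
    return counts, dropped


# Hypothetical images and sample addresses, chosen only for the example.
kernel = Image(A=0x1000, B=0x1000, C=0x1400, kernel=True)
dso = Image(A=0x40000000, B=0x40000000, C=0x40000400, kernel=False)

k_off = stored_offset(kernel, 0x1480)    # D - C
u_off = stored_offset(dso, 0x40000520)   # D - B, the relative VMA for a DSO

counts, dropped = split_by_symbol(
    {0x10: 3, 0x120: 5, 0x180: 2},
    [("f", 0x100, 0x150), ("g", 0x150, 0x200)],
)
```

With these values, recover_vma(k_off, kernel.C) gives back the kernel
sample's D, since C can be found from the image. The user-space case
would need B, which is the value that does not appear to be saved. The
sample at offset 0x10 lies before the first symbol and lands in
"dropped", which is the situation in point 4.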