From: Nathan T. <er...@cs...> - 2005-03-23 19:26:20
|
Hi, I am writing a tool to convert OProfile data into something that we can process with HPCToolkit (http://www.hipersoft.rice.edu/hpctoolkit/). When oprofile collects samples, it converts the VMAs to an offset from the beginning of the mapped binary image, or the load address. Is there an easy way to recover this load address value from the sample file? After trying 'opreport -d', I looked around populate.cpp et al. to try to figure out how it calculated VMAs. (This is not so important for DSOs, but is for executable binaries.) However, I found that this is done by using binutils to extract the symbol address and playing some games with it. I was hoping there might be a more direct and efficient way. thanks in advance, -Nathan Tallent. |
From: William C. <wc...@re...> - 2005-03-23 20:31:32
|
Nathan Tallent wrote: > Hi, > > I am writing a tool to convert OProfile data into something that we can > process with HPCToolkit (http://www.hipersoft.rice.edu/hpctoolkit/). > > When oprofile collects samples, it converts the VMAs to an offset from > the beginning of the mapped binary image, or the load address. Is there > an easy way to recover this load address value from the sample file? > > After trying 'opreport -d', I looked around populate.cpp et al. to try > to figure out how it calculated VMAs. (This is not so important for > DSOs, but is for executable binaries.) However, I found that this is > done by using binutils to extract the symbol address and playing some > games with it. I was hoping there might be a more direct and efficient > way. > > thanks in advance, > -Nathan Tallent. The kernel converts the VMA into a dentry and file offset. OProfile doesn't make the VMA available because load addresses could vary, particularly for shared libraries. OProfile accumulates the samples, so VMA from different processes are not going to match up. The hpctoolkit expects things to be VMA addresses? What kind of interface does hpctoolkit need/want? There has been some previous discussion on interface to other tools. Tools like eclipse need a similar interface. It would be good to design the interface to simplify exporting the data to other tools and have a shared library useable by the outside tools. --Will |
From: Nathan T. <er...@cs...> - 2005-03-23 20:49:42
|
Will, Thanks for your response. > The kernel converts the VMA into a dentry and file offset. OProfile > doesn't make the VMA available because load addresses could vary, > particularly for shared libraries. OProfile accumulates the samples, so > VMA from different processes are not going to match up. VMAs will be different for DSOs, though not for linked binaries. In this case, the code will always load at the same virtual address. > The hpctoolkit expects things to be VMA addresses? What kind of > interface does hpctoolkit need/want? There has been some previous The output of opreport -d is almost perfect, except I do not want to be restricted by what is in the symbol table. I basically want: image, <counter, sample period, etc> offset : count ... However, I want to be able to query these offsets for debugging information. Thus, for non DSOs, I'd need the actual vma in order to do the binutils line-lookup. > discussion on interface to other tools. Tools like eclipse need a > similar interface. It would be good to design the interface to simplify > exporting the data to other tools and have a shared library useable by > the outside tools. This sounds like a good idea. We have been thinking about doing just this, but weren't sure you would be interested in maintaining it. -Nathan. |
From: John L. <le...@mo...> - 2005-03-24 02:15:57
|
On Wed, Mar 23, 2005 at 02:49:31PM -0600, Nathan Tallent wrote: > >discussion on interface to other tools. Tools like eclipse need a > >similar interface. It would be good to design the interface to simplify > >exporting the data to other tools and have a shared library useable by > >the outside tools. > > This sounds like a good idea. We have been thinking about doing just > this, but weren't sure you would be interested in maintaining it. We'd be very interested in this, but it would have to be done right: a maintainable, stable, interface that abstracts the details at the right level. Not an easy job. regards john |
From: Nathan T. <er...@cs...> - 2005-03-24 03:49:01
|
Nathan Tallent wrote:
> The output of opreport -d is almost perfect, except I do not want to be
> restricted by what is in the symbol table. I basically want:
>
>   image, <counter, sample period, etc>
>     offset : count
>     ...
>
> However, I want to be able to query these offsets for debugging
> information. Thus, for non DSOs, I'd need the actual vma in order to do
> the binutils line-lookup.

As briefly explained earlier in this thread, I am wrestling with recovering sample offset values from OProfile. As a clarification to that earlier post, I offer the following.

What we want: the uninterpreted OProfile data. This would be similar to "opreport -d", but we want all of the data, not just that associated with symbols. In other words, for each load module, we want all samples and counts for each event.

Here is a diagram of how a binary image (linked app or DSO) is represented in memory along with some interesting address points to clarify exactly how sample offsets are represented.

  A  B    C                D
  |--|----|----------------|------------------------
          txt              sample(offset,count)

  A = address of beginning of image (may or may not equal B)
  B = load address
  C = address of first text section (may or may not equal B)
  D = VMA, virtual memory address

Goal: Ability to recover fully resolved VMA (for linked binaries) or unrelocated VMA (for DSOs) for use with binutils.
  resolved VMA: D
  relative VMA: D - B

Now I will summarize how offsets are represented in OProfile and hpcrun (HPCToolkit's process based profiler). Notice that I argue that some of Oprofile's code documentation is in error.
OProfile:
  sample offset: D - C
    profile.h:112 claims the sample offset is:
      D - C for kernel and kernel modules
      D - B for user space images
    however, I cannot see how this can be true when the code always
    adds (C - A) to the offset to get the sample 'VMA' (populate.cpp:64)
  text offset, C - A, is given by op_bfd::get_start_offset()
  sample 'VMA', D - A, is given by profile_t::const_iterator::vma

  ==> if A == B, then the sample VMA is D - B (something we want)
  ==> otherwise we cannot find either D - B or D easily (must use the symbol table; see below)

hpcrun:
  sample offset: D - B
  load address (B) is saved

  ==> both needed values are readily available

OProfile is able to find the VMAs, but it circuitously forces the data to be associated with the symbol table. This is a summary of the algorithm for "opreport -d" (cf. profile_container::add_samples)

  Given an image and its corresponding samples
  for each symbol sym in image's symbol table
    base_vma := bfd_vma(sym)                  // resolved or relative
    (sym_beg, sym_end) := symbol_range(sym)   // file offsets
    for each sample s associated with the range (sym_beg, sym_end)
      vma := base_vma + (s.vma -              // D - A, as shown above
                         file_offset(sym))    // D - A, by def.
      print(vma, s.count)

So, the algorithm does figure out VMAs, but it
  1. ignores OProfile's useless 'VMA' (D - A)
  2. requires binutils to obtain a useful VMA
  3. requires samples to be associated with a symbol
  4. consequently it ignores samples that are e.g. before the first symbol

-Nathan Tallent. |
From: Nathan T. <er...@cs...> - 2005-03-24 16:02:55
|
Nathan Tallent wrote: > As briefly explained earlier in this thread, I am wrestling with > recovering sample offset values from OProfile. As a clarification > to that earlier post, I offer the following. [snip] > Now I will summarize how offsets are represented in OProfile and > hpcrun (HPCToolkit's process based profiler). Notice that I argue that > some of Oprofile's code documentation is in error. > > OProfile: > sample offset: D - C > profile.h:112 claims the sample offset is: Retraction: I see that I missed some logic in profile.cpp and that the comment at profile.h:112 *is* correct. I am working on figuring out exactly what is happening. apologies, -Nathan. |
From: Nathan T. <er...@cs...> - 2005-03-25 01:01:16
|
Ok, I've crawled through OProfile code, inserted some extra printfs and think that I understand what is happening. Much of what I posted previously is right, but my retraction pointed out the main error.

Here's my basic question: OProfile stores samples as file offsets (and the offsets are different depending on whether the image is in kernel or user space). I've figured out how to recover kernel samples offsets (explained below), but it appears the user sample offsets are not very useful because the point from which offsets are recorded is not saved. Is this correct? Is there a way to retrieve the info?

My revised assessment of the issue follows. The corrected parts are unquoted.

> As briefly explained earlier in this thread, I am wrestling with
> recovering sample offset values from OProfile. As a clarification
> to that earlier post, I offer the following.
>
>
> What we want: the uninterpreted OProfile data. This would be
> similar to "opreport -d", but we want all of the data, not just
> that associated with symbols. In other words, for each load
> module, we want all samples and counts for each event.
>
>
> Here is a diagram of how a binary image (linked app or DSO) is
> represented in memory along with some interesting address points to
> clarify exactly how sample offsets are represented.
>
>   A  B    C                D
>   |--|----|----------------|------------------------
>           txt              sample(offset,count)
>
>   A = address of beginning of image (may or may not equal B)
>   B = load address
>   C = address of first text section (may or may not equal B)
>   D = VMA, virtual memory address
>
> Goal: Ability to recover fully resolved VMA (for linked binaries)
> or unrelocated VMA (for DSOs) for use with binutils.
>   resolved VMA: D
>   relative VMA: D - B
>
>
> Now I will summarize how offsets are represented in OProfile and
> hpcrun (HPCToolkit's process based profiler). Notice that I argue that
> some of Oprofile's code documentation is in error.
>
> OProfile:

  sample offset: (cf. profile.h:112)
    D - C for kernel images
    D - B for user images
  text offset (cf. op_bfd::get_start_offset()):
    C - A
  sample 'VMA' (cf. profile_t::const_iterator::vma()):
               (cf. profile_t::add_sample_file)
    D - A for kernel images (adds text offset)
    D - B for user images (does not add text offset)

  ==> These 'VMAs' are not very useful!
  ==> It is possible to:
        1. ignore text offset
        2. find text vma which is load address for kernel images
      This gives:
        (D - C) + C = vma for kernel images
        (D - B) = relative vma for DSOs
        (D - B) = junk for linked apps
        ^^^^^^^^^^^^^^^^^^^^

> hpcrun:
>   sample offset: D - B
>   load address (B) is saved
>
>   ==> both needed values are readily available
>
>
> OProfile is able to find the VMAs, but it circuitously forces the
> data to be associated with the symbol table. This is a summary of
> the algorithm for "opreport -d"
> (cf. profile_container::add_samples)
>
>   Given an image and its corresponding samples
>   for each symbol sym in image's symbol table
>     base_vma := bfd_vma(sym)                  // resolved or relative
>     (sym_beg, sym_end) := symbol_range(sym)   // file offsets
>     for each sample s associated with the range (sym_beg, sym_end)

        // how does this vma calculation make sense? the offsets
        // can mean *different* things
        vma := base_vma + (s.vma -              // D - A/B, as shown above
                           file_offset(sym))    // D - A, by def and test
        print(vma, s.count)

So, the algorithm may figure out VMAs, but it
  0. seems to compare two possibly different offsets
  1. ignores OProfile's useless 'VMA' (D - A/B)
  2. requires binutils to obtain a useful VMA
  3. requires samples to be associated with a symbol
  4. consequently it ignores samples that are e.g. before the first symbol |
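Editorial sketch (not part of the original message): the symbol-driven loop summarized in the pseudocode above might look roughly like this in Python. The `Symbol` class, `symbol_based_vmas`, and all numeric values are hypothetical illustrations, not OProfile's actual classes or data. The sketch also assumes that the sample offset and the symbol's file offset are measured from the same origin, which is exactly the point under question in the message.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    vma: int          # address taken from the symbol table (hypothetical value)
    file_offset: int  # file offset at which the symbol starts
    size: int         # symbol length in bytes


def symbol_based_vmas(symbols, samples):
    """Map sample offsets to VMAs, but only for samples inside a symbol.

    samples is a dict of {file_offset: count}. Samples that fall outside
    every symbol range are silently dropped, mirroring item 4 in the list.
    """
    out = {}
    for sym in symbols:
        beg, end = sym.file_offset, sym.file_offset + sym.size
        for off, count in samples.items():
            if beg <= off < end:
                # symbol VMA plus the distance of the sample into the symbol
                out[sym.vma + (off - sym.file_offset)] = count
    return out


symbols = [Symbol("main", vma=0x8048400, file_offset=0x400, size=0x40)]
samples = {0x3F0: 5, 0x410: 7}   # 0x3F0 lies before the first symbol
result = symbol_based_vmas(symbols, samples)
print({hex(v): c for v, c in result.items()})
```

In this made-up data set the sample at offset 0x3F0 never appears in the result, which illustrates why a conversion that does not depend on the symbol table is being asked for.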
From: John L. <le...@mo...> - 2005-03-26 00:08:05
|
On Thu, Mar 24, 2005 at 07:01:12PM -0600, Nathan Tallent wrote: > Here's my basic question: OProfile stores samples as file offsets (and > the offsets are different depending on whether the image is in kernel or > user space). I've figured out how to recover kernel samples offsets > (explained below), but it appears the user sample offsets are not very > useful because the point from which offsets are recorded is not saved. > Is this correct? Is there a way to retrieve the info? I don't understand your question. As you state above, the samples are stored as file offsets. So the "point from which offsets are recorded" is ... the start of the file. Perhaps you're asking "given a profile_t::samples_range() iterator, how do I get the section info for each sample" ?? You'd have to iterate through the BFD sections, looking at ->filepos and size of the section that delimit the offset value, then looking up the ELF info in the asection as needed. regards john |
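Editorial sketch (not part of the original message): the section lookup described above, written as plain Python rather than against the real BFD API. The `Section` class only models the two fields the message names (`filepos` and size); the section names and numbers are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Section:
    name: str
    filepos: int  # file offset where the section's contents begin
    size: int     # section length in bytes


def find_section(sections: Sequence[Section], offset: int) -> Optional[Section]:
    """Return the section whose [filepos, filepos + size) range holds offset."""
    for sec in sections:
        if sec.filepos <= offset < sec.filepos + sec.size:
            return sec
    return None  # offset lies in headers, padding, or past the end


# Hypothetical section table for illustration only.
sections = [
    Section(".init", filepos=0x2D0, size=0x17),
    Section(".text", filepos=0x400, size=0x1A4),
]

hit = find_section(sections, 0x410)
miss = find_section(sections, 0x100)
print(hit.name if hit else None, miss)
```

A linear scan is enough here because an image has few sections; a tool handling many samples could sort the sections once and use binary search instead.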
From: Nathan T. <er...@cs...> - 2005-03-26 00:36:50
|
John,

Thanks for your response.

John Levon wrote:
> On Thu, Mar 24, 2005 at 07:01:12PM -0600, Nathan Tallent wrote:
>
>>Here's my basic question: OProfile stores samples as file offsets (and
>>the offsets are different depending on whether the image is in kernel or
>>user space). I've figured out how to recover kernel samples offsets
>>(explained below), but it appears the user sample offsets are not very
>>useful because the point from which offsets are recorded is not saved.
>>Is this correct? Is there a way to retrieve the info?
>
>
> I don't understand your question. As you state above, the samples are
> stored as file offsets. So the "point from which offsets are recorded"
> is ... the start of the file.

No, it is not the start of the file (point A). This is what I was trying to explain with the diagram from my previous mail:

  A  B    C                D
  |--|----|----------------|------------------------
          txt              sample(offset,count)

I say this because:

1. I've read the code.

2. From the Internals Manual, chapter 3, section 6:

   "When a mapping is found that contains the PC value...
   ...we alter the absolute value such that it is an offset from the start of the mapping (the mapping need not start at the start of the actual file, so we have to consider the offset value of the mapping)."

3. From comments in the source code: (cf. profile.h:112)

   ... This is done because
   * we use the information provided in /proc/ksyms, which only gives
   * the mapped position of .text, and the symbol _text from
   * vmlinux. This value is used to fix up the sample offsets
   * for kernel code as a result of this difference (in user-space
   * samples, the sample offset is from the start of the mapped
   * file, as seen in /proc/pid/maps).

> Perhaps you're asking "given a profile_t::samples_range() iterator, how
> do I get the section info for each sample" ??

No.
What I would like -- and what is necessary for the data to be useful for other tools -- is to convert the offsets into something that can serve as a persistent and easy index, such as the VMA reported by objdump (which, of course, is the VMA stored in ELF section headers). "opreport -d" currently cannot figure out this VMA unless it has a symbol from which it can extract this value. It then adds the symbol's VMA to the difference between two offsets. (This is explained in the pseudo code I gave for "opreport -d"). Please also see the bug report I submitted not long ago. There probably is a way to know exactly what image position becomes offset 0 when mapped by the loader. This would provide an offset that is actually from the beginning of the file. I've been looking into this, but haven't found anything definite yet. hope this is clearer, -Nathan. |
From: John L. <le...@mo...> - 2005-03-26 01:05:51
|
On Fri, Mar 25, 2005 at 06:36:40PM -0600, Nathan Tallent wrote:

> >I don't understand your question. As you state above, the samples are
> >stored as file offsets. So the "point from which offsets are recorded"
> >is ... the start of the file.
>
> No, it is not the start of the file (point A). This is what I was

Let's ignore kernel files for a moment and consider a shared library mapped into a process's address space. In the kernel, the oprofile routines receive a PC value. This is matched against a vm_area_struct that contains the PC value, and then the offset value is calculated as follows:

  250 *offset = (vma->vm_pgoff << PAGE_SHIFT) + addr - vma->vm_start;

(see drivers/oprofile/buffer_sync.c)

So, the offset we feed into userspace is indeed the absolute file offset against 'A' on your diagram:

> A  B    C                D
> |--|----|----------------|------------------------
>         txt              sample(offset,count)

If you were to open the binary image with a hex editor and go to the file offset, you'd find the instruction that was at the PC value in the process's memory image.

> 2. From the Internals Manual, chapter 3, section 6:

This is talking about line 250 above.

> >Perhaps you're asking "given a profile_t::samples_range() iterator, how
> >do I get the section info for each sample" ??
>
> No. What I would like -- and what is necessary for the data to be
> useful for other tools -- is to convert the offsets into something that
> can serve as a persistent and easy index, such as the VMA reported by
> objdump (which, of course is the VMA stored in ELF section headers).

You're asking for exactly what I said. You want to start off with profile_t::samples_range(), and for each sample, locate its section in the BFD data. The section's ->filepos vs. the sample offset will give you the section-relative offset of the sample. From there, you can add on the load address of the same section to get the relevant run-time VMA.
> There probably is a way to know exactly what image position becomes > offset 0 when map'ed by the loader. This is already dealt with by line 250 above, plus the calculation I mention in the previous paragraph. regards, john |
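Editorial sketch (not part of the original message): the arithmetic on line 250 of the kernel source quoted above, restated in Python to show why the result is an offset from the start of the file and not from the start of the mapping. `PAGE_SHIFT = 12` (4 KiB pages) and the mapping addresses are assumptions chosen for illustration; `sample_file_offset` is a hypothetical helper name.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def sample_file_offset(addr: int, vm_start: int, vm_pgoff: int) -> int:
    """File offset of the instruction at addr inside one mapping.

    vm_start is the virtual address where the mapping begins and vm_pgoff
    is the page index within the file at which the mapping starts, so
    (vm_pgoff << PAGE_SHIFT) is the file offset of the mapping's first byte.
    """
    return (vm_pgoff << PAGE_SHIFT) + addr - vm_start


# Hypothetical mapping that starts one page into the file.
off_text = sample_file_offset(addr=0x40001234, vm_start=0x40001000, vm_pgoff=1)

# Hypothetical mapping that starts at the beginning of the file.
off_head = sample_file_offset(addr=0x40000234, vm_start=0x40000000, vm_pgoff=0)

print(hex(off_text), hex(off_head))
```

In the first case the distance from the start of the mapping is only 0x234, but the page offset term moves the result to 0x1234, the position of the instruction in the file itself.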
From: Nathan T. <er...@cs...> - 2005-03-28 16:18:04
|
John Levon wrote: > So, the offset we feed into userspace is indeed the absolute file offset > against 'A' on your diagram: > > >> A B C D >> |--|----|----------------|------------------------ >> txt sample(offset,count) > > > If you were to open the binary image with a hex editor and go to the > file offset, you'd find the instruction that was at the PC value in the > process's memory image. Great. Thanks for the clarification here and in the documentation. It looks like the easy and efficient way of calculating a 'persistent VMA' D is to take this offset (D - A) and combine it with the offset of the first text section (C - A) and the persistent VMA (C) of the first text section: D = C + ((D - A) - (C - A)) That will work for both kernel and user space images as well as linked executables and DSOs. -Nathan. |
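Editorial sketch (not part of the original message): a worked instance of the formula D = C + ((D - A) - (C - A)) given above. The function name and all numbers are hypothetical. The sketch assumes the sample lies in a region with the same difference between VMA and file position as the first text section; a sample in a section laid out differently would need that section's own file position and VMA.

```python
def persistent_vma(sample_offset: int, text_file_offset: int, text_vma: int) -> int:
    """Apply D = C + ((D - A) - (C - A)).

    sample_offset    -- (D - A), the file offset recorded for the sample
    text_file_offset -- (C - A), file offset of the first text section
    text_vma         -- C, the VMA of that section from the section headers
    """
    return text_vma + (sample_offset - text_file_offset)


# Hypothetical linked executable: .text at file offset 0x400, VMA 0x8048400.
exe_vma = persistent_vma(0x410, text_file_offset=0x400, text_vma=0x8048400)

# Hypothetical DSO: .text at file offset 0x1100 with unrelocated VMA 0x1100.
dso_vma = persistent_vma(0x1234, text_file_offset=0x1100, text_vma=0x1100)

print(hex(exe_vma), hex(dso_vma))
```

For the executable the result is a fully resolved address, while for the DSO it is the unrelocated VMA, matching the two goals stated earlier in the thread.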
From: John L. <le...@mo...> - 2005-03-26 01:23:41
|
On Fri, Mar 25, 2005 at 06:36:40PM -0600, Nathan Tallent wrote: > "When a mapping is found that contains the PC value... ...we alter the > absolute value such that it is an offset from the start of the mapping ^^^^^^^ just noticed this typo. This should read "...from the start of the file being mapped..." I've fixed it in CVS. regards john |
From: Nathan T. <er...@cs...> - 2005-03-28 16:09:05
|
John Levon wrote: > On Fri, Mar 25, 2005 at 06:36:40PM -0600, Nathan Tallent wrote: > > >>"When a mapping is found that contains the PC value... ...we alter the >>absolute value such that it is an offset from the start of the mapping > > ^^^^^^^ > > just noticed this typo. This should read "...from the start of the file > being mapped..." > > I've fixed it in CVS. Ah-ha! That makes a big difference. Thanks for your help. -Nathan. |