From: <gr...@re...> - 2002-10-08 20:54:18
|
hi,

along our list of things to do is nailing down the file format of oprofile's devices and database files, and making sure they are platform portable and self-describing. this is basically just to make sure that debugging someone else's oprofile setup is plausible, even from another machine.

to this end, will suggested I write up an "abi document", which we could put into doc/ or something, which just specifies file formats. so then we would have something in english + diagrams, and explicit at the byte level, to compare structs, testsuites, and i/o functions against as we carry on development. So I have written a quick draft of such a document, which follows. Any comments or corrections? If it looks OK I'm going to start emitting patches which tighten up at least the database i/o routines to conform.

-graydon

---

oprofile ABI
~~~~~~~~~~~~

This document is a normative reference to the oprofile ABI, by which we mean the binary layout of two interfaces:

- the structure of the mmap()'ed device files used to move data between the kernel and the daemon

- the structure of the mmap()'ed database file used to store samples permanently on disk

The ABI must satisfy the following requirements:

- it must be reasonably simple, and represent roughly what the tools already do at the time of writing.

- it must have an unambiguous encoding at the byte-stream level.

- it must support 31, 32, 64 and possibly larger word-size architectures.

- it must support classical "big", "little", and possibly other endiannesses (word byte orders).

- the ABI must be reasonably self-identifying and self-describing.

This document places no requirements on the reading end; it is acceptable for a reading application to simply give up if some combination of input fields is confusing or unacceptable.
The purpose of this document is to lay out what must be *written*, so that a reading application has a chance to decide whether it can sensibly handle a file or not, and so that all information needed to decode the file is available even if the file has been moved to a different host environment.

Headers
~~~~~~~

The first 8 bytes of any "oprofile ABI" file are called the header, and establish a magic number, version codes (major versions being backwards-incompatible, minor being compatible), a width byte, and a reserved byte.

note-device header:

 0       1       2       3       4       5       6       7      8
 +-------+-------+-------+-------+-------+-------+-------+------+
 | 'o'   | 'p'   | 'n'   | 'd'   | major | minor | width | rsvd |

sample-device header:

 0       1       2       3       4       5       6       7      8
 +-------+-------+-------+-------+-------+-------+-------+------+
 | 'o'   | 'p'   | 's'   | 'd'   | major | minor | width | rsvd |

hash-map device header:

 0       1       2       3       4       5       6       7      8
 +-------+-------+-------+-------+-------+-------+-------+------+
 | 'o'   | 'p'   | 'h'   | 'd'   | major | minor | width | rsvd |

database file header:

 0       1       2       3       4       5       6       7      8
 +-------+-------+-------+-------+-------+-------+-------+------+
 | 'o'   | 'p'   | 'd'   | 'b'   | major | minor | width | rsvd |

The width byte specifies the width of subsequent words in the file, after the reserved byte. the width byte is an unsigned value, referred to hereafter as "wb", and is a count of bytes; for example, if wb=8, the remaining words in the file are 8 bytes (64 bits) each. the remainder of the file is a sequence of words of this size, for efficiency's sake.

The next wb bytes after the header are called the endianness word, referred to hereafter as "eb"; they specify a byte permutation which maps bytes in subsequent words of the file to sub-word bytes in memory, counting from the low-order byte. In other words, the "identity" interpretation of subsequent words in the file is as classical little-endian words, but any byte order may be specified using appropriate eb bytes.
Specifically, let the byte eb[0] be the first byte in the endianness word, and eb[wb-1] be the last byte in the endianness word. Then a word in memory can be faithfully constructed from any on-disk format, in any type of host memory, using this procedure (wb is the width byte, and next_byte() reads the next byte from the file):

word read_word(int const * eb) {
    word w = 0x0;
    for (int i = 0; i < wb; ++i) {
        w |= ((word)next_byte() << (eb[i] * 8));
    }
    return w;
}

It is illegal for any byte in the endian word to be greater than wb-1. Implementations may choose to use more efficient algorithms than the above (such as the identity function on IA32) if they recognize a compatible endianness in the endian word. All words in the remainder of the file must follow the encoding scheme of the endian word.

examples:

 platform                |  wb   | eb[0]..eb[wb-1]
-------------------------+-------+----------------------
 IA32 little endian      |   4   | 0 1 2 3
 IA64 little endian      |   8   | 0 1 2 3 4 5 6 7
 PPC32 big endian        |   4   | 3 2 1 0
 PPC64 big endian        |   8   | 7 6 5 4 3 2 1 0
 kooky word and endian   |   5   | 3 4 2 0 1

All remaining structures in either file, with the exception of the string pool in the hash map file, are sequences of words (of length wb bytes) following the endianness mapping of eb.

Note device file
~~~~~~~~~~~~~~~~

after the endianness word, the note device file is a dynamically produced stream of notes. there is no limit to the number of notes which occur. each note is 6 words, of the following form:

    [ address ]
    [ length ]
    [ offset ]
    [ hash of path ]
    [ pid ]
    [ note type ]

Sample device file
~~~~~~~~~~~~~~~~~~

after the endianness word, the sample device is a dynamically produced stream of sample buffers. there is no limit to the number of sample buffers which occur.
each sample buffer begins with a 3-word buffer head:

    [ cpu number ]
    [ sample count ]     - called sc hereafter
    [ profiler state ]

after the buffer head, there is a sequence of 4*sc words, divided into sc 4-word samples:

    [ sample 0    ] \
    [             ]  \__ one sample
    [             ]  /
    [             ] /
    [ sample 1    ] \
    [             ]  \__ one sample
    [             ]  /
    [             ] /
      ...
    [ sample sc-1 ] \
    [             ]  \__ one sample
    [             ]  /
    [             ] /

as depicted above, each sample is 4 words long, and contains the following fields:

    [ eip (instruction pointer) ]
    [ sample count ]
    [ hardware counter number ]
    [ pid (process ID) ]

Hash map file
~~~~~~~~~~~~~

after the endianness word, the hash map file has 2 words containing the sizes of the hashtable and string pool:

    [ size of hashtable ]   - called hsz hereafter
    [ size of stringpool ]  - called spsz hereafter

followed by (2 * hsz) words containing hash entries:

    [ entry 0     ] \_ one hash entry
    [             ]_/
    [ entry 1     ] \_ one hash entry
    [             ]_/
      ...
    [ entry hsz-1 ] \_ one hash entry
    [             ]_/

followed by spsz *bytes*, which is the string pool. strings are packed one after another, are not aligned, and are null-terminated.

as depicted above, each hash entry is 2 words long, and contains the following fields:

    [ name (index in string pool) ]
    [ parent (index in hashtable) ]

Database files
~~~~~~~~~~~~~~

after the endianness word, a DB file has the following 8 words, representing a database-specific header:

    [ total size (allocated) ]   - called tsz hereafter
    [ current size (in use) ]
    [ root page index ]
    [ page table size ]          - called ptsz hereafter
    [ padding / reserved 0 ]
    [ padding / reserved 1 ]
    [ padding / reserved 2 ]
    [ padding / reserved 3 ]

the database header is then followed by a sequence of tsz page table records. each page table record is a sequence of (3*ptsz)+2 words, divided into a 2-word header and ptsz 3-word page table entries (ptes):

    [ occupation count ]
    [ left page index ]
    [ page table entry 0      ] \
    [ ...                     ]  >-- one pte
    [ ...                     ] /
    [ page table entry 1      ] \
    [ ...                     ]  >-- one pte
    [ ...                     ] /
      ...
    [ page table entry ptsz-1 ] \
    [ ...                     ]  >-- one pte
    [ ...                     ] /

as depicted above, each page table entry is 3 words long, and contains the following fields:

    [ right page index ]
    [ database val (sample count) ]
    [ database key (eip) ]
|
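To make the endianness-word scheme above concrete, here is a runnable sketch of the read_word procedure, reading one wb-byte word from an in-memory buffer rather than a file (the buffer-based signature is my adaptation, not oprofile source): byte i on disk lands at byte eb[i] of the in-memory value, counting from the low-order byte.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one wb-byte on-disk word using the endianness map eb.
 * Byte i of the on-disk word contributes to byte eb[i] (counting
 * from the low-order byte) of the in-memory value. */
static unsigned long long read_word(const unsigned char *buf, size_t wb,
                                    const int *eb)
{
    unsigned long long w = 0;
    for (size_t i = 0; i < wb; ++i)
        w |= (unsigned long long)buf[i] << (eb[i] * 8);
    return w;
}
```

With the IA32 map {0,1,2,3} this is the identity little-endian read; with the PPC32 map {3,2,1,0} the very same loop decodes big-endian words.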
From: <gr...@re...> - 2002-10-09 13:20:57
|
doh! the database section of that document should of course read like *this*, not the one I posted yesterday.

-graydon

Database files
~~~~~~~~~~~~~~

after the endianness word, a DB file has a word specifying the length of the "custom" header:

    [ length of custom header ]   - called chsz hereafter

followed by chsz words of custom header, which are specific to an application using the database library. the custom header for an oprofile sample file is typically 21 words long. it is not necessary to interpret this header in order to manipulate a database file, but it is specified here for completeness. the layout of this header is as follows:

    [ magic ]
    [ version ]
    [ is_kernel ]
    [ counter event ]
    [ counter unit mask ]
    [ counter number ]
    [ cpu type number ]
    [ counter value ]
    [ cpu speed (floating point) ]
    [ mtime ]
    [ separate samples ]
    [ reserved 0 ]
    [ reserved 1 ]
      ...
    [ reserved 19 ]

after any custom header words, a DB file has the following 8 words, representing a database-file-specific header:

    [ total size (allocated) ]   - called tsz hereafter
    [ current size (in use) ]
    [ root page index ]
    [ page table size ]          - called ptsz hereafter
    [ padding / reserved 0 ]
    [ padding / reserved 1 ]
    [ padding / reserved 2 ]
    [ padding / reserved 3 ]

the database header is then followed by a sequence of tsz page table records. each page table record is a sequence of (3*ptsz)+2 words, divided into a 2-word header and ptsz 3-word page table entries (ptes):

    [ occupation count ]
    [ left page index ]
    [ page table entry 0      ] \
    [ ...                     ]  >-- one pte
    [ ...                     ] /
    [ page table entry 1      ] \
    [ ...                     ]  >-- one pte
    [ ...                     ] /
      ...
    [ page table entry ptsz-1 ] \
    [ ...                     ]  >-- one pte
    [ ...                     ] /

as depicted above, each page table entry is 3 words long, and contains the following fields:

    [ right page index ]
    [ database val (sample count) ]
    [ database key (eip) ]
|
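A minimal sketch of how a reader might walk the layout just described, assuming words have already been decoded to native unsigned longs; the struct and function names are hypothetical, only the field names come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Layout of a DB file, per the corrected description above:
 * [chsz] [chsz custom words] [8-word db header] [page table records...] */
struct db_layout {
    unsigned long chsz;         /* length of custom header, in words */
    unsigned long tsz;          /* total size (allocated) */
    unsigned long cur;          /* current size (in use) */
    unsigned long root;         /* root page index */
    unsigned long ptsz;         /* page table size */
    const unsigned long *pages; /* first page table record */
};

static struct db_layout parse_db(const unsigned long *words)
{
    struct db_layout l;
    l.chsz = words[0];
    /* skip the chsz-word custom header */
    const unsigned long *hdr = words + 1 + l.chsz;
    l.tsz  = hdr[0];
    l.cur  = hdr[1];
    l.root = hdr[2];
    l.ptsz = hdr[3];
    l.pages = hdr + 8;          /* 4 padding/reserved words follow ptsz */
    return l;
}

/* each page table record is (3*ptsz)+2 words long */
static size_t page_record_words(unsigned long ptsz)
{
    return 3 * (size_t)ptsz + 2;
}
```

Record k of the page table then starts at `l.pages + k * page_record_words(l.ptsz)`.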
From: John L. <le...@mo...> - 2002-10-09 14:40:47
|
On Wed, Oct 09, 2002 at 09:20:43AM -0400, gr...@re... wrote:

> after the endianness word, a DB file has a word specifying the length
> of the "custom" header:
>
>     [ length of custom header ]   - called chsz hereafter
>
> followed by chsz words of custom header, which are specific to an
> application using the database library. the custom header for an
> oprofile sample file is typically 21 words long. it is not necessary
> to interpret this header in order to manipulate a database file, but
> it is specified here for completeness. the layout of this header is as
> follows:
>
>     [ magic ]
>     [ version ]
>     [ is_kernel ]
>     [ counter event ]
>     [ counter unit mask ]
>     [ counter number ]
>     [ cpu type number ]
>     [ counter value ]
>     [ cpu speed (floating point) ]
>     [ mtime ]
>     [ separate samples ]
>     [ reserved 0 ]
>     [ reserved 1 ]
>       ...
>     [ reserved 19 ]

I don't understand what makes this a "custom" header ? It is just *the* header, surely ?

Note that we want to move away from using this header anyway. It's incomplete, and a simple .info ASCII file for the database file is preferred IMHO

> the database header is then followed by a sequence of tsz page table
> records. each page table record is a sequence of (3*ptsz)+2 words,
> divided into a 2-word header and ptsz 3-word page table entries
> (ptes):

I wonder if this part couldn't be expanded a little bit ?

regards
john

-- 
"Everything in the world runs through Birmingham, and gets stuck on New Street." - Brian Marsden
|
From: John L. <le...@mo...> - 2002-10-09 14:37:33
|
On Tue, Oct 08, 2002 at 04:54:06PM -0400, gr...@re... wrote:

> along our list of things to do is nailing down the file format of
> oprofile's devices and database files, and making sure they are
> platform portable and self-describing. this is basically just to make
> sure that debugging someone else's oprofile setup is plausible, even
> from another machine.

Good.

> note-device header:
>
> sample-device header:
>
> hash-map device header:

I'm completely lost here. Are you seriously suggesting we encode this stuff in the kernel/userspace interface ? Why ?

I can't think of a single reason that each of these devices needs a header (especially an endianness header)

regards
john

-- 
"Everything in the world runs through Birmingham, and gets stuck on New Street." - Brian Marsden
|
From: <gr...@re...> - 2002-10-09 18:01:28
|
At Wed, 9 Oct 2002 15:34:26 +0100, John Levon wrote:

> I'm completely lost here. Are you seriously suggesting we encode this
> stuff in the kernel/userspace interface ? Why ?

2 reasons:

1. because some platforms support bi-endian code and multiple word sizes, typically for backwards compatibility with binaries from previous generations (eg. sparc, mips, ppc, x86_64, ia64). if on the off chance you're using a module of one binary format, and a daemon of another, this will catch it.

2. because if something pathological happens on a customer machine you could have the engineer onsite record the device stream and mail it back to another engineer for analysis.

(I admit, they're not *terribly* compelling scenarios. but they exist.)

> I don't understand what makes this a "custom" header ? It is just *the*
> header, surely ?

libdb seems to me to have been designed to accept "any" header in here; the open call is accompanied by a byte count of the custom header, which libdb itself ignores, but permits its caller to examine. if I am wrong about how libdb is designed, then you're right, it's just "the" header.

> Note that we want to move away from using this header anyway. It's
> incomplete, and a simple .info ASCII file for the database file is
> preferred IMHO

ok. would you prefer it if I implemented code for .info files, rather than support this libdb "user specified header" thing? I could just axe that part of libdb too.

> > the database header is then followed by a sequence of tsz page table
> > records. each page table record is a sequence of (3*ptsz)+2 words,
> > divided into a 2-word header and ptsz 3-word page table entries
> > (ptes):
>
> I wonder if this part couldn't be expanded a little bit ?

ok, but. I don't really know what you're looking for. it says how many words there are and what to call each of them. what else should it say?

-graydon
|
From: John L. <le...@mo...> - 2002-10-09 18:13:37
|
On Wed, Oct 09, 2002 at 02:01:19PM -0400, gr...@re... wrote:

> 1. because some platforms support bi-endian code and multiple word
> sizes, typically for backwards compatibility with binaries from
> previous generations (eg. sparc, mips, ppc, x86_64, ia64). if on the
> off chance you're using a module of one binary format, and a daemon of
> another, this will catch it.

This can only happen if a user who doesn't know what they're doing mis-installs stuff themselves. I'd rather see them have the pain temporarily, than oprofile source permanently.

> 2. because if something pathological happens on a customer machine you
> could have the engineer onsite record the device stream and mail it
> back to another engineer for analysis.

Then you can write a short utility to post-process the stream. You will have contact with the user, so can get the info you need to know how it's encoded.

> (I admit, they're not *terribly* compelling scenarios. but they exist.)

Sure. But I don't think the extra complexity is worth it for kernel code.

> libdb seems to me to have been designed to accept "any" header in
> here; the open call is accompanied by a byte count of the custom
> header, which libdb itself ignores, but permits its caller to examine.

OK, you're probably right, I don't know the libdb code.

> > Note that we want to move away from using this header anyway. It's
> > incomplete, and a simple .info ASCII file for the database file is
> > preferred IMHO
>
> ok. would you prefer it if I implemented code for .info files, rather
> than support this libdb "user specified header" thing? I could just
> axe that part of libdb too.

I think so. We need a discussion of what goes in the info file, and how we guarantee unique sample file names (simply profileXXX, where XXX is a unique number, and we have profileXXX.info ?)
I'd prefer a simple .info format like :

binary: /bin/blah
event: CPU_CLK_UNHALTED
count: 100000

This change would ease the transition work to pp_interface (which I still want to do). In particular, we should drop the counter number - instead just describe the event setup for the profile file. Please have a read of pp_interface again for where I think things should be headed.

> ok, but. I don't really know what you're looking for. it says how many
> words there are and what to call each of them. what else should it
> say?

Perhaps just a little description of how the trees are linked together ... I admit I didn't look closely at this, I may be talking shit

regards
john

-- 
"Everything in the world runs through Birmingham, and gets stuck on New Street." - Brian Marsden
|
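The "key: value" format proposed above is trivial to parse. A sketch of a lookup over such text, to show how little code the .info approach needs on the reading side (the function name and signature are hypothetical; the format itself is still under discussion in this thread):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find the value for a "key: value" line in a .info-style text blob.
 * Copies the value (up to outsz-1 bytes) into out, NUL-terminated.
 * Returns 1 if the key was found, 0 otherwise. */
static int info_get(const char *text, const char *key,
                    char *out, size_t outsz)
{
    size_t klen = strlen(key);
    const char *p = text;
    while (p && *p) {
        if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
            const char *v = p + klen + 1;
            while (*v == ' ')          /* skip spaces after the colon */
                ++v;
            size_t n = strcspn(v, "\n");
            if (n >= outsz)
                n = outsz - 1;
            memcpy(out, v, n);
            out[n] = '\0';
            return 1;
        }
        p = strchr(p, '\n');           /* advance to the next line */
        if (p)
            ++p;
    }
    return 0;
}
```

Anything richer (repeated records per uid/pid, as Philippe suggests later) would only need a loop over line groups on top of this.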
From: Philippe E. <ph...@wa...> - 2002-10-09 18:41:00
|
John Levon wrote:

> On Wed, Oct 09, 2002 at 02:01:19PM -0400, gr...@re... wrote:
>
>> 1. because some platforms support bi-endian code and multiple word
>> sizes, typically for backwards compatibility with binaries from
>> previous generations (eg. sparc, mips, ppc, x86_64, ia64). if on the
>> off chance you're using a module of one binary format, and a daemon of
>> another, this will catch it.
>
> This can only happen if a user who doesn't know what they're doing
> mis-installs stuff themselves. I'd rather see them have the pain
> temporarily, than oprofile source permanently.

John and I agree on this: add complexity only if and when it is useful.

[snip ...]

>> libdb seems to me to have been designed to accept "any" header in
>> here; the open call is accompanied by a byte count of the custom
>> header, which libdb itself ignores, but permits its caller to examine.
>
> OK, you're probably right, I don't know the libdb code.

yeps, libdb receives sizeof(opd_header) as a parameter to db_open to know where the database proper starts in the file; I tried that to avoid dependencies on opd_header. Perhaps I'll relax that in future because libdb has dependencies on the samples count/eip word size anyway.

>>> Note that we want to move away from using this header anyway. It's
>>> incomplete, and a simple .info ASCII file for the database file is
>>> preferred IMHO
>>
>> ok. would you prefer it if I implemented code for .info files, rather
>> than support this libdb "user specified header" thing? I could just
>> axe that part of libdb too.
>
> I think so. We need a discussion of what goes in the info file, and how
> we guarantee unique sample file names (simply profileXXX, where XXX is a
> unique number, and we have profileXXX.info ?)
>
> I'd prefer a simple .info format like :
>
> binary: /bin/blah
> event: CPU_CLK_UNHALTED
> count: 100000

I was thinking about only one .info file:

uid: xxxx
pid: 123    <-- probably need the uid of the parent pid
ppid: 125   <-- ditto
binary: etc...

uid: yyyy
...
and sample files like:

samples-xxx

the new pp spec can be used as a guide of what we need.

Phil
|
From: <gr...@re...> - 2002-10-09 20:25:38
|
At Wed, 9 Oct 2002 19:10:20 +0100, John Levon wrote:

> On Wed, Oct 09, 2002 at 02:01:19PM -0400, gr...@re... wrote:
>
> This can only happen if a user who doesn't know what they're doing
> mis-installs stuff themselves. I'd rather see them have the pain
> temporarily, than oprofile source permanently.

ok. I'm just stating what our motivations are; users who don't know what they're doing are a relatively common feature of the world and we were trying to guard against them. I don't know if we have a plan for "maintainer disagrees with feature". perhaps we'll just not do it :)

> Then you can write a short utility to post-process the stream. You will
> have contact with the user, so can get the info you need to know how
> it's encoded.

possibly. makes me wonder though: maybe I don't bother making any of the db access code endian- or word-size agnostic at all; I just have it record what format it was written in in its header, and write a single tool, as you say, which can translate to any other word size and endianness. offline, separately. would that be better?

> I'd prefer a simple .info format like :
...
> Please have a read of pp_interface again for where I think things should
> be headed.
...
> Perhaps just a little description of how the trees are linked together
> ... I admit I didn't look closely at this, I may be talking shit

ok, perhaps this can be made to work. though it raises another (possibly heathen) idea in my mind: why are we even using a homebrew db at all? why not use gdbm or berkeley or something? they *do* allow nice quick random storage, unlike hdf5. and we could still store profile attributes *in* the file, as key/val pairs, w/o the need for separate .info files.

I don't mean to offend with all these suggestions; I just want to get the problems of storage (which are really beside the point of profiling) solved in ways which cost as little effort and trouble, both now and in the future, as possible.

-graydon
|
From: John L. <le...@mo...> - 2002-10-09 22:28:08
|
On Wed, Oct 09, 2002 at 04:25:26PM -0400, gr...@re... wrote:

> ok. I'm just stating what our motivations are; users who don't know
> what they're doing are a relatively common feature of the world and we
> were trying to guard against them. I don't know if we have a plan for
> "maintainer disagrees with feature". perhaps we'll just not do it :)

Well, I don't quite follow. You're going to ship this, right ? So the user has everything already set up when they install Red Hat ... to fuck this bit up takes quite a bit of effort on the part of the erring user, in my opinion.

I'd probably be OK with a patch that made oprofiled startup go livid if the detected endianness doesn't match what it should do, unless it's too ugly...

> possibly. makes me wonder though: maybe I don't bother making any of
> the db access code endian- or word-size agnostic at all, I just have
> it record what format it was written in in its header, and write a
> single tool, as you say, which can translate to any other word size
> and endianness. offline, separately. would that be better?

I'd prefer this immensely.

> ok, perhaps this can be made to work. though it raises another
> (possibly heathen) idea in my mind: why are we even using a homebrew
> db at all? why not use gdbm or berkeley or something? they *do* allow
> nice quick random storage, unlike hdf5. and we could still store
> profile attributes *in* the file, as key/val pairs, w/o the need for
> separate .info files.

How quick ? phe has stats on average insert times. I believe it's pretty fast ... if you can prove that sleepycat or whatever can match it, then sure, it's preferable to use standard code for obvious reasons.

> I don't mean to offend with all these suggestions; I just want to get

I think we're a little thicker skinned than that :) If stuff doesn't get questioned we won't get anywhere ...

regards
john

-- 
"Everything in the world runs through Birmingham, and gets stuck on New Street." - Brian Marsden
|
From: Philippe E. <ph...@wa...> - 2002-10-09 17:28:48
|
gr...@re... wrote:

> hi,
>
> along our list of things to do is nailing down the file format of
> oprofile's devices and database files, and making sure they are
> platform portable and self-describing. this is basically just to make
> sure that debugging someone else's oprofile setup is plausible, even
> from another machine.
>
> to this end, will suggested I write up an "abi document", which we
> could put into doc/ or something, which just specifies file
> formats. so then we would have something in english + diagrams, and
> explicit at the byte-level, to compare structs, testsuites, and i/o
> functions against as we carry on development. So I have written a
> quick draft of such a document, which follows. Any comments or
> corrections? If it looks OK I'm going to start emitting patches which
> tighten up at least the database i/o routines to conform.

Such a document is useful; most of it is fine except:

> Headers
> ~~~~~~~
>
> The first 8 bytes of any "oprofile ABI" file are called the header,
> and establish a magic number, version codes (major versions being
> backwards-incompatible, minor being compatible), a width byte, and a
> reserved byte.
>
> note-device header:
>
>  0       1       2       3       4       5       6       7      8
>  +-------+-------+-------+-------+-------+-------+-------+------+
>  | 'o'   | 'p'   | 'n'   | 'd'   | major | minor | width | rsvd |
>
> sample-device header:
>
>  0       1       2       3       4       5       6       7      8
>  +-------+-------+-------+-------+-------+-------+-------+------+
>  | 'o'   | 'p'   | 's'   | 'd'   | major | minor | width | rsvd |
>
> hash-map device header:
>
>  0       1       2       3       4       5       6       7      8
>  +-------+-------+-------+-------+-------+-------+-------+------+
>  | 'o'   | 'p'   | 'h'   | 'd'   | major | minor | width | rsvd |
>
> database file header:
>
>  0       1       2       3       4       5       6       7      8
>  +-------+-------+-------+-------+-------+-------+-------+------+
>  | 'o'   | 'p'   | 'd'   | 'b'   | major | minor | width | rsvd |

Take a bigger header with fixed fields; it's less error-prone.
Roughly something like this (the "\n1\r" part catches transfer-mode errors between binary and text mode):

    char magic[8] = "opdb\n1\r\0";
    char endian_encoding[8];
    char major;
    char minor;
    char samples_count_length;
    char instruction_pointer_length;
    char reserved[12];

> The width byte specifies the width of subsequent words in the file,
> after the reserved byte. the width byte is an unsigned value, referred
> to hereafter as "wb", and is a count of bytes; for example if the
> wb=8, the remaining words in the file are 8 bytes (64 bits) each. the
> remainder of the file is a sequence of words of this size, for
> efficiency sake.

Don't we need separate sizes for eip, samples count, page size, etc.?

> The next wb bytes after the header are called the endianness word,
> referred to hereafter as "eb"; they specify a byte-permutation which
> maps bytes in subsequent words of the file to sub-word bytes in
> memory, counting from the low-order byte. In other words, the
> "identity" interpretation of subsequent words in the file is as
> classical little-endian words, but any byte order may be specified
> using appropriate eb bytes.
>
> Specifically, let the byte eb[0] be the first byte in the endianness
> word, and eb[wb-1] be the last byte in the endianness word, then a
> word in memory can be faithfully constructed from any on-disk format,
> in any type of host memory, using this procedure:
>
> word read_word(int const * eb) {
>     long w = 0x0;
>     for(int i = 0; i < wb; ++i) {
>         w |= (next_byte() << (eb[i] * 8));
>     }
>     return w;
> }
>
> Obviously, it is illegal for any byte in the endian word to be greater
> than wb-1, and also that implementations may choose to use more
> efficient algorithms than the above (such as the identity function on
> IA32) if they recognize a compatible endianness in the endian word.

the libdb part that inserts samples into the database should work in native word/endianness mode. How will the pp tools deal with different word sizes?

regards,
Phil
|
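A small sketch of why the "\n1\r" bytes in the proposed magic are worth having: an ASCII-mode transfer (e.g. FTP in text mode) rewrites line endings, such as LF to CRLF, so a file damaged that way no longer matches the magic. The magic string is from the post above; the check function itself is hypothetical, not oprofile source.

```c
#include <assert.h>
#include <string.h>

/* Proposed magic: 7 explicit chars, the array's 8th byte is '\0'.
 * Embedding both '\n' and '\r' means any line-ending rewrite by a
 * text-mode transfer corrupts at least one magic byte. */
static const char opd_magic[8] = "opdb\n1\r";

/* Compare the first 8 bytes of a file against the magic. */
static int magic_ok(const void *file_start)
{
    return memcmp(file_start, opd_magic, sizeof opd_magic) == 0;
}
```

A reader can run this check before trusting any other header field; if it fails right after a network copy, a text-mode transfer is the first suspect.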
From: <gr...@re...> - 2002-10-09 18:10:36
|
At Wed, 09 Oct 2002 19:28:45 +0000, Philippe Elie wrote:

> "\n1\r" part catch transfer mode error between binary/text mode
> char magic[8] = "opdb\n1\r\0";
> char endian_encoding[8];
> char major;
> char minor;
> char samples_count_length;
> char instruction_pointer_length;
> char reserved[12];

hm. this is possible, but I do not like fixing the endianness encoding at 8 bytes; it only takes a tiny bit more logic to future-proof it. I'll add in the bit about catching binary vs. text mode; I hadn't thought of that.

> Don't we need separate size for eip, samples count, page size etc..

we might. I don't know. it seemed easy enough to me to just assume that the word size you chose for the file is big enough to fit the largest of those; I assume that is almost always the word size the host machine is fastest at processing. storing the size of every type strikes me as more complex than necessary. do you think the space savings would matter much?

> the libdb db part that insert samples in database should work in
> native word/endianess mode. How pp tools will deals with different
> words size ?

I was following the strategy of passing all db operations through word read/write functions which do the byte conversion when necessary, else read native words. I can do something else if you like. any suggestions?

-graydon
|
From: <gr...@re...> - 2002-10-10 15:30:11
|
hi,

I thought this over some more and ran some tests with various approaches, and as near as I can tell phil's point about mmap()'ed i/o having to happen natively in the db file is correct; byte-at-a-time simply kills performance. and in any case it would be very tedious to change *all* the i/o functions to any other "more neutral" format.

so instead, I've rewritten the ABI document to reflect your suggestions: dropping specification of the device files, and prepending a fixed-length header to the sample file which specifies the binary layout of the structures the compiler generated. I then propose to write 1 extra function and 1 extra tool:

- a function to check the layout, endianness and sizes of the current host's structures and return an error code if trying to open an incompatible file.

- a tool which translates from one layout/size/endianness specification to another, byte-by-byte.

I think this will probably be sufficient for our purposes. I've attached the rewritten ABI document (and am abandoning the theory that >64bit machines are important. by the time they are we can adjust the file version number).

(as a side note: berkeley DB, despite being a little slow for "live" use, seems common & useful enough that an exporter to it might be as useful or moreso than hdf5. shall I extend the hdf5 exporter to do so as well? maybe call it op_export?)

-graydon

oprofile sample file ABI
~~~~~~~~~~~~~~~~~~~~~~~~

This document is a normative reference to the oprofile sample file ABI, by which we mean the binary layout of the mmap()'ed buffer the daemon uses to store data.

ABI requirements:

- it must be reasonably simple, and represent roughly what the tools already do at the time of writing.

- it must have an unambiguous encoding at the byte-stream level.

- it must support 31, 32, and 64 bit architectures.

- it must support classical "big", "little", and possibly other endiannesses (word byte orders).

- the ABI must be reasonably self-identifying and self-describing.
Non-requirements:

- storage of all data in a "neutral" format. it is enough that it be a *described* native format, and probably must be to perform well.

- ability to be sensibly read by any reader. some readers will always find some read-formats impossible, but they should be able to *tell* when they're going to be reading incorrectly.

Header
~~~~~~

The first 64 bytes of the file are called the header, and establish a magic number, version codes (major versions being backwards-incompatible, minor being compatible), the widths and offsets of various fields within structures, and an endianness mapping for subsequent portions of the file.

  byte | contents
 ------+-----------------------------------------------
     0 | ascii 111 = 'o'
     1 | ascii 112 = 'p'
     2 | ascii 100 = 'd'
     3 | ascii  98 = 'b'
     4 | ascii  10 = '\n'
     5 | ascii  49 = '1'
     6 | ascii  13 = '\r'
     7 | ascii   0 = '\0'
     8 | major version number
     9 | minor version number
    10 | length of int type
    11 | length of unsigned int type
    12 | length of "page count" type
    13 | length of "page index" type
    14 | length of "value" type
    15 | length of "key" type
    16 | compiler padding in db_item_t structure type
    17 | compiler padding in db_page_t structure type
    18 | compiler padding in db_descr_t structure type
    19 | max number of pte's in a page (DB_MAX_PAGE)
    20 | amount of padding in descr (DB_PAD_DESCR)
    21 | offset of child_page within db_item_t
    22 | offset of info within db_item_t
    23 | offset of key within db_item_t
    24 | offset of count within db_page_t
    25 | offset of p0 within db_page_t
    26 | offset of page_table within db_page_t
    27 | offset of size within db_descr_t
    28 | offset of current_size within db_descr_t
    29 | offset of root_idx within db_descr_t
    30 | offset of padding within db_descr_t
 31-46 | reserved for future use
    47 | file offset at which descr occurs (lsb)
 48-53 | file offset at which descr occurs
    54 | file offset at which descr occurs (msb)
    56 | endianness map (lsb)
 57-62 | endianness map
    63 | endianness map (msb)

All offsets and lengths are in bytes. Compiler padding of a structure type S is the difference between sizeof(S) and the sum of the lengths of all the sub-structures of S. If compiler padding is worse than 256 bytes, try a compiler pragma which packs the structure a bit. File offsets are encoded as little-endian 64-bit unsigned integers.

The endianness map specifies the relationship between subsequent sub-word bytes in the file and their arithmetic significance on the host platform which generated them. Specifically, if eb is the endianness map, then an n-byte datum at disk address k can be read using the following algorithm:

unsigned long long
read_word_at(unsigned char * disk, off_t k, size_t n, int eb[8])
{
    unsigned long long datum = 0;
    for (; n--; ++k)
        datum |= ((unsigned long long)disk[k] << (eb[k & 7] * 8));
    return datum;
}

Subsequent items
~~~~~~~~~~~~~~~~

The remainder of the database file consists of structures, arrays, and indices. Two facts are worth emphasizing:

1. The overall size of each structure is considered to be the sum of sub-structure sizes _plus_ the compiler padding size for this structure (bytes 16-18 in the db header). This value is *not always equal* to the plain sum of sub-structure sizes, nor to the current compiler's opinion of sizeof(structure name).

2. Page indices are not the same as byte offsets. After the header, there are *no* concrete byte offsets in the file, only page indices. A page index must be multiplied by the size of the page structure, and added to the base of the page array, to acquire a byte offset.

DB Description
~~~~~~~~~~~~~~

The database may then have an application header or padding, followed by the database "descr" structure.
This structure states the size, occupation count, and root index of the
page tree. The descr structure contains the following fields (offsets
specified in the db header):

name          | byte length
--------------+-------------------------
size          | length of "page count" type
current_size  | length of "page count" type
root_idx      | length of "page index" type
padding       | length of int type * DB_PAD_DESCR

Page array
~~~~~~~~~~

Immediately following the "descr" structure, there is a contiguous
array of page structures, linked together (by array indices) in a
search tree. Each page contains the following fields:

name          | byte length
--------------+-------------------------
count         | length of "unsigned int" type
p0            | length of "page index" type
page_table    | size of db_item_t struct * DB_MAX_PAGE

Items
~~~~~

Each item within a page structure's page_table is a key -> value
association, as well as a possible index for a child page structure.
Each item structure contains the following fields:

name          | byte length
--------------+-------------------------
child_page    | length of "page index" type
info          | length of "value" type
key           | length of "key" type
From: John L. <le...@mo...> - 2002-10-10 16:53:03
On Thu, Oct 10, 2002 at 11:30:01AM -0400, gr...@re... wrote:

> so instead, I've rewritten the ABI document to reflect your
> suggestions: dropping specification of the device files, and
> prepending a fixed-length header to the sample file which specifies
> the binary layout of the structures the compiler generated. I then
> propose to write 1 extra function and 1 extra tool:
>
> - a function to check the layout, endianness and sizes of the current
>   host's structures and return an error code if trying to open an
>   incompatible file.
>
> - a tool which translates from one layout/size/endianness
>   specification to another, byte-by-byte.

Sounds good to me.

> (as a side note: berkeley DB, despite being a little slow for "live"
> use, seems common & useful enough that an exporter to it might be as
> useful or moreso than hdf5. shall I extend the hdf5 exporter to do so
> as well? maybe call it op_export?)

op_export sounds like a fine idea.

> 10 | length of int type

Phil ? This stuff look OK ?

> 16 | compiler padding in db_item_t structure type
> 17 | compiler padding in db_page_t structure type
> 18 | compiler padding in db_descr_t structure type

Why do we need padding rather than just sizeof() plus the offset values ?

> Specifically, if eb is the endianness map, then an n-byte datum from
> disk address k can be read using the following algorithm:

What platforms are we concerned about that aren't simple little/big
endian ?

> 1. That the overall size of the each structure is considered to be
>    the sum of sub-structure sizes _plus_ the compiler padding size
>    for this structure (byte numbers 16-18 in the db header). This
>    value is *not always equal* to the plain sum of sub-structure
>    sizes, nor the current compiler's opinion of sizeof(structure
>    name).

In fact... can't we write out each structure member individually ?

regards
john
--
"Everything in the world runs through Birmingham, and gets stuck on New
Street." - Brian Marsden
From: graydon h. <gr...@re...> - 2002-10-13 21:09:51
Attachments:
oprofile-abi.patch
On Thu, 2002-10-10 at 12:49, John Levon wrote:

> > - a function to check the layout, endianness and sizes of the current
> >   host's structures and return an error code if trying to open an
> >   incompatible file.
> >
> > - a tool which translates from one layout/size/endianness
> >   specification to another, byte-by-byte.
>
> Sounds good to me.

Ok, the first part I've now done. I have a small patch (attached) to
libdb which writes an ABI header into the db file, and causes db_open
to check for ABI compatibility, exiting when there is failure to match.
Otherwise it changes nothing about the way libdb works.

> ... stuff about using op_export w/o any ABI header on native file ...
>
> Graydon, what do you think ? Sorry I didn't object too much earlier,
> but I really think this would be the preferable approach.

Perhaps. Your point about flexibility for future change in the native
format is the only one I'd worry about, with this patch. There is no
performance impact at all, nor "99% path" complication, just a file
open check. But to address the "future-proofing" issue, I padded out a
good dozen words of space for ABI header additions, as well as put
version numbers in.

And I do feel that there is an advantage to this approach: a sample
file we retrieve from a user is in *exactly* the form it was written to
disk by the daemon, not modified by op_export.

It's your call, but this code at any rate does work and is pretty
minimal.

-graydon
From: John L. <le...@mo...> - 2002-10-14 01:44:50
On Sun, Oct 13, 2002 at 05:09:41PM -0400, graydon hoare wrote:

> Perhaps. Your point about flexibility for future change in the native
> format is the only one I'd worry about, with this patch. There is no
> performance impact at all

Fine. Performance obviously isn't an issue with this patch.

> , nor "99% path" complication

Well, there is. You're writing out a complicated header that nobody
needs in 99% of cases.

> And I do feel that there is an advantage to this approach: a sample file
> we retrieve from a user is in *exactly* the form it was written to disk
> by the daemon, not modified by op_export.

You're correct - this is an advantage to your approach. However, on
balance, I don't feel this outweighs the ugliness of this header stuff.

I really think that if you need this, it should be in op_export only -
that code will bootstrap its own knowledge of the native layout, and
write out all the ABI goop you need (or some other format). This
localises the stuff so it's only needed for the very rare cases.

And yes, you'll need an op_import too, but you are going to need that
*anyway* - else how could you run oprofpp or whatever ? It makes far
better sense to localise the inherent ugliness of reading non-native
structures into a simple translation tool, than include the stuff in
basic libdb.

The only problem that could require the "exact form" of the file would
be bugs in ABI writing OR reading. This remains true whether it's in
external op_import/export tools, or in the main libdb code. So what
changes ?

In summary: I'd definitely be OK with op_export/import in CVS, but I
don't think we want this in its present form.

regards
john
--
"That's just kitten-eating wrong." - Richard Henderson
From: graydon h. <gr...@re...> - 2002-10-14 20:32:50
On Sun, 2002-10-13 at 21:41, John Levon wrote:

> Well, there is. You're writing out a complicated header that nobody
> needs in 99% of cases.

Perhaps I was a bit too generous saying it adds "no" complexity; but
the complexity it adds is small. Nothing fundamental about libdb's i/o
is altered: after the first 64 bytes the file is byte-for-byte exactly
the same as it was. I just add a small header and a write/check pair
for that header. I don't think of this as a major penalty, but of
course you may differ in opinion on that.

> And yes, you'll need an op_import too, but you are going to need that
> *anyway* - else how could you run oprofpp or whatever ? It makes far
> better sense to localise the inherent ugliness of reading non-native
> structures into a simple translation tool, than include the stuff in
> basic libdb.

I'm not suggesting any code for "reading non-native structures" go into
libdb. You're right, that's yucky stuff that belongs in an "op_import"
tool. I'm merely asking that libdb be changed so that it always
describes the type of file it's producing. I want to avoid the
situation where I've got a file in my hands of *unknown* binary
structure.

> The only problem that could require the "exact form" of the file would
> be bugs in ABI writing OR reading. This remains true whether it's in
> external op_import/export tools, or in the main libdb code. So what
> changes ?

In your scenario, you can get this:

$ oprofpp -l /usr/bin/emacs
segmentation fault (core dumped)

and in our scenario, you will at absolute worst get this:

$ oprofpp -l /usr/bin/emacs
db_open: ABI layout incompatibility.

It's a simple matter of usability. In our case we know what to do, and
have all the information needed to do it right at hand: run op_import
and we're back in business. In your case, you *might* guess it's an ABI
bug, or you might not. But even if you do, you then have to go find the
copy of the oprofile tools which produced the sample and run op_export
and work off the result of that, and that might involve going back to
the customer. That's what we're trying to insulate against.

-graydon
From: John L. <le...@mo...> - 2002-10-14 22:22:13
On Mon, Oct 14, 2002 at 04:32:38PM -0400, graydon hoare wrote:

> Perhaps I was a bit too generous saying it adds "no" complexity; but the
> complexity it adds is small. Nothing fundamental about libdb's i/o is
> altered: after the first 64 bytes the file is byte-for-byte exactly the
> same as it was. I just add a small header and a write/check pair for
> that header. I don't think of this as a major penalty, but of course you
> may differ in opinion on that.

The code failed my "do I understand this immediately" test. When
something fails that test, it means it has a much higher hurdle to
prove its worth.

> I'm not suggesting any code for "reading non-native structures" go into
> libdb. You're right, that's yucky stuff that belongs in an "op_import"
> tool. I'm merely asking that libdb be changed so that it always
> describes the type of file it's producing. I want to avoid the situation
> where I've got a file in my hands of *unknown* binary structure.

When would that happen and why ? It is such a borderline case: you
would have to have your hands on a non-exported sample file that you
have lost all context to. It's not worth catering for.

> In your scenario, you can get this:
>
> $ oprofpp -l /usr/bin/emacs
> segmentation fault (core dumped)

Only if :

a) the user has fucked up somehow and is mixing non-native binaries for
their system. I don't care about the user fucking this up: they should
recompile oprofile and install it properly.

b) the user is incapable of running op_export. What circumstances can
this happen under ? The most obvious I can think of is ABI bugs, which
as I've said, applies in both cases anyway.

> copy of the oprofile tools which produced the sample and run op_export
> and work off the result of that, and that might involve going back to
> the customer. That's what we're trying to insulate against.

Sorry, I'm not that interested in your customers unless the resultant
feature is generally useful and does not uglify the source. This code
fails both tests. A separate op_export passes both.

With a separate op_export, you can still have op_export --show-abi -
the exact same information is STILL available to you if you really come
across an ABI bug. You need two pieces of information - the original
sample file, and the native ABI of the remote machine. All you are
proposing is to commingle (I love that word !) the two all the time,
rather than just when needed.

regards
john
--
"That's just kitten-eating wrong." - Richard Henderson
From: William C. <wc...@nc...> - 2002-10-14 21:34:54
Having headers definitely has benefit to oprofile. The lack of a header
will cause problems in the future. Currently, all the platforms are
ia32 and the expectation is that the analysis is done on the same
machine that the data was collected on. The embedded and carrier grade
linux platforms can benefit from the data that oprofile collects. For
these systems the data has to be moved elsewhere for analysis. It is
likely that there will be differences between the native data formats
used by the machine that collects the data and the native data format
used by the machine that analyzes the data (e.g. powerpc big-endian
embedded target and ia32 little-endian host for data analysis). The
header in the data files provides a means of catching those mismatch
problems, rather than silently doing the wrong thing or mysteriously
dying much later in the process.

Raw binary files without any information about the layout or data
format are a problem. They make it very hard to "do the right thing"
when migrating the data to another system. op_export and op_import
commands are going to be more error prone without the header
information because the user will have to specify the input file format
and there will be no way to check that it is correct.

It would be much better to address this host/target problem with
headers now, rather than have people complain about it a year from now
after someone ports OProfile to something that is not an ia32
processor, and only then change the oprofile data format to support
headers.

The overhead for the header is small and only occurs in the check that
the file is in the appropriate format. The code that reads in the data
does not need to change. Tools such as gprof have headers in their data
format and already do checks to avoid endian problems. This allows the
gmon.out file to be portable between machines:

http://sources.redhat.com/binutils/docs-2.12/gprof.info/File-Format.html

-Will
From: John L. <le...@mo...> - 2002-10-14 22:11:35
On Mon, Oct 14, 2002 at 04:45:13PM -0400, William Cohen wrote:

> Raw binary files without any information about the layout or data format
> are a problem. It makes it very hard to "do the right" when migrating
> the data to another system. op_export and op_import commands are going
> to be more error prone without the header information because the user
> will have to specify the input file format and there will be no way to
> check that is correct.

Huh ? The user runs the "op_export" tool that is installed. It
automatically knows the native layout, because it was built for that
target. op_export inserts all the header goop or whatever.

Yes, definitely, we want the ability to inspect sample files on
non-matching machines. This does not imply we have to have this header
goop everywhere.

> now. The overhead for the header is small and only occurs to check to
> see that the file is in the appropriate format. The code that reads in
> the data needs does not need to change.

You have to have something or you can't use oprofpp on a non-native
sample file, and the whole exercise is pointless. So we definitely have
to have an op_import (or the equivalent inlined in the libdb code). I
don't want either of these to be in the main code, and I don't think
phil does either.

regards
john
--
"That's just kitten-eating wrong." - Richard Henderson
From: William C. <wc...@nc...> - 2002-10-14 23:26:18
John Levon wrote:

> Huh ? The user runs the "op_export" tool that is installed. It
> automatically knows the native layout, because it was built for that
> target. op_export inserts all the header goop or whatever.
>
> Yes, definitely, we want the ability to inspect sample files on
> non-matching machines. This does not imply we have to have this header
> goop everywhere.

In most cases one wants to minimize the amount of work done on the
target system collecting the data. Embedded systems have limited
resources and making the embedded system do additional work is not
desired. The fewer/smaller executables on the embedded linux system the
better. Having op_export on the target specify the host for analysis is
the wrong place to do that work. That means the embedded machine has to
generate a new export for each possible machine doing the analysis.
op_export should just do the minimum work to package the data to move
off the data collection system. The translation work should be done in
op_import on the analysis system, where time and space are not an
issue.

The most reliable way to transfer that machine-specific information is
in a header in the same file. With the headers there is no chance for
the information about the data collection machine to get lost or be
misspecified by the person running the op_import program.

Most files on a Linux system have some type of header information in
the form of magic numbers to avoid misinterpreting the file. This
allows the system to identify errors like running an executable that is
for the wrong architecture. It also allows the Linux loader to "do the
right thing."

> > now. The overhead for the header is small and only occurs to check to
> > see that the file is in the appropriate format. The code that reads in
> > the data needs does not need to change.
>
> You have to have something or you can't use oprofpp on a non-native
> sample file, and the whole exercise is pointless. So we definitely have
> to have an op_import (or the equivalent inlined in the libdb code). I
> don't want either of these to be in the main code, and I don't think
> phil does either.

When oprofile gets accepted into the mainline linux code there are
going to be people who want oprofile to work across platforms. Without
the header information it is going to be difficult to make the data
files portable across platforms. Just assuming that the system's data
format is the format, eliminating the headers, and forcing the user to
specify the format leaves oprofile open to possible user errors.
OProfile should be robust. Having the headers in the data files will
really help OProfile catch errors.

-Will
From: Keith W. <ke...@tu...> - 2002-10-15 07:38:06
William Cohen wrote:

> Most files on a Linux system have some type of header information in
> the form of magic numbers to avoid misinterpreting the file. This
> allows the system to identify errors like running an executable that
> is for the wrong architecture. It also allows the Linux loader to "do
> the right thing."

Hmm. How about:

Generate at compile time a 32-bit hash of the abi data, and use that as
the magic number at the top of each generated file. Then oprofpp and
other tools that try to interpret the binary files will be able to
verify that the file is in the correct format. You still need the right
version of op_export, but at least it can tell if it's operating on a
file it understands (by comparing magic numbers).

... you get many of the benefits with a single compile-time generated
magic number instead of a header.

Keith
From: graydon h. <gr...@re...> - 2002-10-14 23:04:36
On Mon, 2002-10-14 at 16:45, William Cohen wrote:

> The embedded and carrier grade linux platforms
> can benefit from the data that oprofile collects. For these systems the
> data has to be moved elsewhere for analysis.

The central issue in the embedded space is, I think, that the user
can't always "go back and run some program" on the target board,
especially since the program might not even be present (for memory
needs) on the board. So the ABI information has to leave the daemon
automatically while it's running, not as a separate (possibly omitted)
pass.

On the other hand, John's point is that it adds some muck to the
database file format, making the contents of libdb/ slightly more
complex to understand, and is unnecessary in "99%" of cases where
analysis and use are happening on the same arch/compiler combo.

So we talked this over a bit on IRC and I think came to a reasonable
compromise, which is that the daemon can be made to emit a (textual,
extensible) ABI description to /var/lib/oprofile/abi when it starts up,
and the code for writing this description (and converting stuff between
ABIs) can all be isolated in a separate abi/ directory, with libabi.a
and op_export/op_import handling all the specifics, and the only change
to the daemon being a call to "output_abi_format()" in
oprofiled::main().

Users *will* be forced to collect /var/lib/oprofile/abi along with the
sample files, but that doesn't require running any extra tools on the
target board, and actually *reduces* the memory needs on an embedded
target since the abi file only needs to be written once, not prepended
to every sample file.

Good enough?

-graydon
From: William C. <wc...@nc...> - 2002-10-15 15:22:01
/var/lib/oprofile/abi sounds reasonable.

-Will
From: Philippe E. <ph...@wa...> - 2002-10-12 18:23:53
gr...@re... wrote:

[snip]

IMHO you should prioritize the test suite and op_export. Publishing an
ABI for the interface between dae/module/pp tools is useful, but it
also looks like a promise that we will remain compatible with this ABI
in the future; we can probably do that eventually, but oprofile is an
alpha version for now. The impact of db performance is greater in 2.5
oprofile (because we no longer have a hash table in the module), and
the ABI is likely to be broken if we want to improve db performance.

> Specifically, if eb is the endianness map, then an n-byte datum from
> disk address k can be read using the following algorithm:
>
> unsigned long long
> read_word_at(unsigned char * disk, off_t k, size_t n, int eb[8])
> {
>     unsigned long long datum = 0;
>     for (; n--; ++k)
>         datum |= (disk[k] << (eb[k & 7] * 8));
>     return datum;
> }

Do you suggest that pp tools must use unsigned long long as the type
for everything read from a samples file ?

regards,
Phil
From: John L. <le...@mo...> - 2002-10-12 18:40:01
On Sat, Oct 12, 2002 at 08:24:05PM +0000, Philippe Elie wrote:

> IMHO you would prioritize test suite and op_export, publying
> an abi for the interface between dae/module/pp tools is
> usefull but also look like a promise we remain compatible with
> this abi in future, we probably can do this in future but oprofile
> is an alpha version for now. Impact of db performance is greater
> in 2.5 oprofile (cause we no longer have a hash table in module).
> The abi is likely to be broken if we want to improve db
> performance.

I've been thinking too. What I think would be best is to have an
op_export that can export a platform-specific sample file into
something that can be debugged on another system.

Basically we build an op_export on the source machine, therefore it
"instinctively" understands the binary format. And it can then export
to a platform-agnostic format (which could have the ABI header stuff,
or hdf5, or whatever)

Pros
----
No complication on the 99% path of using the same machine
No performance concerns
Flexibility of changes in platform-specific and op_export code

Cons
----
Possibility of op_export bugs interfering with the debug effort

Graydon, what do you think ? Sorry I didn't object too much earlier,
but I really think this would be the preferable approach.

regards
john
--
"That's just kitten-eating wrong." - Richard Henderson