From: Daniel W. <dan...@gm...> - 2014-07-25 00:04:10
Here I elaborate on my point in calling libelf and libdwarf the APIs
to nowhere, using a detailed example.

On Thu, Jul 24, 2014 at 1:21 PM, Daniel Wilkerson <dan...@gm...> wrote:
> On Thu, Jul 24, 2014 at 10:15 AM, Joseph Koshy
> <jk...@us...> wrote:
>> dw> Libelf and Libdwarf are APIs to nowhere: they are full of functions
>> dw> that tell you how to get a foo from a bar, but do not tell you how to
>> dw> get a bar in the first place or what you can do with a foo once you
>> dw> have one.
>>
> jk> If you are looking for an overview of the ELF(3) API, you could
> jk> look at the tutorial "libelf by Example". It is the first search result
> jk> when one searches for "libelf tutorial".
>
dw> Yes, libelf by example is pretty good for just getting started.
dw> Mostly my complaint is about dwarf, but there are still some
dw> interesting open questions about elf.

So here is an example of what I am talking about.

In Libelf by Example, chapter 4 "Examining the Program Header Table",
you explain what a "segment" is and then you show an example. This
example shows how to use libelf to load an elf, get the number of
entries in the program header table (which seem to be another name
for what one might call "segment headers") using elf_getphdrnum(), and
how to get the program header itself (a GElf_Phdr) using gelf_getphdr().

Ok, now that I have a GElf_Phdr phdr, what can I do with it? Typing
man gelf_getphdr gives me a page which refers me to pages such as
man gelf(3), which helpfully explains:

    GElf_Phdr    A class-independent representation of an ELF Program
                 Header Table entry.

Ok, but what is the API I can use to do things with a GElf_Phdr?! As
far as I can tell from your docs, nothing. So I started grepping
through your source.
I find that a GElf_Phdr is basically the same as an Elf64_Phdr:

    gelf.h:46:typedef Elf64_Phdr GElf_Phdr; /* Program header */

Now back in man elf(5) we find lots of information on an Elf64_Phdr:

    typedef struct {
        uint32_t   p_type;
        uint32_t   p_flags;
        Elf64_Off  p_offset;
        Elf64_Addr p_vaddr;
        Elf64_Addr p_paddr;
        uint64_t   p_filesz;
        uint64_t   p_memsz;
        uint64_t   p_align;
    } Elf64_Phdr;

This is followed by extensive explanation of the meaning of each field
(which you also detail in Libelf by Example).

Ok, that's helpful: I can find out all kinds of information about the
segment. But suppose I wanted to actually load said segment into
memory (if, say, its p_type says it is loadable). How do I get the
data of the segment? The p_offset field looks promising, but what is
it an offset from *exactly*? Do I really just seek from the start of
the file? Even if I try this simple interpretation and it works on
small examples, that doesn't mean I'm really using it correctly or
that it will keep working.

Elf sections turned out to be more complex than that: there is a
handy API called elf_getdata() which returns successive data blocks,
all parts of a given section. What is the program header equivalent of
elf_getdata()? Again, I find nothing when I look; if it is there, I
would argue that it is not easy to find.

So again, I start grepping through your source. Hmm, this is
interesting. I think your comments are off by one: read each comment
and pair it carefully with the name of its field. (The comments on
p_flags through p_filesz each describe the next field down, and
"Segment flags" has landed on p_memsz.) From common/elfdefinitions.h:

    /* 64 bit PHDR entry. */
    typedef struct {
        Elf64_Word  p_type;   /* Type of segment. */
        Elf64_Word  p_flags;  /* File offset to segment. */
        Elf64_Off   p_offset; /* Virtual address in memory. */
        Elf64_Addr  p_vaddr;  /* Physical address (if relevant). */
        Elf64_Addr  p_paddr;  /* Size of segment in file. */
        Elf64_Xword p_filesz; /* Size of segment in memory. */
        Elf64_Xword p_memsz;  /* Segment flags. */
        Elf64_Xword p_align;  /* Alignment constraints. */
    } Elf64_Phdr;

Ah, finally I find it in size/size.c:

    static void
    handle_core_note(Elf *elf, GElf_Ehdr *elfhdr, GElf_Phdr *phdr,
        char **cmd_line)
    {
        size_t max_size;
        uint64_t raw_size;
        GElf_Off offset;
        static pid_t pid;
        uintptr_t ver;
        Elf32_Nhdr *nhdr, nhdr_l;
        static int reg_pseudo = 0, reg2_pseudo = 0, regxfp_pseudo = 0;
        char buf[BUF_SIZE], *data, *name;
        . . .
        data = elf_rawfile(elf, &max_size);
        offset = phdr->p_offset;
        while (data != NULL && offset < phdr->p_offset + phdr->p_filesz) {
            nhdr = (Elf32_Nhdr *)(uintptr_t)((char*)data + offset);

So the base of the p_offset is the return value of elf_rawfile()!
Yea! But do note how obscure the answer is. There is no manpage for
handle_core_note(); it's just an obscure internal function.

Note that I picked libelf as an example because you seem to think
Libelf by Example and the man pages are sufficient. Libdwarf presents
many more elaborate examples of the above situation, some of which I
have yet to solve.

In sum, my complaint is not that you do not show me how to get
started; my complaint is that you do not show me how to get finished!
This kind of situation happens repeatedly in the libelf and libdwarf
APIs, which is why I called them the APIs to nowhere.

Two suggestions:

(1) Elf is a container for information, so don't just show me how to
read the meta-data from the container, show me how to go *through* the
container to get at the contained data! Libelf by Example is great as
far as it goes; please go just a bit further and show actually getting
the contained data out. I think in most cases this is just a couple
more API calls.

(2) More generally, show me how to locally walk the graph of nouns and
verbs. That is, the man pages would be far more helpful if they each
had two more paragraphs: (a) the APIs to get the *inputs* to the
function in question, and (b) the APIs to operate on the *outputs* of
the function in question.
Then I can connect APIs together and walk the graph from a Foo to a
get_Foo(Foo const *foo, Bar *bar) to a Bar, to a
load_Bar(Bar *const, void * data, int *length), etc.

Daniel
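
P.S. To make suggestion (1) concrete, here is a minimal sketch of the kind of "last step" I have in mind for a loadable segment. It deliberately does not use libelf, so that it stands alone. Phdr64 is a hypothetical stand-in that mirrors only the p_offset, p_filesz and p_memsz fields of Elf64_Phdr. segment_bytes() and load_segment() are names I made up, not libelf functions. file_base and file_size play the role of the pointer and size that elf_rawfile() hands back in the size.c excerpt. The sketch assumes p_offset is a byte offset from the start of the file image, as that excerpt suggests, and zero-fills the bytes between p_filesz and p_memsz as the ELF specification requires of a loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the Elf64_Phdr fields used here; real code
 * would read them from the GElf_Phdr filled in by gelf_getphdr(). */
typedef struct {
    uint64_t p_offset;  /* byte offset of the segment from the file start */
    uint64_t p_filesz;  /* bytes the segment occupies in the file */
    uint64_t p_memsz;   /* bytes the segment occupies in memory */
} Phdr64;

/*
 * Return a pointer to the segment's bytes inside the file image, or NULL
 * if the header points outside the image. file_base and file_size stand
 * in for what elf_rawfile() returns (an assumption of this sketch).
 */
static const unsigned char *
segment_bytes(const unsigned char *file_base, size_t file_size,
    const Phdr64 *phdr)
{
    assert(file_base != NULL && phdr != NULL);
    if (phdr->p_offset > file_size ||
        phdr->p_filesz > file_size - phdr->p_offset)
        return NULL;
    return file_base + phdr->p_offset;
}

/*
 * Copy the segment into dst (capacity dst_size) and zero-fill the tail
 * from p_filesz up to p_memsz. Returns 0 on success, -1 on a bad header
 * or a destination that is too small.
 */
static int
load_segment(const unsigned char *file_base, size_t file_size,
    const Phdr64 *phdr, unsigned char *dst, size_t dst_size)
{
    const unsigned char *src = segment_bytes(file_base, file_size, phdr);

    if (src == NULL || phdr->p_memsz < phdr->p_filesz ||
        phdr->p_memsz > dst_size)
        return -1;
    memcpy(dst, src, (size_t)phdr->p_filesz);
    memset(dst + phdr->p_filesz, 0,
        (size_t)(phdr->p_memsz - phdr->p_filesz));
    return 0;
}
```

With libelf I would expect to fill a Phdr64 from the GElf_Phdr and pass in the pointer and size from elf_rawfile(), but whether that is the intended usage is exactly what I would like the documentation to confirm.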