From: Kaz K. <kky...@gm...> - 2006-11-29 08:09:46
|
In glibc, there is a dlvsym() function for retrieving versioned symbols.

This is very useful if you are writing an application, targeting the ABI of a shared library, to ensure that old versions of your program will use the right ABI even in new versions of that library (as long as they don't drop support for that ABI).

Concrete example. Right now, I'm using the CLISP FFI to call "__xstat64" in "libc.so.6". But "__xstat64" refers to the latest and greatest ABI for that symbol which a given installation of "libc.so.6" exports. That latest and greatest function could, for instance, think that the struct stat object I'm giving it is bigger than it really is, and write beyond its end.

What I want is to access the "GLIBC_2.2" version of that symbol. This might appear in the "nm" listing of the library as "__xstat64@@GLIBC_2.2". You can't get to this symbol with the dlsym API. And the @@ will change to @ anyway if GLIBC_2.2 is no longer the default version for that symbol.

The way you ask for this symbol is dlvsym(handle, "__xstat64", "GLIBC_2.2").

And so, what if there was a (:SYMVER ...) option in the CLISP FFI whereby you could specify the version, thereby causing it to use dlvsym? (Rationale for the name: derived from the .symver GNU assembler directive for defining versioned symbols.)

  (def-call-out __xstat64
    (:library "libc.so.6")
    (:symver "GLIBC_2.2")
    ... etc)

Now you can be quite confident that even though you tested the code with, say, glibc-2.3.4, it will still run if you upgrade to glibc 2.5. That is to say, run as well as any of the compiled C programs linked to the library.

Comments?
|
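For readers who have not called dlvsym() from C, here is a minimal sketch of the lookup described in the message above. The helper name first_exported_version and the idea of probing a list of candidate version strings are illustrative assumptions, not part of glibc or CLISP. Probing can be handy because the version node is architecture-specific: on x86-64, for instance, the baseline node is GLIBC_2.2.5 rather than GLIBC_2.2.

```c
#define _GNU_SOURCE            /* dlvsym() is a GNU extension */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the first entry in `versions`
   under which `name` is exported by the library `handle`, or -1 if the
   symbol is not exported under any of the listed versions. */
static int first_exported_version(void *handle, const char *name,
                                  const char *const *versions, size_t count)
{
    size_t i;

    assert(handle != NULL && name != NULL);
    for (i = 0; i < count; i++) {
        /* dlvsym() returns NULL when name@version is not defined. */
        if (dlvsym(handle, name, versions[i]) != NULL)
            return (int) i;
    }
    return -1;
}
```

A caller would typically obtain the handle with dlopen("libc.so.6", RTLD_LAZY) and pass something like { "GLIBC_2.2", "GLIBC_2.2.5" } as the candidate list. On older glibc releases the program has to be linked with -ldl.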
From: Sam S. <sd...@gn...> - 2006-11-29 14:20:35
|
Kaz Kylheku wrote:
> In glibc, there is a dlvsym() function for retrieving versioned symbols.

how about other libc implementations? woe32?

> (def-call-out __xstat64
>   (:library "libc.so.6")
>   (:symver "GLIBC_2.2")
>   ... etc)

I would prefer (:library "libc.so.6" "GLIBC_2.2")
this way we will not have to check that :library is given for each :symver and also this will give a good symver default via default-library.

the only problem I see here is that we were thinking about switching to libltdl - does it support this versioning?

Sam.
|
From: Bruno H. <br...@cl...> - 2006-11-29 15:15:39
|
Kaz Kylheku wrote:
> This is very useful if you are writing an application, targeting the
> ABI of a shared library, to ensure that old versions of your program will
> use the right ABI even in new versions of that library (as long as
> they don't drop support for that ABI).
>
> Concrete example. Right now, I'm using the CLISP FFI to call
> "__xstat64" in "libc.so.6". But "__xstat64" refers to the latest and
> greatest ABI for that symbol which a given installation of "libc.so.6"
> exports. That latest and greatest function could, for instance, think
> that the struct stat object I'm giving it is bigger than it really is,
> and write beyond its end.

Yup, this is a problem, because we have hardcoded in modules/bindings/glibc/linux.lisp definitions like this:

  (def-c-struct stat
    (st_dev dev_t)
    (__pad1 ushort)
    (st_ino ino_t)
    (st_mode mode_t)
    (st_nlink nlink_t)
    (st_uid uid_t)
    (st_gid gid_t)
    (st_rdev dev_t)
    (__pad2 ushort)
    (st_size off_t)
    (st_blksize ulong)
    (st_blocks ulong)
    (st_atime time_t)
    (__unused1 ulong)
    (st_mtime time_t)
    (__unused2 ulong)
    (st_ctime time_t)
    (__unused3 ulong)
    (__unused4 ulong)
    (__unused5 ulong)
  )

That is, we have extracted the 'struct stat' of a particular glibc version and therefore also need the __xstat function of that particular ABI.

But I disagree with the approach. The C library (more precisely its header files and the symbol versioning in libc.so) shields the usual C programmer from such problems. I think clisp should get on the same level, and use the solution that the C library maintainers propose. Otherwise we have to track closely the glibc versions and update the def-c-struct definitions manually in the future.

Concretely this means one of the two following approaches:

a) Generate the (def-c-struct stat ...) form at compile time, for example by having a C program like this:

     #include <sys/types.h>
     #include <sys/stat.h>
     #include <stddef.h>
     #include <stdlib.h>
     #include <stdio.h>
     int main ()
     {
       printf("%zu\n", offsetof (struct stat, st_dev));
       ...
       printf("%zu\n", offsetof (struct stat, st_ctime));
       printf("%zu\n", sizeof (((struct stat *) 0)->st_dev));
       ...
       printf("%zu\n", sizeof (((struct stat *) 0)->st_ctime));
       return 0;
     }

   and a bit of Lisp code that infers where the gaps between the fields are, based on these offset and size numbers.

b) Add a new primitive (def-c-partial-struct ...) to the FFI that causes this gap computation to occur in the FFI, based on C snippets emitted by the .lisp -> .c compiler. (I call it "partial" because the definition of the fields is not complete. The C definition of the struct can have additional fields that are not visible from Lisp.)

Bruno
|
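The gap inference mentioned in approach (a) can be sketched in a few lines. The version below does it in C rather than Lisp, purely for illustration; struct field_layout, the FIELD macro and report_gaps are hypothetical names, and only a subset of the struct stat members is listed. The fields are sorted by offset first, because the member order of struct stat differs between platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

/* One visible field of a C struct: its name, byte offset and byte size. */
struct field_layout {
    const char *name;
    size_t offset;
    size_t size;
};

#define FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *) 0)->member) }

/* A subset of struct stat; the compiler fills in this platform's numbers. */
static struct field_layout stat_fields[] = {
    FIELD(struct stat, st_dev),
    FIELD(struct stat, st_ino),
    FIELD(struct stat, st_mode),
    FIELD(struct stat, st_nlink),
    FIELD(struct stat, st_uid),
    FIELD(struct stat, st_gid),
    FIELD(struct stat, st_rdev),
    FIELD(struct stat, st_size),
};

static int by_offset(const void *a, const void *b)
{
    const struct field_layout *x = a, *y = b;
    return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sort the fields by offset, print each field and each gap between them,
   and return the number of bytes of the struct (of size `total`) that
   are not covered by any listed field. */
static size_t report_gaps(struct field_layout *f, size_t n, size_t total,
                          FILE *out)
{
    size_t i, pos = 0, hidden = 0;

    qsort(f, n, sizeof f[0], by_offset);
    for (i = 0; i < n; i++) {
        assert(f[i].offset >= pos);          /* listed fields must not overlap */
        if (f[i].offset > pos) {
            fprintf(out, "  (gap of %zu bytes)\n", f[i].offset - pos);
            hidden += f[i].offset - pos;
        }
        fprintf(out, "  %s: offset %zu, size %zu\n",
                f[i].name, f[i].offset, f[i].size);
        pos = f[i].offset + f[i].size;
    }
    assert(total >= pos);
    if (total > pos) {
        fprintf(out, "  (tail of %zu bytes)\n", total - pos);
        hidden += total - pos;
    }
    return hidden;
}
```

Calling report_gaps(stat_fields, 8, sizeof(struct stat), stdout) prints the layout for whatever headers the program was compiled against, which is the information a generated def-c-struct form would need.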
From: Sam S. <sd...@gn...> - 2006-11-29 15:33:46
|
Bruno Haible wrote:
> Concretely this means one of the two following approaches:
> a) Generate the (def-c-struct stat ...) form at compile time,
>    for example by having a C program like this:
>
>      #include <sys/types.h>
>      #include <sys/stat.h>
>      #include <stddef.h>
>      #include <stdlib.h>
>      #include <stdio.h>
>      int main ()
>      {
>        printf("%zu\n", offsetof (struct stat, st_dev));
>        ...
>        printf("%zu\n", offsetof (struct stat, st_ctime));
>        printf("%zu\n", sizeof (((struct stat *) 0)->st_dev));
>        ...
>        printf("%zu\n", sizeof (((struct stat *) 0)->st_ctime));
>        return 0;
>      }
>
>    and a bit of Lisp code that infers where the gaps between the
>    fields are, based on these offset and size numbers.
> b) Add a new primitive (def-c-partial-struct ...) to the FFI that causes
>    this gap computation to occur in the FFI, based on C snippets emitted
>    by the .lisp -> .c compiler. (I call it "partial" because the definition
>    of the fields is not complete. The C definition of the struct can have
>    additional fields that are not visible from Lisp.)

yes, this is what we need.
we already have def-c-const that eliminates the need to copy the actual values of #defined symbols into lisp.
it would be nice to access C structures automatically too.
|
From: Kaz K. <kky...@gm...> - 2006-11-29 15:34:50
|
On 11/29/06, Sam Steingold <sd...@gn...> wrote:
> Kaz Kylheku wrote:
> > In glibc, there is a dlvsym() function for retrieving versioned symbols.
>
> how about other libc implementations? woe32?

It's an ELF feature. Windows DLLs don't have versioned symbols.

Microsoft's idea of symbol versioning is:

- Write a broken or inadequate function, and then when we figure out what it should really do, add an Ex to its name and keep the old one.

- Put a size (pardon me, ``dwSize'') field into every structure whose representation might change (by getting more fields at the end). The library function checks the size field to determine which ABI is being used. (This is actually not unreasonable; objects should describe themselves, like they do in Lisp, right? But it has limitations. The structure can't exist in two versions that have the same size.)

- The good old trick of leaving reserved fields in a structure, so today's client application allocates it as big as tomorrow's application.

- Miscellaneous other hacks, like the Winsock initialization with a version field before you can do any socket work.

> > (def-call-out __xstat64
> >   (:library "libc.so.6")
> >   (:symver "GLIBC_2.2")
> >   ... etc)
>
> I would prefer (:library "libc.so.6" "GLIBC_2.2")

Or maybe have a property on it like (:library "libc.so.6" :symver "GLIBC_2.2").

> this way we will not have to check that :library is given for each
> :symver and also this will give a good symver default via default-library.
>
> the only problem I see here is that we were thinking about switching to
> libltdl - does it support this versioning?

[ ... google ...] Apparently a dlvsym wrapper is not in the API. But that could be patched. The versioning is a feature of the underlying ELF object format, plus the toolchain and the libdl.so API.
|
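The size-field convention described in the list above can be illustrated with a small, self-contained C sketch. Everything here (struct opts_v1, struct opts_v2, lib_area) is hypothetical and not taken from any Windows header; it only shows a library entry point that dispatches on the size field the caller stored in the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Version 1 of a hypothetical options structure (8 bytes). */
struct opts_v1 {
    uint32_t size;     /* caller stores sizeof(struct opts_v1) here */
    int32_t  width;
};

/* Version 2 appends a field at the end (12 bytes). */
struct opts_v2 {
    uint32_t size;     /* caller stores sizeof(struct opts_v2) here */
    int32_t  width;
    int32_t  height;
};

/* Hypothetical library entry point. It reads the leading size field to
   decide which layout the caller was compiled against, and returns -1
   when the size matches no known version. */
static int lib_area(const void *opts)
{
    uint32_t size;

    assert(opts != NULL);
    memcpy(&size, opts, sizeof size);      /* size comes first in every version */

    if (size == sizeof(struct opts_v2)) {
        const struct opts_v2 *o = opts;
        return o->width * o->height;
    }
    if (size == sizeof(struct opts_v1)) {
        const struct opts_v1 *o = opts;
        return o->width * o->width;        /* v1 callers can only describe squares */
    }
    return -1;
}
```

The limitation mentioned above shows up directly: the dispatch only works because the two layouts have different sizes. Had the size field been a size_t on a 64-bit target, alignment padding would have made both versions 16 bytes long and the library could not have told them apart.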
From: Bruno H. <br...@cl...> - 2006-11-29 15:15:41
|
Sam Steingold asked:
> > In glibc, there is a dlvsym() function for retrieving versioned symbols.
>
> how about other libc implementations? woe32?

Only glibc has dlvsym(). Woe32 uses #defines in the include files to address this problem.

> this will give a good symver default via default-library.

How do you mean this? A library does not have a "default symbol version". For example glibc-2.4 has symbols

  openat@@GLIBC_2.4
  open@@GLIBC_2.0

but no symbol

  open@@GLIBC_2.4

If you want the address of the open() function and you don't know its specific version, you cannot use dlvsym(); you must use dlsym(handle,"open").

> the only problem I see here is that we were thinking about switching to
> libltdl - does it support this versioning?

No. Probably because it's not as useful as Kaz thinks (see the other mail).

Bruno
|
From: Sam S. <sd...@gn...> - 2006-11-29 15:28:59
|
Bruno Haible wrote:
> Sam Steingold asked:
>>> In glibc, there is a dlvsym() function for retrieving versioned symbols.
>> how about other libc implementations? woe32?
>
> Only glibc has dlvsym(). Woe32 uses #defines in the include files to address
> this problem.
>
>> this will give a good symver default via default-library.
>
> How do you mean this? A library does not have a "default symbol version".
> For example glibc-2.4 has symbols
>
>   openat@@GLIBC_2.4
>   open@@GLIBC_2.0
>
> but no symbol
>
>   open@@GLIBC_2.4
>
> If you want the address of the open() function and you don't know its
> specific version, you cannot use dlvsym(); you must use dlsym(handle,"open").

  sym = dlvsym(lib,"foo","ver");
  if (sym == NULL)
    sym = dlsym(lib,"foo");
|
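The two-line fallback at the end of the message above can be wrapped into a small helper. This is only a sketch: lookup_symbol is a hypothetical name, and treating a NULL version as "no version requested" is an assumption made for the example, not something proposed in the thread.

```c
#define _GNU_SOURCE            /* dlvsym() is a GNU extension */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: try the versioned lookup first, then fall back to
   the plain lookup. Returns NULL if the symbol does not exist at all. */
static void *lookup_symbol(void *lib, const char *name, const char *version)
{
    void *sym = NULL;

    assert(lib != NULL && name != NULL);
    if (version != NULL)
        sym = dlvsym(lib, name, version);   /* exact name@version match only */
    if (sym == NULL)
        sym = dlsym(lib, name);             /* whatever the default version is */
    return sym;
}
```

One design caveat: the fallback is silent, so a caller that asked for a specific ABI may end up bound to the default version without noticing. A stricter variant could report which of the two lookups succeeded.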
From: Kaz K. <kky...@gm...> - 2006-11-29 17:54:04
|
On 11/29/06, Bruno Haible <br...@cl...> wrote:
> Sam Steingold asked:
> > > In glibc, there is a dlvsym() function for retrieving versioned symbols.
> >
> > how about other libc implementations? woe32?
>
> Only glibc has dlvsym(). Woe32 uses #defines in the include files to address
> this problem.

This problem is so nontrivial that Microsoft developed COM (out of DCE) to address it.

In the base non-COM libraries, versioning is done with size fields and other hacks, like I mentioned in my other e-mail.

But the primary way by which versioning problems are addressed in the MS environment is by writing and using COM DLLs. Interfaces are versioned, and tied to a 128 bit GUID. The library location problem is also solved by GUIDs. To locate a library, a client passes a ``class ID'' (CLSID) GUID to an API function. That API pulls out the path name from the registry and loads the library. Next, the application asks for an interface, using an ``interface ID'' (IID). If the object supports that interface, it returns a pointer (which happens to point to data that is binary compatible with the way the Microsoft compiler compiles C++ abstract base classes).

> A library does not have a "default symbol version".
> For example glibc-2.4 has symbols
>
>   openat@@GLIBC_2.4
>   open@@GLIBC_2.0
>
> but no symbol
>
>   open@@GLIBC_2.4
>
> If you want the address of the open() function and you don't know its
> specific version, you cannot use dlvsym(); you must use dlsym(handle,"open").

Correct. But if you are writing FFI stuff, you know exactly what is versioned and what isn't. You know that as of a particular version of the library, openat was introduced as a versioned symbol. So you target that symbol, using the version that you want, and declare that your program needs that version of the library or later.

In the case of open, since it's not a versioned symbol, you just target open.

So what happens if glibc 2.6 comes out and needs to introduce a versioned open? How will your program run?

What will happen is that they will make open into an alias:

  open -> old_open

So old clients will be redirected to old_open and continue to run. There will be a new_open function, and some versioned symbol which will alias to that function.

  open@@GLIBC_2.6 -> new_open

So why use versioned symbols and not just keep repeating the aliasing trick? Because the old unversioned symbol can only alias to one version. You have to pick a single function, like old_open, and map open to that. And that's what you're stuck with.

Versioning allows open to refer to different things depending on who is asking.

Newly compiled programs would be linked to the new open, requesting version GLIBC_2.6. The dynamic linker, ld.so, uses the equivalent of dlvsym to grab "open" at version "GLIBC_2.6". For old clients, there is no version request, so it grabs "open" via the equivalent of dlsym, which resolves to the same address as old_open.

When glibc 2.7 comes out, there can be an open@@GLIBC_2.7, as well as an open@GLIBC_2.6. The double @@ indicates that that's the default version that is selected by newly linked programs.

> > the only problem I see here is that we were thinking about switching to
> > libltdl - does it support this versioning?
>
> No. Probably because it's not as useful as Kaz thinks (see the other mail).

Versioned symbols are the reason why, at all, you can upgrade a GNU/Linux system without everything going haywire, like it did in the a.out days.

At work here, I've been able to take a vendor's embedded distro (targeting MIPS) running glibc 2.3.4, and run their binaries under a glibc-2.5 that I compiled. I literally copied the new glibc over top of the root filesystem, and by golly, it booted. That is thanks, in part, to careful ABI versioning.

We can only speculate about the reason why libtool's library doesn't have dlvsym (yet!). Maybe that project is more focused on different problems.

If a library with versioned symbols is linked using libtool, all that versioning stuff still works. Libtool helps with issues related to finding the library at link time and run time.

Maybe nobody is using dlopen() on versioned interfaces with libraries that are used in conjunction with libtool. Typically libraries that are designed for run-time use solve their versioning problems in other ways, one big reason being portability. If you're designing a platform-independent ``plugin'', then you can't just say ``we will use ELF symbol versioning to deal with versioning issues, and too bad those of you who are on non-ELF platforms''.

But in the case of glibc, we are run-time linking to a library which is not normally used this way. That project has decided to deal with versioning problems by using the versioning facilities in ELF. Which is fine, since it gets to define the platform, basically. That means that if you want to target that library, you have to play by its rules.

glibc is not even linked using libtool, so it would be stupid to use libtool's API to access it. If dlopen("libc.so.6") doesn't work, you have a big problem that libtool won't help you with.
|
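To make the @ versus @@ distinction concrete, here is a toy model in C of the hypothetical glibc 2.7 situation from the message above, with open@GLIBC_2.6 and open@@GLIBC_2.7 exported side by side. It is an illustration only: the table, the function names and the lookup helpers are all invented for this sketch, and it says nothing about how ld.so is actually implemented. lookup_versioned plays the role of a dlvsym-style exact match, and lookup_default picks the entry marked as default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*impl_fn)(void);

/* One exported definition: name@version, plus the "@@" default marker. */
struct versioned_symbol {
    const char *name;
    const char *version;
    int         is_default;   /* 1 for "@@", 0 for "@" */
    impl_fn     impl;
};

/* Stand-ins for two incompatible implementations of the same function. */
static int open_2_6(void) { return 26; }
static int open_2_7(void) { return 27; }

static const struct versioned_symbol exports[] = {
    { "open", "GLIBC_2.6", 0, open_2_6 },   /* open@GLIBC_2.6  */
    { "open", "GLIBC_2.7", 1, open_2_7 },   /* open@@GLIBC_2.7 */
};

#define N_EXPORTS (sizeof exports / sizeof exports[0])

/* Exact name + version match, in the spirit of dlvsym(). */
static impl_fn lookup_versioned(const char *name, const char *version)
{
    size_t i;

    assert(name != NULL && version != NULL);
    for (i = 0; i < N_EXPORTS; i++)
        if (strcmp(exports[i].name, name) == 0 &&
            strcmp(exports[i].version, version) == 0)
            return exports[i].impl;
    return NULL;
}

/* Name-only lookup that selects the definition marked as default. */
static impl_fn lookup_default(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < N_EXPORTS; i++)
        if (strcmp(exports[i].name, name) == 0 && exports[i].is_default)
            return exports[i].impl;
    return NULL;
}
```

In this model a client that recorded "GLIBC_2.6" keeps getting open_2_6 no matter how many newer versions are added, while a name-only lookup moves on to whatever is currently marked as default.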
From: Kaz K. <kky...@gm...> - 2006-11-29 17:36:58
|
On 11/29/06, Bruno Haible <br...@cl...> wrote:
> Yup, this is a problem, because we have hardcoded in
> modules/bindings/glibc/linux.lisp definitions like this:

[ snip ]

> That is, we have extracted the 'struct stat' of a particular glibc version
> and therefore also need the __xstat function of that particular ABI.
>
> But I disagree with the approach. The C library (more precisely its
> header files and the symbol versioning in libc.so) shields the usual C
> programmer from such problems.

Yes, and it does that using the same approach. Only, of course, you can re-compile and re-link the C program, which extracts the particular version at that time. In the FFI, you're doing it by hand.

> I think clisp should get on the same level,
> and use the solution that the C library maintainers propose.

That /is/ the solution that they propose: use versioned symbols to target a stable ABI. The extraction of the interface and selection of symbols is hidden in the toolchain, that's all.

> Otherwise we
> have to track closely the glibc versions and update the def-c-struct
> definitions manually in the future.

That's an issue within CLISP. Don't confuse that with what should or should not be available to users of CLISP through the FFI interface.

If CLISP wants to solve its __xstat issue in some other way, that's fine.

In my application, I don't mind maintaining DEF-C-STRUCT definitions by hand.

Since CLISP is a complex app which has to be compiled anyway, the considerations are different. But it's nice to be able to package a CLISP program which is nothing but .lisp files that are fed into CLISP.

> Concretely this means one of the two following approaches:
> a) Generate the (def-c-struct stat ...) form at compile time,
>    for example by having a C program like this:
>
>      #include <sys/types.h>
>      #include <sys/stat.h>
>      #include <stddef.h>
>      #include <stdlib.h>
>      #include <stdio.h>
>      int main ()
>      {
>        printf("%zu\n", offsetof (struct stat, st_dev));
>        ...
>        printf("%zu\n", offsetof (struct stat, st_ctime));
>        printf("%zu\n", sizeof (((struct stat *) 0)->st_dev));
>        ...
>        printf("%zu\n", sizeof (((struct stat *) 0)->st_ctime));
>        return 0;
>      }

The problem is that if this C program were to actually call stat(), it would call an appropriately versioned symbol, and so you could use the binary version of this program with a newer version of glibc, where it would continue to produce the same output. Yet if it were to be recompiled, its output might change.

Heck, the program doesn't even tell you that you really need to call some version of __xstat64, which takes an extra parameter.

>    and a bit of Lisp code that infers where are the gaps between the
>    fields, based on these offset and size numbers.

Right. You also need the size of the entire structure.

> b) Add a new primitive (def-c-partial-struct ...) to the FFI that causes
>    this gap computation to occur in the FFI, based on C snippets emitted
>    by the .lisp -> .c compiler. (I call it "partial" because the definition
>    of the fields are not complete. The C definition of the struct can have
>    additional fields that are not visible from Lisp.)

  ;; proposed syntax
  (def-partial-c-struct x
    (:size <whatever>)       ;; extracted using sizeof
    (y uint32 :offset 24)    ;; y member at offset 24
    ...)

So now CLISP will allocate a number of bytes equal to :size for the structure, and only do conversions on the defined fields, ignoring the gaps on the way :IN, and setting them to zero on the way :OUT.

If any field extends beyond the limit specified by :SIZE, the compiler for the form can signal an error.

This partial struct idea is definitely very good and worth implementing. But it does not solve the ABI versioning problem. It solves the API extraction problem, by reducing manual labor.

You need both approaches. Use the partial struct hack to get only the fields you are interested in, so you don't have to keep revising the definition when new fields are added, which you are not even interested in. And then use the versioned symbols to lock in on the ABI which corresponds to the extracted API.
|
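The partial-struct semantics proposed in the message above (a total size, fields at explicit offsets, gaps ignored when reading and zeroed when writing, an error when a field would run past the end) can be sketched in C as follows. This is not CLISP code: struct partial_field, write_field and read_field are hypothetical names used only to make the bounds check and the gap handling concrete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A field of a partially described foreign struct. */
struct partial_field {
    size_t offset;   /* byte offset inside the foreign struct */
    size_t size;     /* byte size of the field */
};

/* Return 1 if the field lies entirely inside an image of `image_size` bytes. */
static int field_fits(struct partial_field f, size_t image_size)
{
    return f.offset <= image_size && f.size <= image_size - f.offset;
}

/* Store one field into a foreign struct image. The caller is expected to
   have zeroed the image first, so that the gaps are passed as zeros.
   Returns 0 on success, -1 if the field extends beyond the struct. */
static int write_field(unsigned char *image, size_t image_size,
                       struct partial_field f, const void *in)
{
    assert(image != NULL && in != NULL);
    if (!field_fits(f, image_size))
        return -1;
    memcpy(image + f.offset, in, f.size);
    return 0;
}

/* Fetch one field from a foreign struct image, ignoring everything else.
   Returns 0 on success, -1 if the field extends beyond the struct. */
static int read_field(const unsigned char *image, size_t image_size,
                      struct partial_field f, void *out)
{
    assert(image != NULL && out != NULL);
    if (!field_fits(f, image_size))
        return -1;
    memcpy(out, image + f.offset, f.size);
    return 0;
}
```

With a 32-byte image and a field { 24, sizeof(uint32_t) }, matching the (y uint32 :offset 24) example, write_field followed by read_field round-trips the value, while a 4-byte field at offset 30 is rejected.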