|
From: Dimitri Papadopoulos-O. <pap...@sh...> - 2003-11-14 13:39:22
|
Hi,

I'm trying to run valgrind-2.0.0 on a program that makes use of OpenGL
on Red Hat Linux 9. This is a dual-processor Dell with NVidia card and
latest drivers 1.0-4496. Valgrind dies. Any clue?

$ setenv __GL_FORCE_GENERIC_CPU 1
$ valgrind anatomist.bin
==14331== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==14331== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==14331== Using valgrind-2.0.0, a program supervision framework for x86-linux.
==14331== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==14331== Estimated CPU clock rate is 1996 MHz
==14331== For more details, rerun with: -v
==14331==

valgrind: vg_ldt.c:167 (vgPlain_do_useseg): Assertion `(seg_selector & 7) == 7' failed.

sched status:

Thread 1: status = Runnable, associated_mx = 0x0, associated_cv = 0x0
==14331==    at 0x41643641: __nvsym18200 (in /usr/lib/tls/libGL.so.1.0.4496)

Note: see also the FAQ.txt in the source distribution.
It contains workarounds to several common problems.

If that doesn't help, please report this bug to: js...@ac...

In the bug report, send all the above text, the valgrind
version, and what Linux distro you are using. Thanks.

$

--
Dimitri
|
|
From: Dirk M. <dm...@gm...> - 2003-11-14 14:06:18
|
On Friday 14 November 2003 14:41, Dimitri Papadopoulos-Orfanos wrote:

> I'm trying to run valgrind-2.0.0 on a program that makes use of OpenGL
> on Red Hat Linux 9. This is a dual-processor Dell with NVidia card and
> latest drivers 1.0-4496. Valgrind dies. Any clue?

you apparently compiled valgrind on a different system than you're running it
on.

try export LD_ASSUME_KERNEL=2.4.1
|
|
From: Dimitri Papadopoulos-O. <pap...@sh...> - 2003-11-14 14:17:02
|
Hi,

>>I'm trying to run valgrind-2.0.0 on a program that makes use of OpenGL
>>on Red Hat Linux 9. This is a dual-processor Dell with NVidia card and
>>latest drivers 1.0-4496. Valgrind dies. Any clue?
>
> you apparently compiled valgrind on a different system than you're running it
> on.

No, it was compiled on the same machine.

> try export LD_ASSUME_KERNEL=2.4.1

Isn't Valgrind 2.0.0 supposed to set this variable itself?

Anyway, it doesn't help. The error message remains the same.

--
Dimitri
|
|
From: Tom H. <th...@cy...> - 2003-11-14 14:36:33
|
In message <200...@gm...>
Dirk Mueller <dm...@gm...> wrote:
> On Friday 14 November 2003 14:41, Dimitri Papadopoulos-Orfanos wrote:
>
>> I'm trying to run valgrind-2.0.0 on a program that makes use of OpenGL
>> on Red Hat Linux 9. This is a dual-processor Dell with NVidia card and
>> latest drivers 1.0-4496. Valgrind dies. Any clue?
>
> you apparently compiled valgrind on a different system than you're
> running it on.
>
> try export LD_ASSUME_KERNEL=2.4.1
Actually I don't think it is that. Somebody else saw this with the
NVidia OpenGL drivers a while ago - it seems that they only supply
an NPTL version and don't mark the library in the right way for the
fallback to work when LD_ASSUME_KERNEL is set anyway.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Dimitri Papadopoulos-O. <pap...@sh...> - 2003-11-14 15:07:09
|
Hi,

> Actually I don't think it is that. Somebody else saw this with the
> NVidia OpenGL drivers a while ago - it seems that they only supply
> an NPTL version and don't mark the library in the right way for the
> fallback to work when LD_ASSUME_KERNEL is set anyway.

That was me I think:
http://sourceforge.net/mailarchive/forum.php?thread_id=3094441&forum_id=32038

I had forgotten about that I'm afraid. Sorry.

I tried reporting this to NVidia, but they don't seem to care. Maybe I
didn't find the correct address to send bug reports to. Maybe I didn't
explain the problem well enough, I'm not sure I understand how libraries
should be marked. I wasn't able to find any documentation about that on
the Web. Do you have a pointer to such documentation?

Can we expect an NPTL version of Valgrind in 2004?

--
Dimitri
|
|
From: Tom H. <th...@cy...> - 2003-11-14 15:26:26
|
In message <3FB...@sh...>
Dimitri Papadopoulos-Orfanos <pap...@sh...> wrote:
> Can we expect an NPTL version of Valgrind in 2004?
I posted a message a while back explaining that I'm not convinced it
is possible with the current threading model - I have tried.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Dimitri Papadopoulos-O. <pap...@sh...> - 2003-11-14 17:25:03
|
Hi,

>>Can we expect an NPTL version of Valgrind in 2004?
>
> I posted a message a while back explaining that I'm not convinced it
> is possible with the current threading model - I have tried.

I have Insure++ here, which provides Chaperon, an equivalent to
Valgrind. It has been ported to Red Hat 9, and works with both NPTL
and NVidia drivers.

Maybe it's possible to get Valgrind to work on Red Hat after all? I'm
not saying it's easy or that it won't be ugly, but maybe it's
technically feasible.

--
Dimitri
|
|
From: Dimitri Papadopoulos-O. <pap...@sh...> - 2003-11-14 16:06:52
|
Hi,

>> Actually I don't think it is that. Somebody else saw this with the
>> NVidia OpenGL drivers a while ago - it seems that they only supply
>> an NPTL version and don't mark the library in the right way for the
>> fallback to work when LD_ASSUME_KERNEL is set anyway.
>
> That was me I think:
> http://sourceforge.net/mailarchive/forum.php?thread_id=3094441&forum_id=32038

Maybe this problem could be added to the FAQ? It should probably be
added under both Q10 and Q11. Valgrind is unable to support NPTL in its
current state, and NVidia's libraries prevent NPTL from being disabled
correctly:

Q10. I upgraded to Red Hat 9 and threaded programs now act
     strange / deadlock when they didn't before.

A10. Thread support on glibc 2.3.2+ with NPTL is not as
     good as on older LinuxThreads-based systems. We have
     this under consideration. Avoid Red Hat >= 8.1 for
     the time being, if you can.

     5 May 03: 1.9.6 should be significantly improved on
     Red Hat 9, SuSE 8.2 and other glibc-2.3.2 systems.

Actually A10 should be improved, probably this way:
* Valgrind is unable to support NPTL.
* Avoid Red Hat >= 8.1 if you can.
* If you can't, try disabling NPTL using LD_ASSUME_KERNEL.
  Versions of Valgrind >= ... do that automatically.
* This won't work with NVidia's drivers.
and the "5 May 03" note should be removed as it's not really
informative.

--
Dimitri
|
|
From: Dan K. <da...@ke...> - 2003-11-14 17:25:21
|
Tom Hughes wrote:
>>On Friday 14 November 2003 14:41, Dimitri Papadopoulos-Orfanos wrote:
>>
>>>I'm trying to run valgrind-2.0.0 on a program that makes use of OpenGL
>>>on Red Hat Linux 9. This is a dual-processor Dell with NVidia card and
>>>latest drivers 1.0-4496. Valgrind dies. Any clue?
>>
> ... Somebody else saw this with the
> NVidia OpenGL drivers a while ago - it seems that they only supply
> an NPTL version and don't mark the library in the right way for the
> fallback to work when LD_ASSUME_KERNEL is set anyway.

Aha. That's good to know. Anyone know if NVidia has been
informed of this?
- Dan
|
|
From: Jeremy F. <je...@go...> - 2003-11-17 01:19:19
|
On Fri, 2003-11-14 at 08:51, Tom Hughes wrote:
> As valgrind has its own libpthread we would have to work out how
> to set up the thread area for the new thread, which seems to be very
> complicated as the GDT entry set in the kernel seems to point at a
> thread descriptor structure which in turn points at the TLS data
> and so on.

OK, I looked at this in more detail. There are two parts of the puzzle:

1: There's the kernel mechanism for setting up a thread-local storage
   area, using the set_thread_area syscall. The argument to this is a
   segment descriptor, like the one used for modify_ldt. This segment
   descriptor is assigned to one of the 3 TLS entries in the GDT. On
   thread context switch, the kernel reassigns the GDT entries to the
   thread's TLS segment. The thread itself assigns that descriptor to
   one of its segment registers, and uses %seg:0 as the pointer to its
   TLS area.

   This is easy for us to implement, since Julian has already done the
   hard work. We can store a per-thread "GDT" containing only these
   entries as part of each Thread structure. In VG_(do_useseg)() we
   just look for the GDT (vs LDT) bit in the segment selector and do
   the appropriate thing.

2: All this is to support the new TLS extensions to the ABI. These are
   described in detail in http://people.redhat.com/drepper/tls.pdf.
   I've read this once, but I still don't understand the details.

   The essential point is that ELF files can now have a PT_TLS segment
   which is used as a prototype segment for thread-local variables.
   When a new thread is created, it effectively gets a new copy of the
   contents of the PT_TLS segment attached to its own thread area
   (pointed to via %gs). This is done lazily, so only the TLS segments
   which the thread actually uses are copied for it.

   This means that there's cooperation between the dynamic linker and
   libpthread. Since we control the one but not the other, we need to
   make our libpthread compatible with the dynamic linker's TLS
   implementation. The ABI documents some of this, but unfortunately it
   only documents the compiler interface to this goo, not the internal
   interfaces between libpthread and ld.so. The easy part of this is
   making sure that new threads get their own new TLS areas (which the
   glibc libpthread does by passing CLONE_SETTLS to clone(), which is
   the equivalent of doing set_thread_area() in the new thread).
   Trickier is getting the details of all the other structures right.
   And since they're internal to glibc, there's no certainty they won't
   change from release to release...

BTW, this is completely independent of NPTL. The TLS stuff is an
extension to the ABI which is also supported by the pre-NPTL threads
library (though I'm not sure how they make do without the
set_thread_area syscall).

BTW(2): Nvidia will use set_thread_area and this tls mechanism when
available. Otherwise they implement it themselves using a similar
segment trick.

In summary, I'm not sure what to do about this.

        J
|
|
From: Jeremy F. <je...@go...> - 2003-11-14 16:35:32
|
On Fri, 2003-11-14 at 07:26, Tom Hughes wrote:
> In message <3FB...@sh...>
> Dimitri Papadopoulos-Orfanos <pap...@sh...> wrote:
>
> > Can we expect an NPTL version of Valgrind in 2004?
>
> I posted a message a while back explaining that I'm not convinced it
> is possible with the current threading model - I have tried.

Actually, there seem to be two distinct issues here. There's NPTL and
there's TLS. Are you saying we can't implement TLS without doing all
the other NPTL stuff?

        J
|
|
From: Dirk M. <dm...@gm...> - 2003-11-14 15:11:57
|
On Friday 14 November 2003 16:09, Dimitri Papadopoulos-Orfanos wrote:

> Can we expect an NPTL version of Valgrind in 2004?

That depends on if any of the developers is actually adventurous enough to use
Redhat and port it to their outdated, incompatible and horribly buggy copy of
what will eventually become NPTL in the future.
|
|
From: Daniel V. <vei...@re...> - 2003-11-14 15:23:51
|
On Fri, Nov 14, 2003 at 04:11:49PM +0100, Dirk Mueller wrote:
> On Friday 14 November 2003 16:09, Dimitri Papadopoulos-Orfanos wrote:
>
> > Can we expect an NPTL version of Valgrind in 2004?
>
> That depends on if any of the developers is actually adventurous enough to use
> Redhat and port it to their outdated, incompatible and horribly buggy copy of
> what will eventually become NPTL in the future.

  Let's say that this is an extremely biased answer. NPTL was developed
by Red Hat, maybe this infuriates you (no idea why), maybe you hope
to have things changed (that may be reasonable), but such vitriolic
judgment is poor communication at best,

  yours,

Daniel

--
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
vei...@re...         | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
|
|
From: Dirk M. <dm...@gm...> - 2003-11-14 15:51:48
|
On Friday 14 November 2003 16:23, Daniel Veillard wrote:

> > That depends on if any of the developers is actually adventurous enough
> > to use Redhat and port it to their outdated, incompatible and horribly
> > buggy copy of what will eventually become NPTL in the future.
> Let's say that this is an extremely biased answer. NPTL was developed
> by Red Hat, maybe this infuriates you (no idea why), maybe you hope
> to have things changed (that may be reasonable), but such vitriolic
> judgment is poor communication at best,

So you say that

a) the version of NPTL in Redhat 9 is still current with the development tree
b) the version of NPTL in Redhat 9 is fully binary and behaviour compatible
   with the stuff that went into kernel 2.6
c) the version of NPTL in Redhat 9 is without bugs

?

Let's just say that this was an extremely biased answer.

About the "no idea why": I don't think this belongs on a public mailing list,
but since you started dragging it into it, I have to give it an answer: I find
it unacceptable to do such a massive ABI change in a "vendor" kernel that is
labeled as "linux 2.4". What you ship is not anything near linux 2.4.
Instead you expect that every 3rd party developer will be happy and full of
joy that you again managed to release a distro that is in core parts
completely incompatible with any (!) other distro out there. That alone
wouldn't be too bad (you could claim that you're technically ahead of other
distributions), but the fact that it's not even compatible with what went
into the vanilla kernel makes it a bad decision. But after redhat-gcc and
redhat-glibc it was just a matter of time until there would be a
redhat-kernel.

But anyway, maybe you can raise the communication level by explaining how we
can support NPTL in a compatible way. I don't see any, but maybe you know
more than me.

Dirk
|
|
From: Daniel V. <vei...@re...> - 2003-11-14 16:25:50
|
On Fri, Nov 14, 2003 at 04:51:44PM +0100, Dirk Mueller wrote:
> On Friday 14 November 2003 16:23, Daniel Veillard wrote:
>
> > > That depends on if any of the developers is actually adventurous enough
> > > to use Redhat and port it to their outdated, incompatible and horribly
> > > buggy copy of what will eventually become NPTL in the future.
> > Let's say that this is an extremely biased answer. NPTL was developed
> > by Red Hat, maybe this infuriates you (no idea why), maybe you hope
> > to have things changed (that may be reasonable), but such vitriolic
> > judgment is poor communication at best,
>
> So you say that
>
> a) the version of NPTL in Redhat 9 is still current with the development tree
> b) the version of NPTL in Redhat 9 is fully binary and behaviour compatible
>    with the stuff that went into kernel 2.6
> c) the version of NPTL in Redhat 9 is without bugs
>
> ?

  Of course not. Red Hat Linux 9 was released one year ago, that was
the first NPTL release that made it into a distro, that's all.

> Let's just say that this was an extremely biased answer.
[...]
> But anyway, maybe you can raise the communication level by explaining how we
> can support NPTL in a compatible way. I don't see any, but maybe you know
> more than me.

  Have you tried asking the main NPTL developers? Ingo Molnar and Ulrich
Drepper will probably give precise answers if you ask precise questions.
And if they say that trying to get this working in 2.4 is impossible or
too much work/mess for the timeframe left with the 2.4 kernel, then you
have something to answer to enquiring users without sounding like
someone just bashing their work!

Daniel

--
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
vei...@re...         | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
|
|
From: Dan K. <da...@ke...> - 2003-11-14 17:21:58
|
Dirk Mueller wrote:
> So you say that
>
> a) the version of NPTL in Redhat 9 is still current with the development tree
> b) the version of NPTL in Redhat 9 is fully binary and behaviour compatible
>    with the stuff that went into kernel 2.6
> c) the version of NPTL in Redhat 9 is without bugs
>
> ?
>
> Let's just say that this was an extremely biased answer.
>
> About the "no idea why": I don't think this belongs on a public mailing list,
> but since you started dragging it into it, I have to give it an answer: I find
> it unacceptable to do such a massive ABI change in a "vendor" kernel that is
> labeled as "linux 2.4". What you ship is not anything near linux 2.4.
> Instead you expect that every 3rd party developer will be happy and full of
> joy that you again managed to release a distro that is in core parts
> completely incompatible with any (!) other distro out there. That alone
> wouldn't be too bad (you could claim that you're technically ahead of other
> distributions), but the fact that it's not even compatible with what went
> into the vanilla kernel makes it a bad decision.

? The compatibility doesn't seem too bad to me. *Someone* had to
take the hit and be the first to switch to NPTL etc., and I'm
really glad Red Hat did it. Yes, there was some disruption,
but it was minimal and responsibly handled, and worth it.

> But after redhat-gcc and
> redhat-glibc it was just a matter of time until there would be a
> redhat-kernel.

Aha, you're still pissed about gcc-2.96-rh, aren't you?
My personal advice is: let it go. Red Hat had to ship that;
they supported it well; and they moved to a standard gcc when that
was ready. I fully support what Red Hat did (except perhaps
for naming).

Apologies for replying to an off-topic thread...
- Dan
|
|
From: Jeremy F. <je...@go...> - 2003-11-14 16:43:16
|
On Fri, 2003-11-14 at 05:41, Dimitri Papadopoulos-Orfanos wrote:
> Hi,
>
> I'm trying to run valgrind-2.0.0 on a program that makes use of OpenGL
> on Red Hat Linux 9. This is a dual-processor Dell with NVidia card and
> latest drivers 1.0-4496. Valgrind dies. Any clue?

NVidia install both TLS and non-TLS versions of their libraries, but
always seem to use the TLS version, even if you use
LD_ASSUME_KERNEL=2.4.1 (as we do in the valgrind script). You can force
it to use the non-TLS versions by hiding the TLS ones:

        $ cd /usr/lib/tls
        $ mkdir hide
        $ mv libGL* hide

This works for me - but you need to make sure you always use
LD_ASSUME_KERNEL=2.4.1 when you run your code without Valgrind.

BTW, the current 2.0.0 release (Nov 11) implements enough SSE to work
without the GENERIC_CPU setting, at least for glxgears.

        J
|
|
From: Bill R. Jr. <bru...@te...> - 2003-11-14 17:15:03
|
On Fri, Nov 14, 2003 at 08:43:14AM -0800, Jeremy Fitzhardinge wrote:
> $ cd /usr/lib/tls
> $ mkdir hide
> $ mv libGL* hide
>
> This works for me - but you need to make sure you always use
> LD_ASSUME_KERNEL=2.4.1 when you run your code without Valgrind.
You can avoid this by running valgrind in a private namespace,
and using bind-mounts to cover/hide the libraries.
There is as yet no standard exec-chaining utility (ala time(1),
nice(1), etc.) for creating a private namespace, but it is rather
trivial:
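#define _GNU_SOURCE
#include <sched.h>        /* CLONE_NEWNS */
#include <signal.h>       /* SIGCHLD */
#include <unistd.h>       /* syscall() */
#include <sys/syscall.h>  /* SYS_clone */
...
/* A raw clone with a null stack behaves like fork(), except that the
   child gets a private copy of the mount namespace (needs root). */
long pid = syscall(SYS_clone, CLONE_NEWNS | SIGCHLD, 0);
if (pid == 0)
...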
In your new namespace, just do
sudo mount -n --bind ...
to avoid mucking with /etc/mtab.
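
A fuller sketch of such an exec-chaining helper follows, written against
interfaces that postdate this thread: it assumes a kernel that provides the
unshare system call (added in Linux 2.6.16) rather than the raw clone shown
above. The function names hide_dir_privately and exec_with_hidden_dir are
made up for illustration, and the step that marks / as private is only
needed on systems where mounts are shared by default.

```c
#include <sys/mount.h>    /* mount(), MS_BIND, MS_REC, MS_PRIVATE */
#include <sys/syscall.h>  /* SYS_unshare */
#include <unistd.h>       /* syscall(), execvp() */

#ifndef CLONE_NEWNS
#define CLONE_NEWNS 0x00020000  /* new mount namespace, same value as above */
#endif

/* Hypothetical helper: move the calling process into a private mount
   namespace and bind-mount cover_dir on top of hidden_dir.
   Returns 0 on success, -1 on failure (errno is set). Needs root. */
int hide_dir_privately(const char *cover_dir, const char *hidden_dir)
{
    if (syscall(SYS_unshare, CLONE_NEWNS) != 0)
        return -1;
    /* Keep mount events from propagating back to the parent namespace
       on systems where / is mounted shared. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) != 0)
        return -1;
    /* mount(2) itself never writes /etc/mtab, so there is no -n to pass. */
    if (mount(cover_dir, hidden_dir, NULL, MS_BIND, NULL) != 0)
        return -1;
    return 0;
}

/* Hide hidden_dir, then replace this process with argv[0].
   Only returns (with -1) if something failed. */
int exec_with_hidden_dir(const char *cover_dir, const char *hidden_dir,
                         char *const argv[])
{
    if (hide_dir_privately(cover_dir, hidden_dir) != 0)
        return -1;
    execvp(argv[0], argv);
    return -1;
}
```

A caller would pass an empty directory as cover_dir, /usr/lib/tls as
hidden_dir, and the valgrind command line as argv. Note that covering the
whole directory hides more than the libGL* files; a bind mount can also
cover a single file if that is preferred.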
Regards,
Bill Rugolsky
|
|
From: Tom H. <th...@cy...> - 2003-11-14 16:51:48
|
In message <106...@ix...>
Jeremy Fitzhardinge <je...@go...> wrote:
> > I posted a message a while back explaining that I'm not convinced it
> > is possible with the current threading model - I have tried.
>
> Actually, there seem to be two distinct issues here. There's NPTL and
> there's TLS. Are you saying we can't implement TLS without doing all
> the other NPTL stuff?
I can't even get TLS going properly...
I have a version of valgrind that can handle the GDT references
and maintain a mirror of the kernel's thread area table. It doesn't
currently handle set_thread_area and get_thread_area properly but
that would be easy to do.
In fact it will run single threaded programs under NPTL without any
problems as far as I can see.
The problem that I encountered is knowing how to clone the TLS data
when a new thread is started. The TLS data for the first thread is
initialised by ld.so before valgrind gets control so I just use the
get_thread_area system call to ask the kernel where it is.
When a new thread is started the NPTL libpthread creates a new
thread descriptor which seems to include a new copy of the TLS data
from what I could see. It then passes the pointer to that new block
to the clone call and clone sets up the kernel's thread area pointer
for the new thread.
As valgrind has its own libpthread we would have to work out how
to set up the thread area for the new thread, which seems to be very
complicated as the GDT entry set in the kernel seems to point at a
thread descriptor structure which in turn points at the TLS data
and so on.
The other bits of NPTL shouldn't be a problem should they? It just
means implementing things like the futex system call.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Jeremy F. <je...@go...> - 2003-11-14 17:04:08
|
On Fri, 2003-11-14 at 08:51, Tom Hughes wrote:
> As valgrind has its own libpthread we would have to work out how
> to set up the thread area for the new thread, which seems to be very
> complicated as the GDT entry set in the kernel seems to point at a
> thread descriptor structure which in turn points at the TLS data
> and so on.

Urk. I've never really looked at x86 segmentation, let alone how TLS
uses it.

I'm working on changes to let us get control before ld-linux.so starts.
Would that help here? Actually, definitely, yes, because we can always
set things up to tell the app that there's no TLS here, nope, go about
your business.

Another thought: with the new syscalls stuff, every app has its own
cloned thread. Could we use those to hang kernel TLS state off in some
way?

> The other bits of NPTL shouldn't be a problem should they? It just
> means implementing things like the futex system call.

Well, more to the point, the other bits of NPTL don't have any
app-visible API or ABI changes, so we can handle them in our
libpthread. We don't need to emulate futex or clone unless someone
wants to use them directly.

        J
|
|
From: Tom H. <th...@cy...> - 2003-11-18 00:10:42
|
In message <106...@ix...>
Jeremy Fitzhardinge <je...@go...> wrote:
> There's the kernel mechanism for setting up a thread-local storage area,
> using the set_thread_area syscall. The argument to this is a segment
> descriptor, like the one used for modify_ldt. This segment descriptor is
> assigned to one of the 3 TLS entries in the GDT. On thread context
> switch, the kernel reassigns the GDT entries to the thread's TLS
> segment. The thread itself assigns that descriptor to one of its
> segment registers, and uses %seg:0 as the pointer to its TLS area.
>
> This is easy for us to implement, since Julian has already done the hard
> work. We can store a per-thread "GDT" containing only these entries as
> part of each Thread structure. In VG_(do_useseg)() we just look for the
> GDT (vs LDT) bit in the segment selector and do the appropriate thing.
Indeed, and that's the bit that I have working.
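
For readers following along, here is a minimal sketch of that selector
test. It is not the valgrind code: seg_selector, decode_selector,
toy_thread and toy_tls_base are names made up for illustration, and the
TLS descriptors are assumed to sit at GDT indices 6 to 8, as on i386
Linux kernels. An x86 selector keeps the requested privilege level in
bits 0-1, the table indicator in bit 2 (0 = GDT, 1 = LDT) and the table
index in bits 3-15, so the assertion `(seg_selector & 7) == 7` accepts
only ring-3 LDT selectors.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical names throughout; an illustration, not valgrind's code. */
enum seg_table { SEG_GDT = 0, SEG_LDT = 1 };

struct seg_selector {
    unsigned rpl;          /* bits 0-1: requested privilege level */
    enum seg_table table;  /* bit 2: 0 = GDT, 1 = LDT */
    unsigned index;        /* bits 3-15: index into that table */
};

/* Assumed i386 Linux layout: three TLS descriptors at GDT slots 6, 7, 8. */
#define TLS_GDT_MIN   6u
#define TLS_GDT_COUNT 3u

struct toy_thread {
    uint32_t tls_base[TLS_GDT_COUNT];  /* per-thread mirror of the TLS bases */
};

struct seg_selector decode_selector(uint16_t sel)
{
    struct seg_selector s;
    s.rpl = sel & 3u;
    s.table = (sel & 4u) ? SEG_LDT : SEG_GDT;
    s.index = (unsigned)sel >> 3;
    return s;
}

/* Returns 0 and stores the segment base if sel names one of the thread's
   TLS slots; returns -1 for anything else (LDT selectors, other GDT slots). */
int toy_tls_base(const struct toy_thread *t, uint16_t sel, uint32_t *base)
{
    struct seg_selector s = decode_selector(sel);
    if (s.table != SEG_GDT)
        return -1;
    if (s.index < TLS_GDT_MIN || s.index >= TLS_GDT_MIN + TLS_GDT_COUNT)
        return -1;
    *base = t->tls_base[s.index - TLS_GDT_MIN];
    return 0;
}
```

Under these assumptions a GDT-based TLS selector such as 0x33 decodes to
index 6 at privilege level 3, so its low three bits are 3 rather than 7,
which is the kind of value that trips the assertion in vg_ldt.c.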
> All this is to support the new TLS extensions to the ABI. These are
> described in detail in http://people.redhat.com/drepper/tls.pdf. I've
> read this once, but I still don't understand the details.
I did look at it a while ago, but I seem to recall finding it
similarly difficult to grasp fully at first reading.
> The essential point is that ELF files can now have a PT_TLS segment
> which is used as a prototype segment for thread-local variables. When a
> new thread is created, it effectively gets a new copy of the contents of
> the PT_TLS segment attached to its own thread area (pointed to via
> %gs). This is done lazily, so only the TLS segments which the thread
> actually uses are copied for it.
There can actually be multiple TLS segments in the ELF file - there
are typically .tdata and .tbss from what I can see.
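
One way to check what a given binary actually carries is to walk its
program headers and look for PT_TLS. The sketch below does that; the
helper name tls_segment_memsz is made up, and it only handles ELF files
of the machine's native class and byte order, so treat it as an
illustration rather than a general ELF reader.

```c
#include <elf.h>
#include <link.h>    /* ElfW(): picks Elf32_* or Elf64_* for this machine */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the memory size of the PT_TLS segment of
   the ELF file at path, 0 if it has none, or -1 on error or when the
   file is not of the native ELF class. Native byte order is assumed. */
long tls_segment_memsz(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    long result = -1;
    ElfW(Ehdr) eh;
    if (fread(&eh, sizeof eh, 1, f) != 1)
        goto out;
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        goto out;
    /* Only accept files whose class matches the native ElfW() structs. */
    if (eh.e_ident[EI_CLASS] != (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32))
        goto out;
    if (eh.e_phentsize != sizeof(ElfW(Phdr)))
        goto out;

    result = 0;
    for (unsigned i = 0; i < (unsigned)eh.e_phnum; i++) {
        ElfW(Phdr) ph;
        long off = (long)(eh.e_phoff + (unsigned long)i * sizeof ph);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;
            break;
        }
        if (ph.p_type == PT_TLS) {
            /* p_filesz covers initialised data (.tdata); p_memsz also
               includes the zero-filled part (.tbss). */
            result = (long)ph.p_memsz;
            break;
        }
    }
out:
    fclose(f);
    return result;
}
```

Running it over the executable and each shared library it loads shows
which objects need a thread-local block and how large the initial image
is for each.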
In fact only the base executable seems to use direct %gs references. For
example, when xxx is a thread-local variable, this code:

    xxx++;

is compiled to the following code in the object file:

    10: 8d 04 1d 00 00 00 00    lea    0x0(,%ebx,1),%eax
    17: e8 fc ff ff ff          call   18 <thread_main+0x18>
    1c: ff 00                   incl   (%eax)

but when linked into an executable the linker turns that into:

    8048540: 65 a1 00 00 00 00    mov    %gs:0x0,%eax
    8048546: 81 e8 04 00 00 00    sub    $0x4,%eax
    804854c: ff 00                incl   (%eax)

In a shared object the linker leaves it more or less alone, other
than applying relocations:

    8c4: 8d 04 1d 2c 00 00 00    lea    0x2c(,%ebx,1),%eax
    8cb: e8 f8 fe ff ff          call   7c8 <_init+0x88>
    8d0: ff 00                   incl   (%eax)

The function being called is ___tls_get_addr and the lea is setting
up some sort of index into the TLS segment.
> This means that there's cooperation between the dynamic linker and
> libpthread. Since we control the one but not the other, we need to make
> our libpthread compatible with the dynamic linker's TLS implementation.
> The ABI documents some of this, but unfortunately it only documents the
> compiler interface to this goo, but not the internal interfaces between
> libpthread and the ld.so. The easy part of this is making sure that new
> threads get their own new TLS areas (which the glibc libpthread does by
> passing CLONE_SETTLS to clone(), which is the equivalent of doing
> set_thread_area() in the new thread). Trickier is getting the details
> of all the other structures right. And since they're internal to glibc,
> there's no certainty they won't change from release to release...
Even the cloning is hard because although we have the pointer to
the thread area for the original thread we have no idea how big
that area is because ld.so seems to set it up with a size of -1 so
the area is effectively unlimited.
In fact I believe the address passed to the kernel as the thread
area pointer is a pointer to the thread descriptor structure, with
the TLS data for the main executable just below it so that negative
offsets from %gs will find it. Other TLS data is found from the
thread descriptor somehow by the __tls_get_addr function.
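
To make that layout concrete, here is a toy model of it in plain C. It
is only a sketch: struct toy_tcb and the toy_* functions are made up,
the real glibc thread descriptor is internal and far larger, and the
only property modelled is the one visible in the linked executable's
code, namely a self-pointer at offset 0 with the main executable's TLS
data directly below it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the thread descriptor. */
struct toy_tcb {
    void  *self;      /* what "mov %gs:0x0,%eax" would read */
    size_t tls_size;  /* bookkeeping for this sketch only */
};

/* Round the TLS image size up so the descriptor stays aligned. */
static size_t toy_padded(size_t tls_size)
{
    size_t align = _Alignof(struct toy_tcb);
    return (tls_size + align - 1) / align * align;
}

/* Allocates [ padding | copy of TLS image | toy_tcb ] and returns the
   "thread pointer", i.e. the address of the descriptor. */
struct toy_tcb *toy_thread_area_new(const void *tls_image, size_t tls_size)
{
    size_t padded = toy_padded(tls_size);
    unsigned char *block = malloc(padded + sizeof(struct toy_tcb));
    if (!block)
        return NULL;
    memcpy(block + padded - tls_size, tls_image, tls_size);
    struct toy_tcb *tcb = (struct toy_tcb *)(block + padded);
    tcb->self = tcb;
    tcb->tls_size = tls_size;
    return tcb;
}

/* Mirrors "mov %gs:0x0,%eax; sub $offset,%eax": variables live at
   negative offsets from the thread pointer. */
void *toy_tls_addr(struct toy_tcb *tp, size_t offset_below_tp)
{
    assert(offset_below_tp <= tp->tls_size);
    return (unsigned char *)tp->self - offset_below_tp;
}

void toy_thread_area_free(struct toy_tcb *tp)
{
    free((unsigned char *)tp - toy_padded(tp->tls_size));
}
```

Two areas created from the same image give each "thread" its own copy of
xxx at thread pointer minus 4, so incrementing one leaves the other and
the prototype image untouched. What this toy cannot show is the hard
part discussed here: how ld.so and libpthread agree on the size of the
block and on where the TLS data of other shared objects lives.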
Trying to emulate the whole thing would be horrible, but given
the incestuous links between ld.so, libc and libpthread it's hard
to see how else it can be done. It would also be a maintenance
nightmare of course, as you say.
> BTW, this is completely independent of NPTL. The TLS stuff is an
> extension to the ABI which is also supported by the pre-NPTL threads
> library (though I'm not sure how they make do without the
> set_thread_area syscall).
I don't think they do make do without it, which is why valgrind
falls over if you try and load any program or library with a TLS
section in the ELF file. Setting LD_ASSUME_KERNEL causes ld.so to
use a glibc built without using TLS segments.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Jeremy F. <je...@go...> - 2003-11-18 01:42:50
|
On Mon, 2003-11-17 at 16:10, Tom Hughes wrote:
> There can actually be multiple TLS segments in the ELF file - there
> are typically .tdata and .tbss from what I can see.

There are both .tdata and .tbss sections, but they both get mapped to
the one PT_TLS segment.

> In fact only the base executable seems to use direct %gs references. For
> example, when xxx is a thread-local variable, this code:
>
> xxx++;
[...]
> The function being called is ___tls_get_addr and the lea is setting
> up some sort of index into the TLS segment.

Yep, ___tls_get_addr is a function defined by the ABI to exist so the
compiler can call it. There are many different types of reloc possible,
depending on how much static info the compiler/linker has (like whether
the code accessing the variable is necessarily in the same shared object
or not). tls.pdf goes into (vast) detail about it.

> Even the cloning is hard because although we have the pointer to
> the thread area for the original thread we have no idea how big
> that area is because ld.so seems to set it up with a size of -1 so
> the area is effectively unlimited.

Well, if we already have enough interaction with ld.so anyway, we can
probably work out the size of the TLS chunk. I think it is
sizeof(tcbheader_t). And clone() doesn't copy anything; it just sets up
a new thread area at the given address, which libpthread points to its
struct pthread.

> In fact I believe the address passed to the kernel as the thread
> area pointer is a pointer to the thread descriptor structure, with
> the TLS data for the main executable just below it so that negative
> offsets from %gs will find it. Other TLS data is found from the
> thread descriptor somehow by the __tls_get_addr function.

Yup, the ABI says something like that. But it is also part of the
thread stack, so I'm not quite sure how much stuff is there.
___tls_get_addr is designed to hide a fair amount of the detail, and the
compiler isn't allowed to make too many assumptions.

> Trying to emulate the whole thing would be horrible, but given
> the incestuous links between ld.so, libc and libpthread it's hard
> to see how else it can be done. It would also be a maintenance
> nightmare of course, as you say.

Yes. It might be that the only really sane way of doing it is to get
some officially sanctioned hooks into glibc so that we can get enough
information without too much interdependence.

> I don't think they do make do without it, which is why valgrind
> falls over if you try and load any program or library with a TLS
> section in the ELF file. Setting LD_ASSUME_KERNEL causes ld.so to
> use a glibc built without using TLS segments.

Does that mean that code using TLS just SEGVs on a pre-TLS kernel?

        J
|