|
From: Julian S. <js...@ac...> - 2005-03-16 20:39:26
Attachments:
ooo-hang-at-exit.txt.bz2
|
Jeremy
I should have tried this earlier, but ... anyway, OOo 1.1.3
(standard download from openoffice.org, not a SuSE build)
hangs at exit on SuSE 9.1 on 2.4.0rc4. I attach a trace of
syscalls from the point where I clicked on 'X' to get rid of
it, till the 'stuck in this state forever' point.
Any ideas?
J
|
|
From: Jeremy F. <je...@go...> - 2005-03-16 22:36:18
|
Julian Seward wrote:
>I should have tried this earlier, but ... anyway, OOo 1.1.3
>(standard download from openoffice.org, not a SuSE build)
>hangs at exit on SuSE 9.1 on 2.4.0rc4. I attach a trace of
>syscalls from the point where I clicked on 'X' to get rid of
>it, till the 'stuck in this state forever' point.
>
>Any ideas?
>
Hard to tell, really. To be honest, it looks like an application bug.
Two threads remain; thread 2 is the LinuxThreads manager thread, so it
isn't going to go away until the other thread, thread 5, dies. Thread 5
seems to just be looping waiting for an FD which never happens. It
would be interesting to attach strace to thread 5 in this state to see
what FD it is actually waiting on.
Which tool? How are you starting OOo? I can't reproduce it with the
FC3 1.1.3 OOo:
LD_ASSUME_KERNEL=2.4.1 valgrind --tool=none --trace-children=yes oowriter
The 2.0 Beta OOo seems OK too (downloaded from openoffice.org).
J
|
|
From: Paul M. <pa...@sa...> - 2005-03-17 00:29:07
|
Jeremy Fitzhardinge writes:
> Hard to tell, really. To be honest, it looks like an application bug.
> Two threads remain; thread 2 is the LinuxThreads manager thread, so it
> isn't going to go away until the other thread, thread 5, dies. Thread 5
> seems to just be looping waiting for an FD which never happens. It
> would be interesting to attach strace to thread 5 in this state to see
> what FD it is actually waiting on.
Ummm, hasn't some other thread done an exit_group()? That is supposed
to kill all the threads in the current process. We shouldn't expect
every thread to do exit_group.
Paul.
|
|
From: Jeremy F. <je...@go...> - 2005-03-17 01:10:04
|
Paul Mackerras wrote:
>Jeremy Fitzhardinge writes:
>
>
>
>>Hard to tell, really. To be honest, it looks like an application bug.
>>Two threads remain; thread 2 is the LinuxThreads manager thread, so it
>>isn't going to go away until the other thread, thread 5, dies. Thread 5
>>seems to just be looping waiting for an FD which never happens. It
>>would be interesting to attach strace to thread 5 in this state to see
>>what FD it is actually waiting on.
>>
>>
>
>Ummm, hasn't some other thread done an exit_group()? That is supposed
>to kill all the threads in the current process. We shouldn't expect
>every thread to do exit_group.
>
This is LinuxThreads, so each thread is in its own thread group as far
as the kernel is concerned. NPTL threads are all killed by exit_group().
J
|
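The thread-group point above can be sketched as a toy model. This is only an illustration of the rule being described (exit_group() ends the threads that share the caller's thread group), not kernel or Valgrind code. The `Thread` class, the function name, and tid 4720 are hypothetical; 4719 and 4743 are borrowed from the trace purely as labels.

```python
# Toy model only: thread-group IDs are assigned by hand, not read from a kernel.
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    tid: int    # kernel thread ID
    tgid: int   # thread-group ID; exit_group() acts on this


def survivors_after_exit_group(threads, caller_tid):
    """Return the tids still alive after caller_tid calls exit_group()."""
    caller = next(t for t in threads if t.tid == caller_tid)
    return sorted(t.tid for t in threads if t.tgid != caller.tgid)


# LinuxThreads style: no CLONE_THREAD, so every thread is its own group.
linuxthreads = [Thread(tid=n, tgid=n) for n in (4719, 4720, 4743)]
# NPTL style: CLONE_THREAD, so all threads share the creator's group.
nptl = [Thread(tid=n, tgid=4719) for n in (4719, 4720, 4743)]

print(survivors_after_exit_group(linuxthreads, 4719))  # [4720, 4743]
print(survivors_after_exit_group(nptl, 4719))          # []
```

In the LinuxThreads-style case the other two threads outlive the caller, which is the shape of the hang under discussion.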
|
From: Julian S. <js...@ac...> - 2005-03-17 11:40:56
|
> Hard to tell, really. To be honest, it looks like an application bug.
But it exits normally when run natively, and it also works fine on 2.2.0.
> Which tool? How are you starting OOo? I can't reproduce it with the
> FC3 1.1.3 OOo:
~/Vg240/valgrind/Inst/bin/valgrind --trace-children=yes --tool=none ~/OpenOffice.org1.1.3/soffice
J
|
|
From: Julian S. <js...@ac...> - 2005-03-17 12:25:57
|
Is FC3 LinuxThreads or NPTL ? If NPTL, do you have a LinuxThreads
system you can try reproducing this on? Is setting LD_ASSUME_KERNEL=2.4.1
really exactly the same as running it on a LinuxThreads-only system?
> Hard to tell, really. To be honest, it looks like an application bug.
> Two threads remain; thread 2 is the LinuxThreads manager thread, so it
> isn't going to go away until the other thread, thread 5, dies. Thread 5
> seems to just be looping waiting for an FD which never happens. It
> would be interesting to attach strace to thread 5 in this state to see
> what FD it is actually waiting on.
It looks like fd 12.
poll([{fd=12, events=POLLIN}], 1, 1000) = 0
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
gettid() = 10505
read(1017, "T", 2) = 1
getpid() = 10505
write(1016, "SYSCALL[10505,5](168) --> 0 (0x0"..., 34) = 34
getpid() = 10505
write(1016, "SYSCALL[10505,5]( 78):", 22) = 22
write(1016, "sys_gettimeofday ( 0xAEFFFA04, 0"..., 36) = 36
gettimeofday({1111060117, 380070}, NULL) = 0
write(1016, " --> 0 (0x0)\n", 13) = 13
getpid() = 10505
write(1016, "SYSCALL[10505,5](168) mayBlock:", 31) = 31
write(1016, "sys_poll ( 0xAEFFF94C, 1, 1000 )"..., 33) = 33
write(1016, " --> ...\n", 9) = 9
gettid() = 10505
write(1018, "T", 1) = 1
rt_sigprocmask(SIG_SETMASK, [RTMIN RT_31], ~[ILL TRAP BUS FPE KILL SEGV
STOP], 8) = 0
poll(
In the (non-truncated version of the) log I sent yesterday,
fd 12 is used many times (open, mmap, close). The last
place it appears to have been *created* is
SYSCALL[4719,1](102) mayBlock:sys_socketcall ( 1, 0xAFEFCD40 ) --> ...
SYSCALL[4719,1](102) --> 12 (0xC)
The last place I can see it referenced is:
SYSCALL[4743,5]( 54) mayBlock:sys_ioctl ( 12, 0x541B (type=54, nr=1B, size=0),
0x2D ) --> ...
SYSCALL[4743,5]( 54) --> 0 (0x0)
SYSCALL[4743,5]( 3) mayBlock:sys_read ( 12, 0xAEFFF0F4, 128 ) --> ...
SYSCALL[4743,5]( 3) --> 128 (0x80)
SYSCALL[4743,5]( 54) mayBlock:sys_ioctl ( 12, 0x541B (type=54, nr=1B, size=0),
0x2D ) --> ...
SYSCALL[4743,5]( 54) --> 0 (0x0)
So I'm none the wiser.
Ioctl 0x541B is FIONREAD.
I'd prefer not to ship 2.4.0 with this bug in, if we can resolve it
relatively quickly. What else can I do to help you repro it?
J
|
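The "(type=54, nr=1B, size=0)" annotation in the trace is just the ioctl request number split into bit fields. A minimal sketch of that arithmetic follows, assuming the common Linux layout used on x86 (8-bit command number, 8-bit type, 14-bit size); other architectures may lay the fields out differently, and `decode_ioctl` is a name made up for this example.

```python
def decode_ioctl(request):
    """Split an ioctl request number into (type, nr, size) fields."""
    nr = request & 0xFF              # bits 0-7: command number
    type_ = (request >> 8) & 0xFF    # bits 8-15: "magic" type byte
    size = (request >> 16) & 0x3FFF  # bits 16-29: argument size
    return type_, nr, size


type_, nr, size = decode_ioctl(0x541B)
print(f"type={type_:X}, nr={nr:X}, size={size}")  # type=54, nr=1B, size=0
```

Type 0x54 is the letter 'T', the prefix used by the terminal-style ioctls, which is consistent with 0x541B being FIONREAD.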
|
From: Jeremy F. <je...@go...> - 2005-03-17 16:53:19
|
Julian Seward wrote:
>Is FC3 LinuxThreads or NPTL ? If NPTL, do you have a LinuxThreads
>system you can try reproducing this on? Is setting LD_ASSUME_KERNEL=2.4.1
>really exactly the same as running it on a LinuxThreads-only system?
>
>
>
>
>>Hard to tell, really. To be honest, it looks like an application bug.
>>Two threads remain; thread 2 is the LinuxThreads manager thread, so it
>>isn't going to go away until the other thread, thread 5, dies. Thread 5
>>seems to just be looping waiting for an FD which never happens. It
>>would be interesting to attach strace to thread 5 in this state to see
>>what FD it is actually waiting on.
>>
>>
>
>It looks like fd 12.
>
> poll([{fd=12, events=POLLIN}], 1, 1000) = 0
> rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
> gettid() = 10505
> read(1017, "T", 2) = 1
> getpid() = 10505
> write(1016, "SYSCALL[10505,5](168) --> 0 (0x0"..., 34) = 34
> getpid() = 10505
> write(1016, "SYSCALL[10505,5]( 78):", 22) = 22
> write(1016, "sys_gettimeofday ( 0xAEFFFA04, 0"..., 36) = 36
> gettimeofday({1111060117, 380070}, NULL) = 0
> write(1016, " --> 0 (0x0)\n", 13) = 13
> getpid() = 10505
> write(1016, "SYSCALL[10505,5](168) mayBlock:", 31) = 31
> write(1016, "sys_poll ( 0xAEFFF94C, 1, 1000 )"..., 33) = 33
> write(1016, " --> ...\n", 9) = 9
> gettid() = 10505
> write(1018, "T", 1) = 1
> rt_sigprocmask(SIG_SETMASK, [RTMIN RT_31], ~[ILL TRAP BUS FPE KILL SEGV
>STOP], 8) = 0
> poll(
>
>In the (non-truncated version of the) log I sent yesterday,
>fd 12 is used many times (open, mmap, close). The last
>place it appears to have been *created* is
>
>SYSCALL[4719,1](102) mayBlock:sys_socketcall ( 1, 0xAFEFCD40 ) --> ...
>SYSCALL[4719,1](102) --> 12 (0xC)
>
>The last place I can see it referenced is:
>
>SYSCALL[4743,5]( 54) mayBlock:sys_ioctl ( 12, 0x541B (type=54, nr=1B, size=0),
>0x2D ) --> ...
>SYSCALL[4743,5]( 54) --> 0 (0x0)
>SYSCALL[4743,5]( 3) mayBlock:sys_read ( 12, 0xAEFFF0F4, 128 ) --> ...
>SYSCALL[4743,5]( 3) --> 128 (0x80)
>SYSCALL[4743,5]( 54) mayBlock:sys_ioctl ( 12, 0x541B (type=54, nr=1B, size=0),
>0x2D ) --> ...
>SYSCALL[4743,5]( 54) --> 0 (0x0)
>
>So I'm none the wiser.
>
>Ioctl 0x541B is FIONREAD.
>
>I'd prefer not to ship 2.4.0 with this bug in, if we can resolve it
>relatively quickly. What else can I do to help you repro it?
>
>
Also, a full --trace-syscalls=yes --trace-signals=yes output would be
useful.
Something strange is definitely happening on your system. I'm pretty
sure fd 12 is the connection to the X server; it's a red herring.
What's supposed to happen is that thread 1 tells the manager to shut
everything down. In your trace, it just exits quietly:
SYSCALL[4719,1](174) special:sys_rt_sigaction ( 3, 0xAFEFDB70, 0x0, 8 )
--> 0 (0x0)
SYSCALL[4719,1](174) special:sys_rt_sigaction ( 2, 0xAFEFDB70, 0x0, 8 )
--> 0 (0x0)
SYSCALL[4719,1](174) special:sys_rt_sigaction ( 1, 0xAFEFDB70, 0x0, 8 )
--> 0 (0x0)
SYSCALL[4719,1]( 91):sys_munmap ( 0x3DA0E000, 65536 ) --> 0 (0x0)
SYSCALL[4719,1]( 6):sys_close ( 18 ) --> 0 (0x0)
SYSCALL[4719,1]( 6):sys_close ( 13 ) --> 0 (0x0)
SYSCALL[4719,1](252) special:exit_group( 0 ) --> 252 (0xFC)
In my trace, thread 1 sends messages to the manager who in turn kills
off the other thread and itself before thread 1 finally exits.
I wonder if this is another instance of a strange vendor-special
LinuxThreads/NPTL hybrid? The full trace will be interesting, to see
how the threads are created.
J
|
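Checking which threads issued a given syscall in a long --trace-syscalls log is easy to script. The sketch below assumes the line shapes quoted in this thread ("SYSCALL[pid,tid](nr) flags:name ( args ) --> result"); the regular expression and the `callers_of` helper are illustrative, not part of Valgrind.

```python
import re

# Matches lines such as "SYSCALL[4719,1](252) special:exit_group( 0 ) --> ..."
# and "SYSCALL[4719,1]( 91):sys_munmap ( ... )". Result-only continuation
# lines ("SYSCALL[4743,5]( 3) --> 128 (0x80)") have no colon and are skipped.
SYSCALL_RE = re.compile(r"SYSCALL\[(\d+),(\d+)\]\(\s*(\d+)\)[^:]*:(\w+)")


def callers_of(trace, name):
    """Return the set of (pid, tid) pairs that issued syscall `name`."""
    found = set()
    for line in trace.splitlines():
        m = SYSCALL_RE.match(line.strip())
        if m and m.group(4) in (name, "sys_" + name):
            found.add((int(m.group(1)), int(m.group(2))))
    return found


sample = """\
SYSCALL[4719,1]( 91):sys_munmap ( 0x3DA0E000, 65536 ) --> 0 (0x0)
SYSCALL[4719,1]( 6):sys_close ( 18 ) --> 0 (0x0)
SYSCALL[4743,5]( 3) mayBlock:sys_read ( 12, 0xAEFFF0F4, 128 ) --> ...
SYSCALL[4743,5]( 3) --> 128 (0x80)
SYSCALL[4719,1](252) special:exit_group( 0 ) --> 252 (0xFC)
"""
print(callers_of(sample, "exit_group"))  # {(4719, 1)}
```

On this sample only thread 1 calls exit_group, matching the observation that thread 5 never gets told to stop.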
|
From: Julian S. <js...@ac...> - 2005-03-17 17:40:49
|
> >I'd prefer not to ship 2.4.0 with this bug in, if we can resolve it
> >relatively quickly. What else can I do to help you repro it?
>
> Also, a full --trace-syscalls=yes --trace-signals=yes output would be
> useful.
Done. I'll send it separately as the mailing list doesn't like
messages over 50k.
> I wonder if this is another instance of a strange vendor-special
> LinuxThreads/NPTL hybrid? The full trace will be interesting, to see
> how the threads are created.
Maybe. I was surprised to find recently that SuSE 9.1 (x86) didn't seem
to be a 'normal' NPTL system; I thought it was. MichaelM, can you clarify?
J
|
|
From: Jeremy F. <je...@go...> - 2005-03-17 17:58:32
|
Julian Seward wrote:
>
>
>>I wonder if this is another instance of a strange vendor-special
>>LinuxThreads/NPTL hybrid? The full trace will be interesting, to see
>>how the threads are created.
>>
>>
>
>Maybe. I was surprised to find recently that SuSE 9.1 (x86) didn't seem
>to be a 'normal' NPTL system; I thought it was. MichaelM, can you clarify?
>
>
If it were really expecting exit_group() to kill all the threads, then
I'd expect to see a lot more MT programs fail to exit on this system.
Or perhaps programs mostly clean up their threads before calling exit()?
J
|
|
From: Michael M. <ma...@su...> - 2005-03-18 14:37:35
|
Hi,
On Thu, 17 Mar 2005, Julian Seward wrote:
> > I wonder if this is another instance of a strange vendor-special
> > LinuxThreads/NPTL hybrid? The full trace will be interesting, to see
> > how the threads are created.
>
> Maybe. I was surprised to find recently that SuSE 9.1 (x86) didn't seem
> to be a 'normal' NPTL system; I thought it was. MichaelM, can you clarify?
I've asked some kernel people here. I'm not aware of any non-standard
changes of the kernel itself. AFAIK the 9.1 is as normal NPTL as it gets
;) OTOH I'm no expert in that. I've looked at the archives for this
thread, and just have one additional clarification: 9.1 uses the kernel
2.6. Without any LD_ASSUME_KERNEL hackery it will use the
/usr/lib/nptl/ libs (one should perhaps ensure that this is the case). At
least in the beginning of this thread it seems as if Jeremy was arguing
from a 9.1-is-linuxthreads perspective.
Ciao,
Michael.
|
|
From: Jeremy F. <je...@go...> - 2005-03-18 16:05:34
|
Michael Matz wrote:
>I've asked some kernel people here. I'm not aware of any non-standard
>changes of the kernel itself. AFAIK the 9.1 is as normal NPTL as it gets
>;) OTOH I'm no expert in that. I've looked at the archives for this
>thread, and just have one additional clarification: 9.1 uses the kernel
>2.6. Without any LD_ASSUME_KERNEL hackery it will use the
>/usr/lib/nptl/ libs (one should perhaps ensure that this is the case). At
>least in the beginning of this thread it seems as if Jeremy was arguing
>from a 9.1-is-linuxthreads perspective.
>
That's what the traces Julian sent me indicate; they're using the
non-NPTL forms of clone(), etc. Maybe there's just something strange
with his installation? OOo on his system is using /lib/pthread.so.
J
|
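One way to settle which thread library a process will get, independent of reading traces, is glibc's own report: on glibc versions that support it, `getconf GNU_LIBPTHREAD_VERSION` prints a string naming the implementation. The helper below is a small sketch that classifies such a string; the function name and the two sample strings are illustrative and were not taken from any machine in this thread.

```python
def classify_pthread_impl(version):
    """Classify a GNU_LIBPTHREAD_VERSION-style string."""
    text = version.strip().lower()
    if text.startswith("nptl"):
        return "NPTL"
    if text.startswith("linuxthreads"):
        return "LinuxThreads"
    return "unknown"


# Illustrative inputs only; real output depends on the installed glibc.
print(classify_pthread_impl("NPTL 2.3.3"))
print(classify_pthread_impl("linuxthreads-0.10"))
```

Running the getconf query both with and without LD_ASSUME_KERNEL set would show whether the variable actually changes the library selected on a given installation.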
|
From: Jeremy F. <je...@go...> - 2005-03-17 16:21:58
|
Julian Seward wrote:
>But it exits normally when run natively, and it also works fine on 2.2.0.
>
>
That doesn't preclude a race or some timing problem in their code. Does
it happen with other tools?
>Is FC3 LinuxThreads or NPTL ? If NPTL, do you have a LinuxThreads
>system you can try reproducing this on? Is setting LD_ASSUME_KERNEL=2.4.1
>really exactly the same as running it on a LinuxThreads-only system?
>
>
Well, it uses the same libpthread.so. The big difference is that it's
using a 2.6 kernel. I assume SuSE 9.1 is using some 2.4 kernel.
>>> Hard to tell, really. To be honest, it looks like an application bug.
>>> Two threads remain; thread 2 is the LinuxThreads manager thread, so it
>>> isn't going to go away until the other thread, thread 5, dies. Thread 5
>>> seems to just be looping waiting for an FD which never happens. It
>>> would be interesting to attach strace to thread 5 in this state to see
>>> what FD it is actually waiting on.
>>
>>
>
>It looks like fd 12.
>
> poll([{fd=12, events=POLLIN}], 1, 1000) = 0
> rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
> gettid() = 10505
> read(1017, "T", 2) = 1
> getpid() = 10505
> write(1016, "SYSCALL[10505,5](168) --> 0 (0x0"..., 34) = 34
> getpid() = 10505
> write(1016, "SYSCALL[10505,5]( 78):", 22) = 22
> write(1016, "sys_gettimeofday ( 0xAEFFFA04, 0"..., 36) = 36
> gettimeofday({1111060117, 380070}, NULL) = 0
> write(1016, " --> 0 (0x0)\n", 13) = 13
> getpid() = 10505
> write(1016, "SYSCALL[10505,5](168) mayBlock:", 31) = 31
> write(1016, "sys_poll ( 0xAEFFF94C, 1, 1000 )"..., 33) = 33
> write(1016, " --> ...\n", 9) = 9
> gettid() = 10505
> write(1018, "T", 1) = 1
> rt_sigprocmask(SIG_SETMASK, [RTMIN RT_31], ~[ILL TRAP BUS FPE KILL SEGV
>STOP], 8) = 0
> poll(
>
>In the (non-truncated version of the) log I sent yesterday,
>fd 12 is used many times (open, mmap, close). The last
>place it appears to have been *created* is
>
>SYSCALL[4719,1](102) mayBlock:sys_socketcall ( 1, 0xAFEFCD40 ) --> ...
>SYSCALL[4719,1](102) --> 12 (0xC)
>
>The last place I can see it referenced is:
>
>SYSCALL[4743,5]( 54) mayBlock:sys_ioctl ( 12, 0x541B (type=54, nr=1B, size=0),
>0x2D ) --> ...
>SYSCALL[4743,5]( 54) --> 0 (0x0)
>SYSCALL[4743,5]( 3) mayBlock:sys_read ( 12, 0xAEFFF0F4, 128 ) --> ...
>SYSCALL[4743,5]( 3) --> 128 (0x80)
>SYSCALL[4743,5]( 54) mayBlock:sys_ioctl ( 12, 0x541B (type=54, nr=1B, size=0),
>0x2D ) --> ...
>SYSCALL[4743,5]( 54) --> 0 (0x0)
>
>So I'm none the wiser.
>
>Ioctl 0x541B is FIONREAD.
>
>
I guess we need to find out what's at the other end of the socket. Just
after its creation, what other socket operations happen on it? It would
be useful to have strace output, since it gives more detail about what
the syscall args are.
Are there any other processes sitting around which it might be trying to
talk to? "lsof" might give a clue.
>I'd prefer not to ship 2.4.0 with this bug in, if we can resolve it
>relatively quickly. What else can I do to help you repro it?
>
>
This doesn't strike me as a showstopper bug: it seems easy to work
around (you can just ^C the process, yes?), it doesn't crash and it
doesn't seem to be affecting many people. If we can track it down in
the near future and the fix is a one-liner then OK, but otherwise I
think 2.4.0 is cooked.
J
|
|
From: Nicholas N. <nj...@cs...> - 2005-03-17 17:25:38
|
On Thu, 17 Mar 2005, Jeremy Fitzhardinge wrote:
>> I'd prefer not to ship 2.4.0 with this bug in, if we can resolve it
>> relatively quickly. What else can I do to help you repro it?
>>
> This doesn't strike me as a showstopper bug: it seems easy to work
> around (you can just ^C the process, yes?), it doesn't crash and it
> doesn't seem to be affecting many people.
Does it preclude leak checking OOo?
N
|
|
From: Julian S. <js...@ac...> - 2005-03-17 17:51:17
|
On Thursday 17 March 2005 17:24, Nicholas Nethercote wrote:
> On Thu, 17 Mar 2005, Jeremy Fitzhardinge wrote:
> >> I'd prefer not to ship 2.4.0 with this bug in, if we can resolve it
> >> relatively quickly. What else can I do to help you repro it?
> >
> > This doesn't strike me as a showstopper bug: it seems easy to work
> > around (you can just ^C the process, yes?), it doesn't crash and it
> > doesn't seem to be affecting many people.
>
> Does it preclude leak checking OOo?
No, you can do control-C and then it moves on to leak checking.
Still, it's a bit disconcerting.
J
|
|
From: Julian S. <js...@ac...> - 2005-03-18 01:11:54
|
> I wonder if this is another instance of a strange vendor-special
> LinuxThreads/NPTL hybrid?
On reflection, I'm a little puzzled. I thought one of the aims of
your recent threading rework was to decouple V from the precise
behaviour of the threading libraries. So long as they use sys_clone
in a way we can cope with, we are happy to let the threading library
do whatever it wants, right? But the implication of this comment
is that in fact we do have to be concerned whether we're running
NPTL, LinuxThreads or some hybrid. Ummmmm ....
J
|
|
From: Jeremy F. <je...@go...> - 2005-03-18 07:37:39
|
Julian Seward wrote:
>On reflection, I'm a little puzzled. I thought one of the aims of
>your recent threading rework was to decouple V from the precise
>behaviour of the threading libraries. So long as they use sys_clone
>in a way we can cope with, we are happy to let the threading library
>do whatever it wants, right? But the implication of this comment
>is that in fact we do have to be concerned whether we're running
>NPTL, LinuxThreads or some hybrid. Ummmmm ....
>
In general that's true, but exit and exit_group always need to be
emulated, and I'm worried about non-standard behaviour in exit_group,
particularly. Though it's hard to tell, your trace seems to suggest that
it expects exit_group to kill the whole process, even though the
threads are not in the same thread group (they weren't created with
CLONE_THREAD). Valgrind, naturally, emulates standard 2.6 behaviour of
exit_group.
It may be moot because Alex Ivershen's report pointed out a real bug in
the handling of sigsuspend, which may have some bearing on your bug
(though I can't convince myself of that based on the trace). I'm about
to check in the fix for that one, so give it a go.
J
|
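Whether a new thread joins its creator's thread group is decided by the CLONE_THREAD bit in the flags passed to clone(). The sketch below decodes a flags word using the constant values from Linux's <linux/sched.h>; the two sample flag words and the exit-signal number 33 are hypothetical values chosen for illustration, not figures read from the trace.

```python
# Flag values as defined in Linux's <linux/sched.h>.
CLONE_FLAGS = {
    0x00000100: "CLONE_VM",
    0x00000200: "CLONE_FS",
    0x00000400: "CLONE_FILES",
    0x00000800: "CLONE_SIGHAND",
    0x00010000: "CLONE_THREAD",
    0x00040000: "CLONE_SYSVSEM",
    0x00080000: "CLONE_SETTLS",
    0x00100000: "CLONE_PARENT_SETTID",
    0x00200000: "CLONE_CHILD_CLEARTID",
}
CSIGNAL = 0xFF  # low byte: signal sent to the parent when the child exits


def decode_clone_flags(flags):
    """Return (flag names, exit signal number) for a clone() flags word."""
    names = [name for bit, name in CLONE_FLAGS.items() if flags & bit]
    return names, flags & CSIGNAL


def shares_thread_group(flags):
    """True if the new thread joins its creator's thread group."""
    return bool(flags & 0x00010000)  # CLONE_THREAD


# Hypothetical flag words, chosen for illustration (not taken from the trace).
linuxthreads_style = 0x00000F00 | 33   # VM|FS|FILES|SIGHAND plus an exit signal
nptl_style = 0x003D0F00                # adds CLONE_THREAD, TLS and TID handling

print(shares_thread_group(linuxthreads_style))  # False
print(shares_thread_group(nptl_style))          # True
```

Applying the same test to the sys_clone lines in a full trace would show directly which style of thread creation the program is using.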
|
From: Julian S. <js...@ac...> - 2005-03-18 14:11:00
|
> In general that's true, but exit and exit_group always need to be
> emulated, and I'm worried about non-standard behaviour in exit_group,
> particularly. Though it's hard to tell, your trace seems to suggest that
> it expects exit_group to kill the whole process, even though the
> threads are not in the same thread group (they weren't created with
> CLONE_THREAD). Valgrind, naturally, emulates standard 2.6 behaviour of
> exit_group.
The kernel in question is a 2.6.5 variant from SuSE. I've mailed to
ask if they have mutantified it re exit_group.
> It may be moot because Alex Ivershen's report pointed out a real bug in
> the handling of sigsuspend, which may have some bearing on your bug
> (though I can't convince myself of that based on the trace). I'm about
> to check in the fix for that one, so give it a go.
Tried it, but unfortunately still hangs.
J
|
|
From: Jeremy F. <je...@go...> - 2005-03-18 16:24:09
|
Julian Seward wrote:
>>In general that's true, but exit and exit_group always need to be
>>emulated, and I'm worried about non-standard behaviour in exit_group,
>>particularly. Though it's hard to tell, your trace seems to suggest that
>>it expects exit_group to kill the whole process, even though the
>>threads are not in the same thread group (they weren't created with
>>CLONE_THREAD). Valgrind, naturally, emulates standard 2.6 behaviour of
>>exit_group.
>>
>>
>
>The kernel in question is a 2.6.5 variant from SuSE. I've mailed to
>ask if they have mutantified it re exit_group.
>
>
I just installed a SuSE 9.1 VMWare machine, and it seems to be a proper
TLS/NPTL system. Is there something broken about your installation?
I can reproduce the hang if I start OOo with LD_ASSUME_KERNEL=2.4.1 to
disable NPTL, but otherwise it works fine.
J
|
|
From: Julian S. <js...@ac...> - 2005-03-18 16:33:55
|
> I just installed a SuSE 9.1 VMWare machine, and it seems to be a proper
> TLS/NPTL system. Is there something broken about your installation?
Not as far as I know. I haven't messed with it. This is really
extremely strange. This machine is a PIII, not a P4; would that
influence which thread library applied by default?
> I can reproduce the hang if I start OOo with LD_ASSUME_KERNEL=2.4.1 to
> disable NPTL, but otherwise it works fine.
Well, in a way, good.
Here's some info:
sewardj@phoenix:~$ ls -l /bin/ls
-rwxr-xr-x 1 root root 90616 2004-04-06 02:58 /bin/ls*
sewardj@phoenix:~$ md5sum /bin/ls
6e4af824bd787c3ea76f78e20a76a7fe /bin/ls
sewardj@phoenix:~$ ldd /bin/ls
linux-gate.so.1 => (0xffffe000)
librt.so.1 => /lib/librt.so.1 (0x40033000)
libacl.so.1 => /lib/libacl.so.1 (0x40045000)
libselinux.so.1 => /lib/libselinux.so.1 (0x4004b000)
libc.so.6 => /lib/libc.so.6 (0x40059000)
libpthread.so.0 => /lib/libpthread.so.0 (0x4016e000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
libattr.so.1 => /lib/libattr.so.1 (0x401c1000)
sewardj@phoenix:~$ env | grep ASSUME
sewardj@phoenix:~$ ls /lib/tls
ls: /lib/tls: No such file or directory
J
|
|
From: Jeremy F. <je...@go...> - 2005-03-18 16:38:32
|
Julian Seward wrote:
>>I just installed a SuSE 9.1 VMWare machine, and it seems to be a proper
>>TLS/NPTL system. Is there something broken about your installation?
>>
>>
>
>Not as far as I know. I haven't messed with it. This is really
>extremely strange. This machine is a PIII, not a P4; would that
>influence which thread library applied by default?
>
>
>
>
>>I can reproduce the hang if I start OOo with LD_ASSUME_KERNEL=2.4.1 to
>>disable NPTL, but otherwise it works fine.
>>
>>
>
>Well, in a way, good.
>
>Here's some info:
>
>sewardj@phoenix:~$ ls -l /bin/ls
>-rwxr-xr-x 1 root root 90616 2004-04-06 02:58 /bin/ls*
>
>sewardj@phoenix:~$ md5sum /bin/ls
>6e4af824bd787c3ea76f78e20a76a7fe /bin/ls
>
>
That matches what I have...
>sewardj@phoenix:~$ ldd /bin/ls
> linux-gate.so.1 => (0xffffe000)
> librt.so.1 => /lib/librt.so.1 (0x40033000)
> libacl.so.1 => /lib/libacl.so.1 (0x40045000)
> libselinux.so.1 => /lib/libselinux.so.1 (0x4004b000)
> libc.so.6 => /lib/libc.so.6 (0x40059000)
> libpthread.so.0 => /lib/libpthread.so.0 (0x4016e000)
> /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
> libattr.so.1 => /lib/libattr.so.1 (0x401c1000)
>
>sewardj@phoenix:~$ env | grep ASSUME
>
>sewardj@phoenix:~$ ls /lib/tls
>ls: /lib/tls: No such file or directory
>
>
...but I do have this. That's in the glibc package; try "rpm -V glibc".
J
|
|
From: Julian S. <js...@ac...> - 2005-03-18 16:48:28
|
I'm wondering if what I have is strange as a result of upgrading
a SuSE 9.0 install to 9.1, rather than nuking the disk and doing
a clean 9.1 install. Truth is I can't remember.
> ...but I do have this. That's in the glibc package; try "rpm -V glibc".
sewardj@phoenix:~$ rpm -qa | grep libc
glibc-i18ndata-2.3.3-98
libcap-1.92-479
glibc-devel-2.3.3-98
glibc-2.3.3-98
glibc-locale-2.3.3-98
glibc-info-2.3.3-98
glibc-html-2.3.3-98
libchipcard-0.9.1-203
sewardj@phoenix:~$ rpm --query -l glibc-2.3.3-98
/etc/bindresvport.blacklist
/etc/default/nss
/etc/ld.so.conf
/etc/nsswitch.conf
/etc/rpc
/lib/ld-2.3.3.so
/lib/ld-linux.so.2
/lib/libBrokenLocale.so.1
/lib/libNoVersion.so.1
/lib/libSegFault.so
/lib/libanl.so.1
/lib/libc.so.6
/lib/libcidn.so.1
/lib/libcrypt.so.1
/lib/libdl.so.2
/lib/libm.so.6
/lib/libmemusage.so
/lib/libnsl.so.1
/lib/libnss_compat.so.2
/lib/libnss_dns.so.2
/lib/libnss_files.so.2
/lib/libnss_hesiod.so.2
/lib/libnss_nis.so.2
/lib/libnss_nisplus.so.2
/lib/libpcprofile.so
/lib/libpthread.so.0
/lib/libresolv.so.2
/lib/librt.so.1
/lib/libthread_db.so.1
/lib/libutil.so.1
/sbin/ldconfig
/usr/bin/gencat
/usr/bin/getconf
/usr/bin/getent
/usr/bin/iconv
/usr/bin/ldd
/usr/bin/lddlibc4
/usr/bin/locale
/usr/bin/localedef
/usr/lib/pt_chown
/usr/sbin/iconvconfig
/usr/sbin/rpcinfo
/usr/share/doc/packages/glibc
/usr/share/doc/packages/glibc/LICENSES
/usr/share/man/man1/getconf.1.gz
/usr/share/man/man1/getent.1.gz
/usr/share/man/man1/iconv.1.gz
/usr/share/man/man1/locale.1.gz
/usr/share/man/man1/localedef.1.gz
/usr/share/man/man5/locale.alias.5.gz
/usr/share/man/man8/rpcinfo.8.gz
Ain't no tls there!
J
|
|
From: Jeremy F. <je...@go...> - 2005-03-18 16:58:49
|
Julian Seward wrote:
>sewardj@phoenix:~$ rpm --query -l glibc-2.3.3-98
>
>
I have glibc-2.3.3-97 installed (build date 5 Apr 2004); it contains
/lib/tls directories. What does "rpm -qi glibc" say? What does "rpm
-V glibc" say? Maybe they split tls into a separate glibc package?
I get:
$ rpm -qi glibc
Name        : glibc                             Relocations: (not relocatable)
Version     : 2.3.3                             Vendor: SuSE Linux AG, Nuernberg, Germany
Release     : 97                                Build Date: Mon 05 Apr 2004 08:35:27 AM PDT
Install date: Thu 17 Mar 2005 03:52:18 PM PST   Build Host: zert200.suse.de
Group       : System/Libraries                  Source RPM: glibc-2.3.3-97.src.rpm
Size        : 6772421                           License: GPL, LGPL
Signature   : DSA/SHA1, Mon 05 Apr 2004 08:41:05 AM PDT, Key ID a84edae89c800aca
Packager    : http://www.suse.de/feedback
URL         : http://www.gnu.org/software/libc/libc.html
Summary     : The standard shared libraries (from the GNU C Library)
Description :
The GNU C Library provides the most important standard libraries used
by nearly all programs: the stndard C library, the standard math
library and the POSIX thread library. Without these libraries, the
system is not functional.
Distribution: SuSE Linux 9.1 (i686)
J
|
|
From: Julian S. <js...@ac...> - 2005-03-18 21:59:24
|
> I have glibc-2.3.3-97 installed (build date 5 Apr 2004); it contains
> /lib/tls directories. What does "rpm -qi glibc" say? What does "rpm
> -V glibc" say? Maybe they split tls into a separate glibc package?
sewardj@phoenix:~$ rpm -qi glibc
Name        : glibc                             Relocations: (not relocatable)
Version     : 2.3.3                             Vendor: SuSE Linux AG, Nuernberg, Germany
Release     : 98                                Build Date: Tue 06 Apr 2004 01:26:15 BST
Install date: Tue 22 Jun 2004 23:25:00 BST      Build Host: frobenius.suse.de
Group       : System/Libraries                  Source RPM: glibc-2.3.3-98.src.rpm
Size        : 3496068                           License: GPL, LGPL
Signature   : DSA/SHA1, Tue 06 Apr 2004 01:32:16 BST, Key ID a84edae89c800aca
Packager    : http://www.suse.de/feedback
URL         : http://www.gnu.org/software/libc/libc.html
Summary     : The standard shared libraries (from the GNU C Library)
Description :
The GNU C Library provides the most important standard libraries used
by nearly all programs: the stndard C library, the standard math
library and the POSIX thread library. Without these libraries, the
system is not functional.
Distribution: SuSE Linux 9.1 (i586)
sewardj@phoenix:~$ rpm -V glibc
(viz, nothing)
This SuSE 9.1 installation discrepancy is all very strange, but
I'm not sure it's relevant, if you can repro the hang using
LD_ASSUME_KERNEL. That seems to say there's a LinuxThreads issue.
I have a straight LinuxThreads system here (Red Hat 7.3) which I'll
see if I can repro the hang on.
J
|
|
From: Dirk M. <dm...@gm...> - 2005-03-18 22:29:35
|
On Friday 18 March 2005 19:10, Julian Seward wrote:
> Distribution: SuSE Linux 9.1 (i586)
You're using the non-TLS version of glibc. There are two glibc packages on
the SuSE CDs, one in the i686 subdirectory (TLS support enabled) and one in
the i586 directory (without TLS). They conflict with each other; you can
only have one of them installed at a time.
Dirk
|
|
From: Jeremy F. <je...@go...> - 2005-03-18 22:57:18
|
Julian Seward wrote:
>I have a straight LinuxThreads system here (Red Hat 7.3) which I'll
>see if I can repro the hang on.
>
>
Tom Truscott mentioned getting some Valgrind assertion failure with an
"obsolete pthreads", but I haven't seen any more detail about this. It
could be related; it would be interesting to see if RH7.3 has problems.
J
|