From: Bodo S. <bst...@fu...> - 2004-11-09 16:48:26
Blaisorblade wrote:
> On Friday 29 October 2004 11:26, Roland Kaeser wrote:
> This is executed by a clone() child. What happens is that, with NPTL, both the
> father and the son have the same pid, so the SIGSTOP is routed to the wrong
> thread. However, this is not expected: having the same pid should be reserved
> to when clone is called with CLONE_THREAD in the flags. I've verified that
> this is not happening in this case, even with strace (to make sure glibc is
> not playing any dirty tricks). But for some reasons, the kernel is behaving
> as if this happened.

Sorry, I have to disagree. The threads don't have the same pid! Only the
getpid() call into the lib returns the pid of the father.

I wrote a small test program (attached). Please compile it with:

  gcc -static -o test_getpid_static test_getpid.c

and

  gcc -o test_getpid_nptl test_getpid.c

Using the two programs, you can see the following:

1) With the test linked with -static, each getpid() in the test results
   in a syscall (try "strace test_getpid_static").

2) Linked without -static (I assume this means using NPTL), only the
   first getpid() does a syscall. I guess the further calls deliver a
   pid value buffered in the lib (try "strace test_getpid_nptl").
   The test here requests and prints its pid twice, then it exits. But
   strace will show you two getpid() syscalls only for the _static
   program.

3) The pid history seen in 1 and 2 is used even for a child created with
   clone(), regardless of which clone flags are used!
   Try "test_getpid_static clone" and "test_getpid_nptl clone" to see
   what happens. After printing its pid twice, the program now creates a
   child via clone(). The child requests its *real* pid via a "by-hand
   syscall". Then it stops itself and is ptraced by the father, which
   prints a message if the child does a *real* getpid() syscall.

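For illustration, the "by-hand syscall" could look roughly like the code
below. This is only a minimal sketch, not the attached test_getpid.c: the
names raw_getpid(), child_fn() and clone_child_reports_real_pid() are made
up here, and the child is created with SIGCHLD as its only clone flag. The
child asks the kernel for its pid directly and sends it back through a
pipe, so the father can compare it with the value clone() returned.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel directly, bypassing any pid
 * value that the C library may have buffered. */
static pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/* Argument block handed to the clone() child (illustrative). */
struct child_args {
    int write_fd;   /* write end of the pipe back to the father */
};

/* Runs in the clone() child: report the real pid to the father. */
static int child_fn(void *arg)
{
    struct child_args *a = arg;
    pid_t real = raw_getpid();

    if (write(a->write_fd, &real, sizeof real) != (ssize_t)sizeof real)
        return 1;
    return 0;
}

/* Returns 1 if the pid the child obtained by raw syscall equals the
 * value clone() returned to the father, 0 if they differ, and -1 if
 * the setup (pipe, stack, clone, read) failed. */
static int clone_child_reports_real_pid(void)
{
    enum { STACK_SIZE = 64 * 1024 };
    struct child_args args;
    int fds[2];
    int status;
    char *stack;
    pid_t child;
    pid_t reported = 0;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    args.write_fd = fds[1];

    /* No CLONE_THREAD and no CLONE_VM: the child is a separate process
     * with its own pid. The stack grows downwards on the common
     * architectures, so the top of the buffer is passed. */
    child = clone(child_fn, stack + STACK_SIZE, SIGCHLD, &args);
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        free(stack);
        return -1;
    }

    close(fds[1]);
    n = read(fds[0], &reported, sizeof reported);
    close(fds[0]);

    waitpid(child, &status, 0);
    free(stack);

    if (n != (ssize_t)sizeof reported)
        return -1;
    return reported == child;
}
```

Calling clone_child_reports_real_pid() from main() and printing the result
is enough to try it; replacing raw_getpid() with getpid() in child_fn()
turns it into a check of what the lib hands out in the child.
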
Note: If you remove the two getpid() calls at the beginning of main(),
the child will work correctly even with NPTL, since there isn't a
buffered pid yet.

Summary: I guess the behavior of NPTL is a bug. Do you agree? To which
list should a bug report be mailed?

As a workaround, we could use the by-hand syscall for os_getpid().
I didn't test it, but I'm quite sure that it fixes the problem.

Bodo