From: David S. M. <da...@da...> - 2004-10-08 04:54:06
fcntl16.c (and probably a bunch of other places in the LTP testsuite) uses the following scheme:

    static int alarm_flag;

    static void catch_alarm(int signo)
    {
            alarm_flag = 1;
    }

            signal(SIGALRM, catch_alarm);
            ...
            alarm(TIME_OUT);
            wait(&status);

Just a quick scan shows that, among the fcntl tests alone, this technique is used in fcntl14.c, fcntl15.c, fcntl16.c, and fcntl17.c.

The expectation is that if no wait() events occur, the SIGALRM will cause the wait() to return -1 with errno set to EINTR after the signal handler catch_alarm() runs.

This never happens, because signal() gives BSD signal semantics, and part of that is that the SA_RESTART flag is passed into the actual system call (usually rt_sigaction) used to register the signal handler. SA_RESTART means that the system call is restarted, not returned from, when the signal handler returns.

This is most easily seen with the following test program:

    #include <sys/types.h>
    #include <sys/wait.h>   /* wait() */
    #include <stdio.h>      /* printf() */
    #include <stdlib.h>     /* exit() */
    #include <unistd.h>
    #include <signal.h>

    static volatile int alarm_flag;

    void catch_alarm(int signo)
    {
            /* Debug output only, so the handler shows up in strace. */
            printf("Got SIGALRM\n");
            alarm_flag = 1;
    }

    void child(void)
    {
            /* Never exits on its own, so the parent's wait() blocks. */
            pause();
    }

    int main(void)
    {
            int status, pid;

            pid = fork();
            if (pid == 0)
                    child();

            signal(SIGALRM, catch_alarm);
            alarm(5);
            wait(&status);

            kill(pid, SIGKILL);
            exit(0);
    }

Run it through strace and you'll see something like the following:

    fork() = 17293

We fork the child.

    rt_sigaction(SIGALRM, {0x100960, [ALRM], SA_RESTART}, {SIG_DFL}, 0xfffff800001612f8, 18446744073709551615) = 0

Register the SIGALRM signal handler with the SA_RESTART flag.

    alarm(5) = 0

Request the alarm().

    wait4(-1, 0x7fffffffabc, 0, NULL) = ? ERESTARTSYS (To be restarted)

Wait for the child.

    --- SIGALRM (Alarm clock) @ 0 (0) ---
    fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffff80000020000
    write(1, "Got SIGALRM\n", 12Got SIGALRM
    ) = 12
    rt_sigreturn(0xe) = -1 E??? (errno -1)

SIGALRM arrives, we print the debug message, and we return from the signal handler.

    wait4(-1, <unfinished ...>

And wait4() is restarted. The test program will hang from this point forward.

In order to fix this, test cases using this technique in LTP will need to explicitly call sigaction() with the appropriate flags. In particular, they will need to leave the SA_RESTART flag clear in such calls.

What's really fascinating to me is that this test is passing for somebody out there. I can't believe I'm the only person hitting this :-)

I ran my little test program above on both x86 and sparc64, just in case it was some difference in glibc's signal() implementation, but both cases pass the SA_RESTART flag in, and both cases restart the wait4() call after the signal handler runs, so the program hangs forever afterwards.