From: Arseny S. <aso...@gm...> - 2012-08-20 19:07:02
Hello Valgrind hackers!
I've managed to make Valgrind run out of stack w/ the following reduced testcase
which calls siglongjmp() inside the signal handler. However, I'm not filing a
bug because I'm not 100 % sure whether the code is fully legitimate in regards
to POSIX signals processing and non-local jumps.
The example runs well in the absence of either Valgrind or GDB instrumentation.
I however accept that this positive effect may solely come from the example's
actual simplicity and that I miss something fundamental about stack management
inside the processes. I also recognize that jumping out of the signal handler is
not something one should consider for everyday use.
The manual[1] says that invoking siglongjmp() from a nested signal handler leads
to undefined behaviour. However, in the following example this is not the case
(probably unless the next signal is received when the control flow is still
inside the signal handler invoked for the previous instance of that signal).
All in all, the issue seems to be connected to the fundamentals of POSIX, and
the misconception here is probably very obvious to experts. But I still haven't
found a definitive "don't do it this way because…" explanation, and it is also
not covered in the Valgrind FAQ.
% cat longjmp-test.c
#include <setjmp.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>

sigjmp_buf env;

static void
sighandler(int _)
{
    static uint32_t actionid = 0;
    (void)_;
    printf("%s: Action ID: %u\n", __func__, ++actionid);
    siglongjmp(env, actionid);
}

static void
action1(void)
{
    uint32_t cnt1 = 0;
    int a1a[10240];
    a1a[0] = a1a[232] = 3, a1a[10239] = 62;
    printf("%s: %u\n", __func__, cnt1);
}

int
main(void)
{
    int ret;
    struct sigaction sigusr1;
    memset(&sigusr1, 0, sizeof(sigusr1));
    sigusr1.sa_flags = SA_RESTART;
    sigaction(SIGUSR1, NULL, &sigusr1);
    sigusr1.sa_handler = sighandler;
    sigaction(SIGUSR1, &sigusr1, NULL);
    for (;;) {
        ret = sigsetjmp(env, 1);
        switch (ret % 2) {
        case 0:
            pause();
            break;
        case 1:
            action1(), ret = 0;
            break;
        default:
            return 8;
        }
    }
    return EXIT_SUCCESS;
}
% gcc -ggdb3 -o longjmp-test longjmp-test.c
% valgrind ./longjmp-test
From another virtual terminal:
% while true; do kill -USR1 #PID#; done
Reproducible at least w/ Valgrind 3.7.0 and 3.8.0, glibc 2.15 under x86_64
GNU/Linux w/o multilib (but this info is probably irrelevant).
[1] http://pubs.opengroup.org/onlinepubs/7908799/xsh/siglongjmp.html
From: John R. <jr...@bi...> - 2012-08-21 02:52:55
On 08/20/2012 12:06 PM, Arseny Solokha wrote:
> I've managed to make Valgrind run out of stack w/ the following reduced testcase
> which calls siglongjmp() inside the signal handler. However, I'm not filing a
> bug because I'm not 100 % sure whether the code is fully legitimate in regards
> to POSIX signals processing and non-local jumps.

[snip]

> struct sigaction sigusr1;
> memset(&sigusr1, 0, sizeof(sigusr1));
> sigusr1.sa_flags = SA_RESTART;
> sigaction(SIGUSR1, NULL, &sigusr1);
> sigusr1.sa_handler = sighandler;
> sigaction(SIGUSR1, &sigusr1, NULL);

That code says, "use sighandler with existing .sa_mask and .sa_flags, whatever
they are." You don't even print out the existing values for .sa_mask and
.sa_flags. This is highly non-portable and non-reproducible. The shell (or any
program which execve()s this one) can change the expected behavior. POSIX says
that default or ignored signals maintain their status; handled signals [that
is, ones that have a .sa_handler in the "old" address space] get reset to
default. So you have tossed us a grenade of unknown expectations. What are the
actual values of .sa_mask and .sa_flags?

After that, what are your considerations with respect to SA_ONSTACK? Why did
you choose not to use it [force it]? Whenever [non-]re-entrancy is a problem,
then the first rule is to separate physically the different logical uses.

Then there is the problem of races, particularly with preserving the separation
of user state from valgrind state. Valgrind probably must execute with
.sa_mask = ~0 in order to preserve valgrind's sanity; but then there must be
separate re-entrant storage to track the user's .sa_mask. This could get
complex quickly.

Please revise your program to print and track the value of the stack pointer,
and to determine the *first* time that the stack pointer does not have the
value that you expect. This will help everyone understand what happens.