|
From: Alexander P. <gl...@go...> - 2013-08-07 16:08:00
Attachments:
test.c
|
Hi Valgrind devs, Jeffrey (CCed) has come up with a test (attached) where a pointer to an allocated memory chunk is only reachable from a general-purpose register of a detached thread (see the discussion at https://code.google.com/p/chromium/issues/detail?id=269278) That thread is live at the moment the main thread exits (so it's hard to call this situation a leak), but after that Valgrind sets all threads' statuses to VgTs_Empty therefore removing the general-purpose registers from the root set in the leak checking process. As a result, the leak is reported: $ gcc -pthread test.c -O2 -o test -g3 -Wall && ~/src/chromium/src/third_party/valgrind/linux_x64/bin/valgrind --tool=memcheck --gen-suppressions=all --demangle=no --leak-check=yes --show-possibly-lost=no ./test ==24619== Command: ./test ==24619== n Abandoning the helper. ==24619== ==24619== HEAP SUMMARY: ==24619== in use at exit: 276 bytes in 2 blocks ==24619== total heap usage: 2 allocs, 0 frees, 276 bytes allocated ==24619== ==24619== 4 bytes in 1 blocks are definitely lost in loss record 1 of 2 ==24619== at 0x4C2E900: malloc (vg_replace_malloc.c:270) ==24619== by 0x400A92: helper (test.c:12) ==24619== by 0x4E3FE99: start_thread (pthread_create.c:308) ==24619== Is this a correct approach to handling the background threads? If not, is there an easy way to fix the problem? It turns out that the state of the GP registers of the detached thread is preserved after it has been killed (at least in this particular case), so we could probably introduce a new thread status for detached threads that outlived the main thread and consider them valid in VG_(is_valid_tid). WBR, Alexander Potapenko Software Engineer Google Moscow |
|
From: Alexander P. <gl...@go...> - 2013-08-07 18:51:39
|
On Aug 7, 2013 8:07 PM, "Alexander Potapenko" <gl...@go...> wrote: > > Hi Valgrind devs, > > Jeffrey (CCed) has come up with a test (attached) where a pointer to > an allocated memory chunk is only reachable from a general-purpose > register of a detached thread (see the discussion at > https://code.google.com/p/chromium/issues/detail?id=269278) > That thread is live at the moment the main thread exits (so it's hard > to call this situation a leak), but after that Valgrind sets all > threads' statuses to VgTs_Empty therefore removing the general-purpose > registers from the root set in the leak checking process. As a result, > the leak is reported: > > $ gcc -pthread test.c -O2 -o test -g3 -Wall && > ~/src/chromium/src/third_party/valgrind/linux_x64/bin/valgrind > --tool=memcheck --gen-suppressions=all --demangle=no --leak-check=yes > --show-possibly-lost=no ./test > > ==24619== Command: ./test > ==24619== > n > Abandoning the helper. > ==24619== > ==24619== HEAP SUMMARY: > ==24619== in use at exit: 276 bytes in 2 blocks > ==24619== total heap usage: 2 allocs, 0 frees, 276 bytes allocated > ==24619== > ==24619== 4 bytes in 1 blocks are definitely lost in loss record 1 of 2 > ==24619== at 0x4C2E900: malloc (vg_replace_malloc.c:270) > ==24619== by 0x400A92: helper (test.c:12) > ==24619== by 0x4E3FE99: start_thread (pthread_create.c:308) > ==24619== > > > Is this a correct approach to handling the background threads? If not, > is there an easy way to fix the problem? > It turns out that the state of the GP registers of the detached thread > is preserved after it has been killed (at least in this particular > case), so we could probably introduce a new thread status for detached > threads that outlived the main thread and consider them valid in > VG_(is_valid_tid). > > Additional comment from Jeffrey: """ Note that the same leak report happens if you comment out the pthread_detach() call. If possible, it would be very nice to consider non-detached threads valid too, so that we don't need to add code to detach threads when the main thread wants to exit early. """ |
|
From: Philippe W. <phi...@sk...> - 2013-08-07 23:59:53
|
On Wed, 2013-08-07 at 22:51 +0400, Alexander Potapenko wrote: > > > > Is this a correct approach to handling the background threads? If > not, > > is there an easy way to fix the problem? > > It turns out that the state of the GP registers of the detached > thread > > is preserved after it has been killed (at least in this particular > > case), so we could probably introduce a new thread status for > detached > > threads that outlived the main thread and consider them valid in > > VG_(is_valid_tid). > > > > > Additional comment from Jeffrey: > """ > Note that the same leak report happens if you comment out the > pthread_detach() call. If possible, it would be very nice to consider > non-detached threads valid too, so that we don't need to add code to > detach threads when the main thread wants to exit early. > """ Valgrind intercepts the exit_group system call (_NR_exit_group), and does not call this system call. Instead, it loops over all alive threads of the thread group of the caller thread, and instructs these threads to terminate. Each such instructed thread verifies if it is not the last surviving thread of the process. If not the last surviving thread, it cleans up (e.g. sets its state to empty) and terminates itself by calling the _NR_exit syscall. The last surviving thread will first do various shutdown actions (in particular do the leak check) and then will exit the process. In other words, with the valgrind implementation of _NR_exit_group, the background threads are terminated when the leak check is done. Maybe what could be done is something like: Valgrind exit_group interception could determine if the exit_group will terminate the whole process (by checking if there are threads of another group that will continue to run). If it determines the exit_group will terminate the process, it could do the shutdown actions directly, and then really call the exit_group syscall. Or maybe we better keep the current technique of having each thread terminating itself, but having a new state "thread_group_exiting" that would be cleaned up by the last thread of the group I am however a little bit afraid of the possible race conditions in this area. Thread termination code seems tricky see e.g. https://bugs.kde.org/show_bug.cgi?id=226116 There are very probably a bunch of other tricky things to look at. E.g. tools can track that threads are exiting. With the above change, either tool will not be informed anymore that the threads of the group are exiting, or at best will be informed but not in the context of the thread terminating. For what concerns the problem of what is the "definition of leak": main() { char *p = malloc (10); exit(1); } When main calls exit, *p is still reachable. Is that to be considered as a definite leak or as still reachable ? There is no real big difference between main calling exit and a background thread being "terminated" because main calls exit/exit_group : in both cases, the main thread and the background thread conceptually still have access to their local vars when they die. Philippe |
|
From: Alexander P. <gl...@go...> - 2013-08-13 09:34:57
|
(+timurrrr) For the record, another idea is to perform the leak checking when the program is being terminated by an exit() call or right after the return from main(). If I insert VALGRIND_DO_LEAK_CHECK right before "return 0" in the above example, no leaks are reported, because the detached threads are still live. Perhaps we shouldn't shut them down before checking for leaks? On Thu, Aug 8, 2013 at 4:00 AM, Philippe Waroquiers <phi...@sk...> wrote: > On Wed, 2013-08-07 at 22:51 +0400, Alexander Potapenko wrote: > >> > >> > Is this a correct approach to handling the background threads? If >> not, >> > is there an easy way to fix the problem? >> > It turns out that the state of the GP registers of the detached >> thread >> > is preserved after it has been killed (at least in this particular >> > case), so we could probably introduce a new thread status for >> detached >> > threads that outlived the main thread and consider them valid in >> > VG_(is_valid_tid). >> > >> > >> Additional comment from Jeffrey: >> """ >> Note that the same leak report happens if you comment out the >> pthread_detach() call. If possible, it would be very nice to consider >> non-detached threads valid too, so that we don't need to add code to >> detach threads when the main thread wants to exit early. >> """ > > Valgrind intercepts the exit_group system call (_NR_exit_group), > and does not call this system call. > Instead, it loops over all alive threads of the thread group > of the caller thread, and instructs these threads to terminate. > > Each such instructed thread verifies if it is not the last surviving > thread of the process. If not the last surviving thread, it > cleans up (e.g. sets its state to empty) and > terminates itself by calling the _NR_exit syscall. > > The last surviving thread will first do various shutdown > actions (in particular do the leak check) and then will exit the > process. > > In other words, with the valgrind implementation of _NR_exit_group, > the background threads are terminated when the leak check is done. > > Maybe what could be done is something like: > Valgrind exit_group interception could determine if the exit_group > will terminate the whole process (by checking if there are threads > of another group that will continue to run). > If it determines the exit_group will terminate the process, > it could do the shutdown actions directly, > and then really call the exit_group syscall. > Or maybe we better keep the current technique of having each > thread terminating itself, but having a new state "thread_group_exiting" > that would be cleaned up by the last thread of the group > > I am however a little bit afraid of the possible race conditions > in this area. Thread termination code seems tricky see > e.g. https://bugs.kde.org/show_bug.cgi?id=226116 > There are very probably a bunch of other tricky things to look at. > E.g. tools can track that threads are exiting. > With the above change, either tool will not be informed anymore > that the threads of the group are exiting, or at best will > be informed but not in the context of the thread terminating. > > > For what concerns the problem of what is the "definition of leak": > main() > { > char *p = malloc (10); > exit(1); > } > When main calls exit, *p is still reachable. Is that to be considered > as a definite leak or as still reachable ? > There is no real big difference between main calling exit and a > background thread being "terminated" because main calls > exit/exit_group : in both cases, the main thread and the background > thread conceptually still have access to their local vars when > they die. > > Philippe > > -- Alexander Potapenko Software Engineer Google Moscow |
|
From: Philippe W. <phi...@sk...> - 2013-08-16 15:06:51
Attachments:
patch_reach_thread_register.txt
|
On Tue, 2013-08-13 at 13:34 +0400, Alexander Potapenko wrote:
> (+timurrrr)
> For the record, another idea is to perform the leak checking when the
> program is being terminated by an exit() call or right after the
> return from main().
> If I insert VALGRIND_DO_LEAK_CHECK right before "return 0" in the
> above example, no leaks are reported, because the detached threads are
> still live.
> Perhaps we shouldn't shut them down before checking for leaks?
It is better to have only one thread remaining when doing
the leak search to e.g. avoid problems when running
the libc free resource (see valgrind option
--run-libc-freeres=no|yes).
The attached patch solves the problem by having the exiting thread
marking (using VgSrc_ExitProcess) the other threads which must
terminate due to the sys_exit_group syscall.
The exitreason VgSrc_ExitProcess is then used to detect that
the registers of an empty thread still have to be used for leak
search.
Patch has been regression tested on linux x86, amd64 and ppc64.
Assuming the approach in the patch is deemed ok, there are still a few
additional points to cleanup e.g.
the darwin equivalent code should be done
(would be nice to have an access to a darwin system for that)
some obsolete comments about VgSrc_ExitProcess
confirming (or not) that a process can have threads of multiple
thread group, and updating the code accordingly
Philippe
|
|
From: Timur I. <tim...@go...> - 2013-09-23 13:58:26
|
Has this landed anywhere? 2013/8/16 Philippe Waroquiers <phi...@sk...> > On Tue, 2013-08-13 at 13:34 +0400, Alexander Potapenko wrote: > > (+timurrrr) > > For the record, another idea is to perform the leak checking when the > > program is being terminated by an exit() call or right after the > > return from main(). > > If I insert VALGRIND_DO_LEAK_CHECK right before "return 0" in the > > above example, no leaks are reported, because the detached threads are > > still live. > > Perhaps we shouldn't shut them down before checking for leaks? > > It is better to have only one thread remaining when doing > the leak search to e.g. avoid problems when running > the libc free resource (see valgrind option > --run-libc-freeres=no|yes). > > The attached patch solves the problem by having the exiting thread > marking (using VgSrc_ExitProcess) the other threads which must > terminate due to the sys_exit_group syscall. > The exitreason VgSrc_ExitProcess is then used to detect that > the registers of an empty thread still have to be used for leak > search. > Patch has been regression tested on linux x86, amd64 and ppc64. > > Assuming the approach in the patch is deemed ok, there are still a few > additional points to cleanup e.g. > the darwin equivalent code should be done > (would be nice to have an access to a darwin system for that) > some obsolete comments about VgSrc_ExitProcess > confirming (or not) that a process can have threads of multiple > thread group, and updating the code accordingly > > Philippe > > |
|
From: Philippe W. <phi...@sk...> - 2013-09-23 18:53:04
|
On Mon, 2013-09-23 at 17:57 +0400, Timur Iskhodzhanov wrote: > Has this landed anywhere? No (not yet). The patch as it stands was discussed and found reasonable. However, it was deemed necessary to do a little bit more investigations about the absence of deadlocks e.g. when multiple threads try to exit in parallel. If time permits (and the resulting analysis shows absence of deadlock) it might land in 3.9.0. Philippe > > > 2013/8/16 Philippe Waroquiers <phi...@sk...> > On Tue, 2013-08-13 at 13:34 +0400, Alexander Potapenko wrote: > > (+timurrrr) > > For the record, another idea is to perform the leak checking > when the > > program is being terminated by an exit() call or right after > the > > return from main(). > > If I insert VALGRIND_DO_LEAK_CHECK right before "return 0" > in the > > above example, no leaks are reported, because the detached > threads are > > still live. > > Perhaps we shouldn't shut them down before checking for > leaks? > > > It is better to have only one thread remaining when doing > the leak search to e.g. avoid problems when running > the libc free resource (see valgrind option > --run-libc-freeres=no|yes). > > The attached patch solves the problem by having the exiting > thread > marking (using VgSrc_ExitProcess) the other threads which must > terminate due to the sys_exit_group syscall. > The exitreason VgSrc_ExitProcess is then used to detect that > the registers of an empty thread still have to be used for > leak > search. > Patch has been regression tested on linux x86, amd64 and > ppc64. > > Assuming the approach in the patch is deemed ok, there are > still a few > additional points to cleanup e.g. > the darwin equivalent code should be done > (would be nice to have an access to a darwin system for > that) > some obsolete comments about VgSrc_ExitProcess > confirming (or not) that a process can have threads of > multiple > thread group, and updating the code accordingly > > Philippe > > > |
|
From: Philippe W. <phi...@sk...> - 2013-10-21 20:00:12
|
On Mon, 2013-09-23 at 17:57 +0400, Timur Iskhodzhanov wrote: > Has this landed anywhere? Fix has been committed in revision 13670. Thanks for the (small) test case. Philippe |