|
From: <sv...@va...> - 2010-02-22 11:03:21
|
Author: sewardj
Date: 2010-02-22 11:03:10 +0000 (Mon, 22 Feb 2010)
New Revision: 11053
Log:
When creating a child thread, initially set its os_state.threadgroup
to have the same value as the parent. This avoids exit races leading
to hangs and strange behaviour in heavily multithreaded apps, in the
situation where threads are rapidly being created, and at the same
time an existing thread does sys_exit_group so as to terminate the
entire process. Thanks to Konstantin S for chasing this down to a
small test case. Fixes #226116.
Modified:
trunk/coregrind/m_syswrap/syswrap-amd64-linux.c
trunk/coregrind/m_syswrap/syswrap-arm-linux.c
trunk/coregrind/m_syswrap/syswrap-linux.c
trunk/coregrind/m_syswrap/syswrap-ppc32-linux.c
trunk/coregrind/m_syswrap/syswrap-ppc64-linux.c
trunk/coregrind/m_syswrap/syswrap-x86-linux.c
Modified: trunk/coregrind/m_syswrap/syswrap-amd64-linux.c
===================================================================
--- trunk/coregrind/m_syswrap/syswrap-amd64-linux.c 2010-02-21 14:52:59 UTC (rev 11052)
+++ trunk/coregrind/m_syswrap/syswrap-amd64-linux.c 2010-02-22 11:03:10 UTC (rev 11053)
@@ -251,6 +251,17 @@
ctst->sig_mask = ptst->sig_mask;
ctst->tmp_sig_mask = ptst->sig_mask;
+ /* Start the child with its threadgroup being the same as the
+ parent's. This is so that any exit_group calls that happen
+ after the child is created but before it sets its
+ os_state.threadgroup field for real (in thread_wrapper in
+ syswrap-linux.c), really kill the new thread. a.k.a this avoids
+ a race condition in which the thread is unkillable (via
+ exit_group) because its threadgroup is not set. The race window
+ is probably only a few hundred or a few thousand cycles long.
+ See #226116. */
+ ctst->os_state.threadgroup = ptst->os_state.threadgroup;
+
/* We don't really know where the client stack is, because its
allocated by the client. The best we can do is look at the
memory mappings and try to derive some useful information. We
Modified: trunk/coregrind/m_syswrap/syswrap-arm-linux.c
===================================================================
--- trunk/coregrind/m_syswrap/syswrap-arm-linux.c 2010-02-21 14:52:59 UTC (rev 11052)
+++ trunk/coregrind/m_syswrap/syswrap-arm-linux.c 2010-02-22 11:03:10 UTC (rev 11053)
@@ -201,6 +201,17 @@
ctst->sig_mask = ptst->sig_mask;
ctst->tmp_sig_mask = ptst->sig_mask;
+ /* Start the child with its threadgroup being the same as the
+ parent's. This is so that any exit_group calls that happen
+ after the child is created but before it sets its
+ os_state.threadgroup field for real (in thread_wrapper in
+ syswrap-linux.c), really kill the new thread. a.k.a this avoids
+ a race condition in which the thread is unkillable (via
+ exit_group) because its threadgroup is not set. The race window
+ is probably only a few hundred or a few thousand cycles long.
+ See #226116. */
+ ctst->os_state.threadgroup = ptst->os_state.threadgroup;
+
seg = VG_(am_find_nsegment)((Addr)sp);
if (seg && seg->kind != SkResvn) {
ctst->client_stack_highest_word = (Addr)VG_PGROUNDUP(sp);
Modified: trunk/coregrind/m_syswrap/syswrap-linux.c
===================================================================
--- trunk/coregrind/m_syswrap/syswrap-linux.c 2010-02-21 14:52:59 UTC (rev 11052)
+++ trunk/coregrind/m_syswrap/syswrap-linux.c 2010-02-22 11:03:10 UTC (rev 11053)
@@ -83,6 +83,9 @@
VG_TRACK(pre_thread_first_insn, tid);
tst->os_state.lwpid = VG_(gettid)();
+ /* Set the threadgroup for real. This overwrites the provisional
+ value set in do_clone() syswrap-*-linux.c. See comments in
+ do_clone for background, also #226116. */
tst->os_state.threadgroup = VG_(getpid)();
/* Thread created with all signals blocked; scheduler will set the
Modified: trunk/coregrind/m_syswrap/syswrap-ppc32-linux.c
===================================================================
--- trunk/coregrind/m_syswrap/syswrap-ppc32-linux.c 2010-02-21 14:52:59 UTC (rev 11052)
+++ trunk/coregrind/m_syswrap/syswrap-ppc32-linux.c 2010-02-22 11:03:10 UTC (rev 11053)
@@ -297,6 +297,17 @@
ctst->sig_mask = ptst->sig_mask;
ctst->tmp_sig_mask = ptst->sig_mask;
+ /* Start the child with its threadgroup being the same as the
+ parent's. This is so that any exit_group calls that happen
+ after the child is created but before it sets its
+ os_state.threadgroup field for real (in thread_wrapper in
+ syswrap-linux.c), really kill the new thread. a.k.a this avoids
+ a race condition in which the thread is unkillable (via
+ exit_group) because its threadgroup is not set. The race window
+ is probably only a few hundred or a few thousand cycles long.
+ See #226116. */
+ ctst->os_state.threadgroup = ptst->os_state.threadgroup;
+
/* We don't really know where the client stack is, because its
allocated by the client. The best we can do is look at the
memory mappings and try to derive some useful information. We
Modified: trunk/coregrind/m_syswrap/syswrap-ppc64-linux.c
===================================================================
--- trunk/coregrind/m_syswrap/syswrap-ppc64-linux.c 2010-02-21 14:52:59 UTC (rev 11052)
+++ trunk/coregrind/m_syswrap/syswrap-ppc64-linux.c 2010-02-22 11:03:10 UTC (rev 11053)
@@ -325,6 +325,17 @@
ctst->sig_mask = ptst->sig_mask;
ctst->tmp_sig_mask = ptst->sig_mask;
+ /* Start the child with its threadgroup being the same as the
+ parent's. This is so that any exit_group calls that happen
+ after the child is created but before it sets its
+ os_state.threadgroup field for real (in thread_wrapper in
+ syswrap-linux.c), really kill the new thread. a.k.a this avoids
+ a race condition in which the thread is unkillable (via
+ exit_group) because its threadgroup is not set. The race window
+ is probably only a few hundred or a few thousand cycles long.
+ See #226116. */
+ ctst->os_state.threadgroup = ptst->os_state.threadgroup;
+
/* We don't really know where the client stack is, because its
allocated by the client. The best we can do is look at the
memory mappings and try to derive some useful information. We
Modified: trunk/coregrind/m_syswrap/syswrap-x86-linux.c
===================================================================
--- trunk/coregrind/m_syswrap/syswrap-x86-linux.c 2010-02-21 14:52:59 UTC (rev 11052)
+++ trunk/coregrind/m_syswrap/syswrap-x86-linux.c 2010-02-22 11:03:10 UTC (rev 11053)
@@ -262,6 +262,17 @@
ctst->sig_mask = ptst->sig_mask;
ctst->tmp_sig_mask = ptst->sig_mask;
+ /* Start the child with its threadgroup being the same as the
+ parent's. This is so that any exit_group calls that happen
+ after the child is created but before it sets its
+ os_state.threadgroup field for real (in thread_wrapper in
+ syswrap-linux.c), really kill the new thread. a.k.a this avoids
+ a race condition in which the thread is unkillable (via
+ exit_group) because its threadgroup is not set. The race window
+ is probably only a few hundred or a few thousand cycles long.
+ See #226116. */
+ ctst->os_state.threadgroup = ptst->os_state.threadgroup;
+
/* We don't really know where the client stack is, because its
allocated by the client. The best we can do is look at the
memory mappings and try to derive some useful information. We
|