You can subscribe to this list here.
| 2002 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
(122) |
Nov
(152) |
Dec
(69) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2003 |
Jan
(6) |
Feb
(25) |
Mar
(73) |
Apr
(82) |
May
(24) |
Jun
(25) |
Jul
(10) |
Aug
(11) |
Sep
(10) |
Oct
(54) |
Nov
(203) |
Dec
(182) |
| 2004 |
Jan
(307) |
Feb
(305) |
Mar
(430) |
Apr
(312) |
May
(187) |
Jun
(342) |
Jul
(487) |
Aug
(637) |
Sep
(336) |
Oct
(373) |
Nov
(441) |
Dec
(210) |
| 2005 |
Jan
(385) |
Feb
(480) |
Mar
(636) |
Apr
(544) |
May
(679) |
Jun
(625) |
Jul
(810) |
Aug
(838) |
Sep
(634) |
Oct
(521) |
Nov
(965) |
Dec
(543) |
| 2006 |
Jan
(494) |
Feb
(431) |
Mar
(546) |
Apr
(411) |
May
(406) |
Jun
(322) |
Jul
(256) |
Aug
(401) |
Sep
(345) |
Oct
(542) |
Nov
(308) |
Dec
(481) |
| 2007 |
Jan
(427) |
Feb
(326) |
Mar
(367) |
Apr
(255) |
May
(244) |
Jun
(204) |
Jul
(223) |
Aug
(231) |
Sep
(354) |
Oct
(374) |
Nov
(497) |
Dec
(362) |
| 2008 |
Jan
(322) |
Feb
(482) |
Mar
(658) |
Apr
(422) |
May
(476) |
Jun
(396) |
Jul
(455) |
Aug
(267) |
Sep
(280) |
Oct
(253) |
Nov
(232) |
Dec
(304) |
| 2009 |
Jan
(486) |
Feb
(470) |
Mar
(458) |
Apr
(423) |
May
(696) |
Jun
(461) |
Jul
(551) |
Aug
(575) |
Sep
(134) |
Oct
(110) |
Nov
(157) |
Dec
(102) |
| 2010 |
Jan
(226) |
Feb
(86) |
Mar
(147) |
Apr
(117) |
May
(107) |
Jun
(203) |
Jul
(193) |
Aug
(238) |
Sep
(300) |
Oct
(246) |
Nov
(23) |
Dec
(75) |
| 2011 |
Jan
(133) |
Feb
(195) |
Mar
(315) |
Apr
(200) |
May
(267) |
Jun
(293) |
Jul
(353) |
Aug
(237) |
Sep
(278) |
Oct
(611) |
Nov
(274) |
Dec
(260) |
| 2012 |
Jan
(303) |
Feb
(391) |
Mar
(417) |
Apr
(441) |
May
(488) |
Jun
(655) |
Jul
(590) |
Aug
(610) |
Sep
(526) |
Oct
(478) |
Nov
(359) |
Dec
(372) |
| 2013 |
Jan
(467) |
Feb
(226) |
Mar
(391) |
Apr
(281) |
May
(299) |
Jun
(252) |
Jul
(311) |
Aug
(352) |
Sep
(481) |
Oct
(571) |
Nov
(222) |
Dec
(231) |
| 2014 |
Jan
(185) |
Feb
(329) |
Mar
(245) |
Apr
(238) |
May
(281) |
Jun
(399) |
Jul
(382) |
Aug
(500) |
Sep
(579) |
Oct
(435) |
Nov
(487) |
Dec
(256) |
| 2015 |
Jan
(338) |
Feb
(357) |
Mar
(330) |
Apr
(294) |
May
(191) |
Jun
(108) |
Jul
(142) |
Aug
(261) |
Sep
(190) |
Oct
(54) |
Nov
(83) |
Dec
(22) |
| 2016 |
Jan
(49) |
Feb
(89) |
Mar
(33) |
Apr
(50) |
May
(27) |
Jun
(34) |
Jul
(53) |
Aug
(53) |
Sep
(98) |
Oct
(206) |
Nov
(93) |
Dec
(53) |
| 2017 |
Jan
(65) |
Feb
(82) |
Mar
(102) |
Apr
(86) |
May
(187) |
Jun
(67) |
Jul
(23) |
Aug
(93) |
Sep
(65) |
Oct
(45) |
Nov
(35) |
Dec
(17) |
| 2018 |
Jan
(26) |
Feb
(35) |
Mar
(38) |
Apr
(32) |
May
(8) |
Jun
(43) |
Jul
(27) |
Aug
(30) |
Sep
(43) |
Oct
(42) |
Nov
(38) |
Dec
(67) |
| 2019 |
Jan
(32) |
Feb
(37) |
Mar
(53) |
Apr
(64) |
May
(49) |
Jun
(18) |
Jul
(14) |
Aug
(53) |
Sep
(25) |
Oct
(30) |
Nov
(49) |
Dec
(31) |
| 2020 |
Jan
(87) |
Feb
(45) |
Mar
(37) |
Apr
(51) |
May
(99) |
Jun
(36) |
Jul
(11) |
Aug
(14) |
Sep
(20) |
Oct
(24) |
Nov
(40) |
Dec
(23) |
| 2021 |
Jan
(14) |
Feb
(53) |
Mar
(85) |
Apr
(15) |
May
(19) |
Jun
(3) |
Jul
(14) |
Aug
(1) |
Sep
(57) |
Oct
(73) |
Nov
(56) |
Dec
(22) |
| 2022 |
Jan
(3) |
Feb
(22) |
Mar
(6) |
Apr
(55) |
May
(46) |
Jun
(39) |
Jul
(15) |
Aug
(9) |
Sep
(11) |
Oct
(34) |
Nov
(20) |
Dec
(36) |
| 2023 |
Jan
(79) |
Feb
(41) |
Mar
(99) |
Apr
(169) |
May
(48) |
Jun
(16) |
Jul
(16) |
Aug
(57) |
Sep
(19) |
Oct
|
Nov
|
Dec
|
| S | M | T | W | T | F | S |
|---|---|---|---|---|---|---|
|
1
(3) |
2
|
3
(2) |
4
(3) |
5
|
6
(1) |
7
|
|
8
|
9
(4) |
10
(1) |
11
(2) |
12
(1) |
13
(8) |
14
|
|
15
|
16
|
17
|
18
(2) |
19
(1) |
20
|
21
|
|
22
|
23
|
24
|
25
|
26
|
27
(1) |
28
(2) |
|
29
(6) |
30
|
31
|
|
|
|
|
|
From: Paul F. <pj...@wa...> - 2020-03-29 18:53:28
|
> > With --trace-sched=yes --trace-syscalls=yes I get the following (I've taken it from the mmap/mprotect, which I assume is the main thread setting up the memory for the child thread's stack). > > The last line looks very interesting (I hadn't seen it before). It looks like there are 4 values of umtx_op that I'm not handling. This seems to have pointed me in the right direction. Adding a ‘MayBlock’ to the flags for _umtx_op has fixed the problem. A+ Paul |
|
From: Paul F. <pj...@wa...> - 2020-03-29 16:15:48
|
> Are there some other threads still alive when the process hangs > What are these threads doing ? > Would be nice to re-run with more trace (-v -v -v -d -d -d --trace-sched=yes) > and see how has last acquired the lock without releasing it Salut Philippe The original testcase created 498 threads, but I reduced it to 1 to simplify matters. The problem is still there. >From what I've seen there is just the one chiile thread waiting in the semaphore and the main thread not advancing to post the semaphore. With just -v -v -v -d -d -d memcheck ran to completion. With --trace-sched=yes --trace-syscalls=yes I get the following (I've taken it from the mmap/mprotect, which I assume is the main thread setting up the memory for the child thread's stack). The last line looks very interesting (I hadn't seen it before). It looks like there are 4 values of umtx_op that I'm not handling. SYSCALL[40689,1](477) sys_mmap ( 0x7fffdfdfd000, 2101248, 7, 1024, 4294967295, 0x0)--40689-- di_notify_mmap-0: --40689-- di_notify_mmap-1: 0x7fffdfdfd000-0x7fffdfffdfff rwx --> [pre-success] Success(0x7fffdfdfd000) --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: SYSCALL SYSCALL[40689,1]( 74) sys_mprotect ( 0x7fffdfdfd000, 4096, 0 )[sync] --> Success(0x0) --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_SLOW --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_SLOW --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_SLOW --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_SLOW --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: SYSCALL SYSCALL[40689,1](455) thr_new ( 0x7ff0002d0, 104 ) --> [pre-success] Success(0x0) --40689-- SCHED[1]: releasing lock (VG_(vg_yield)) -> VgTs_Yielding --40689-- SCHED[2]: acquired lock (thread_wrapper(starting new thread)) --40689-- SCHED[2]: entering VG_(scheduler) --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CLIENTREQ --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: SYSCALL SYSCALL[40689,2](477) sys_mmap ( 0x0, 131072, 3, 4098, 4294967295, 0x0)--40689-- di_notify_mmap-0: --40689-- di_notify_mmap-1: 0x4c47000-0x4ea6fff rw- --> [pre-success] Success(0x4e87000) --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: SYSCALL SYSCALL[40689,2]( 4) sys_write ( 2, 0x401330, 21 ) --> [async] ... --40689-- SCHED[2]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys --40689-- SCHED[1]: acquired lock (VG_(vg_yield)) --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_SLOW --40689-- SCHED[1]: TRC: FASTMISS --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_SLOW --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_FAST --40689-- SCHED[1]: TRC: CHAIN_ME_SLOW --40689-- SCHED[1]: TRC: SYSCALL SYSCALL[40689,1](454) sys__umtx_op ( 0x4e88000, MUTEX_WAIT, 0, 0x0, 0x0) --> [async] ... --40689-- SCHED[1]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys --40689-- SCHED[2]: acquired lock (VG_(client_syscall)[async]) SYSCALL[40689,2]( 4) ... [async] --> Success(0x15) --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: SYSCALL SYSCALL[40689,2](454) sys__umtx_op ( 0x4e88000, MUTEX_WAKE2, 0, 0x0, 0x0)[sync] --> Success(0x0) --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_SLOW --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: CHAIN_ME_FAST --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: FASTMISS --40689-- SCHED[2]: TRC: SYSCALL SYSCALL[40689,2](454) sys__umtx_op ( 0x601944, 23(UNKNOWN), 0, 0x0, 0x0 ) A+ Paul |
|
From: Philippe W. <phi...@sk...> - 2020-03-29 15:42:32
|
Are there some other threads still alive when the process hangs ? What are these threads doing ? Would be nice to re-run with more trace (-v -v -v -d -d -d --trace-sched=yes) and see how has last acquired the lock without releasing it Philippe On Sun, 2020-03-29 at 16:27 +0200, Paul FLOYD wrote: > Hi > > I'm trying to debug this regtest on FreeBSD. The test has functions main, fn1 anf fn2. > main: creates a load of fn1 threads > fn1: function disables error reporting then waits on a semaphore > main: post semaphore > fn1: drop off end and terminate > main: join fn1 threads > main: and created another load of threads, fn2 this time > fn2: read from feed memory then waits on the semaphore > main: semaphore post > fn2: drop off end and terminate > main: join threads > main: status reporting > > The callstack where memcheck is hanging is > > * frame #0: 0x0000000038088b92 memcheck-amd64-freebsd`do_syscall_WRK at m_syscall.c:383:1 frame #1: 0x0000000038088c2e memcheck-amd64-freebsd`vgPlain_do_syscall(sysno=3, a1=469737, a2=17224604422, a3=1, a4=0, a5=0, a6=0, a7=0, a8=0) at m_syscall.c:1097:10 frame #2: 0x0000000038070776 memcheck-amd64-freebsd`vgPlain_read(fd=469737, buf=0x0000000402aa9b06, count=1) at m_libcfile.c:247:17 frame #3: 0x0000000038173292 memcheck-amd64-freebsd`vgModuleLocal_sema_down(sema=0x0000000402006640, as_LL='\0') at sema.c:107:10 frame #4: 0x0000000038172f2a memcheck-amd64-freebsd`acquire_sched_lock(p=0x0000000402006640) at sched-lock-generic.c:69:4 frame #5: 0x0000000038172e7b memcheck-amd64-freebsd`vgModuleLocal_acquire_sched_lock(p=0x0000000402006640) at sched-lock.c:88:26 frame #6: 0x00000000380e6a7c memcheck-amd64-freebsd`vgPlain_acquire_BigLock_LL(who=0x0000000000000000) at scheduler.c:425:4 frame #7: 0x00000000380e6659 memcheck-amd64-freebsd`vgPlain_acquire_BigLock(tid=1, who="VG_(vg_yield)") at scheduler.c:349:4 frame #8: 0x00000000380e6e4f memcheck-amd64-freebsd`vgPlain_vg_yield at scheduler.c:530:4 frame #9: 0x00000000380ee88c memcheck-amd64-freebsd`vgPlain_post_syscall(tid=1) at syswrap-main.c:2470:7 frame #10: 0x00000000380ee326 memcheck-amd64-freebsd`vgPlain_client_syscall(tid=1, trc=73) at syswrap-main.c:2351:4 frame #11: 0x00000000380e86e2 memcheck-amd64-freebsd`handle_syscall(tid=1, trc=73) at scheduler.c:1211:4 frame #12: 0x00000000380e90b3 memcheck-amd64-freebsd`vgPlain_scheduler(tid=1) at scheduler.c:1529:3 frame #13: 0x0000000038100540 memcheck-amd64-freebsd`thread_wrapper(tidW=1) at syswrap-freebsd.c:95:10 frame #14: 0x00000000381006bc memcheck-amd64-freebsd`run_a_thread_NORETURN(tidW=1) at syswrap-freebsd.c:133:10 > > I'm seeing the process waiting in the read syscall to the semaphore pipe. What I don't understand is how the main thread is supposed to execute at this point in order for the semaphore to signal. Does anyone have some light to shine on how this should work? > > Possibly related, I'm also seeing sanity check errors with this case, after the mmap. > > A+ > Paul > > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers |
|
From: Paul F. <pj...@wa...> - 2020-03-29 15:28:13
|
Hi
Sorry for the formatting, I'm writing this on my FreeBSD machine and using a web mail client. This time in plain text.
The stacktrace
* frame #0: 0x0000000038088b92 memcheck-amd64-freebsd`do_syscall_WRK at m_syscall.c:383:1
frame #1: 0x0000000038088c2e memcheck-amd64-freebsd`vgPlain_do_syscall(sysno=3, a1=469737, a2=17224604422, a3=1, a4=0, a5=0, a6=0, a7=0, a8=0) at m_syscall.c:1097:10
frame #2: 0x0000000038070776 memcheck-amd64-freebsd`vgPlain_read(fd=469737, buf=0x0000000402aa9b06, count=1) at m_libcfile.c:247:17
frame #3: 0x0000000038173292 memcheck-amd64-freebsd`vgModuleLocal_sema_down(sema=0x0000000402006640, as_LL='\0') at sema.c:107:10
frame #4: 0x0000000038172f2a memcheck-amd64-freebsd`acquire_sched_lock(p=0x0000000402006640) at sched-lock-generic.c:69:4
frame #5: 0x0000000038172e7b memcheck-amd64-freebsd`vgModuleLocal_acquire_sched_lock(p=0x0000000402006640) at sched-lock.c:88:26
frame #6: 0x00000000380e6a7c memcheck-amd64-freebsd`vgPlain_acquire_BigLock_LL(who=0x0000000000000000) at scheduler.c:425:4
frame #7: 0x00000000380e6659 memcheck-amd64-freebsd`vgPlain_acquire_BigLock(tid=1, who="VG_(vg_yield)") at scheduler.c:349:4
frame #8: 0x00000000380e6e4f memcheck-amd64-freebsd`vgPlain_vg_yield at scheduler.c:530:4
frame #9: 0x00000000380ee88c memcheck-amd64-freebsd`vgPlain_post_syscall(tid=1) at syswrap-main.c:2470:7
frame #10: 0x00000000380ee326 memcheck-amd64-freebsd`vgPlain_client_syscall(tid=1, trc=73) at syswrap-main.c:2351:4
frame #11: 0x00000000380e86e2 memcheck-amd64-freebsd`handle_syscall(tid=1, trc=73) at scheduler.c:1211:4
frame #12: 0x00000000380e90b3 memcheck-amd64-freebsd`vgPlain_scheduler(tid=1) at scheduler.c:1529:3
frame #13: 0x0000000038100540 memcheck-amd64-freebsd`thread_wrapper(tidW=1) at syswrap-freebsd.c:95:10
frame #14: 0x00000000381006bc memcheck-amd64-freebsd`run_a_thread_NORETURN(tidW=1) at syswrap-freebsd.c:133:10
Source code (it's err_disable4.c with a few mods)
#include
#include
#include
#include
#include
#include
/* PTHREAD_STACK_MIN */
#include "../include/valgrind.h"
char* block = NULL;
# if !defined(VGO_darwin)
sem_t sem;
# else
sem_t *sem;
static const char *semname = "Semaphore1";
# endif
__attribute__((noinline)) void usechar ( char c )
{
// Spook gcc into believing mysterious bad things are
// happening behind its back, and that 'c' is definitely
// used in some (unknown) way.
__asm__ __volatile__("" : : "r"(c) : "memory","cc");
}
__attribute__((noinline)) void err ( void )
{
usechar( block[5] );
}
void* child_fn_1 ( void* arg )
{
// Disable error reporting, then wait to exit
VALGRIND_DISABLE_ERROR_REPORTING;
fprintf(stderr, "fn1 calling sem_wait\n");
# if !defined(VGO_darwin)
int r = sem_wait(&sem); assert(!r);
# else
int r = sem_wait(sem); assert(!r);
# endif
return NULL;
}
void* child_fn_2 ( void* arg )
{
// make an error, then wait to exit
fprintf(stderr, "fn2 calling err\n");
err();
fprintf(stderr, "fn2 calling sem_wait\n");
# if !defined(VGO_darwin)
int r = sem_wait(&sem); assert(!r);
# else
int r = sem_wait(sem); assert(!r);
# endif
return NULL;
}
#define NTHREADS 1 // VG_N_THREADS - 2
int main ( void )
{
int r, i;
pthread_t child[NTHREADS];
block = malloc(10);
free(block);
// part 1
fprintf(stderr, "\n-------- Letting %d threads exit "
"w/ errs disabled ------\n\n",
NTHREADS);
// set up the semaphore
# if !defined(VGO_darwin)
r = sem_init(&sem, 0, 0); assert(!r);
# else
sem = sem_open(semname, O_CREAT, 0777, 0); assert(!(sem == SEM_FAILED));
# endif
fprintf(stderr, "opened sem\n");
#if !defined(VGO_freebsd)
pthread_attr_t attr;
r = pthread_attr_init(&attr); assert(!r);
r = pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN);
#endif
// create N threads to do child_fn_1 ...
for (i = 0; i < NTHREADS; i++) {
r = pthread_create(&child[i], NULL, child_fn_1, NULL);
assert(!r);
}
fprintf(stderr, "created fn1\n");
// let them all exit
for (i = 0; i < NTHREADS; i++) {
# if !defined(VGO_darwin)
r = sem_post(&sem); assert(!r);
# else
r = sem_post(sem); assert(!r);
# endif
}
fprintf(stderr, "posted dem\n");
// join
for (i = 0; i < NTHREADS; i++) {
r = pthread_join(child[i], NULL); assert(!r);
}
fprintf(stderr, "joined\n");
// part 2
fprintf(stderr, "\n-------- Letting %d threads make an error "
"------\n\n",
NTHREADS);
// semaphore is already back at zero
// create N threads to do child_fn_2 ...
for (i = 0; i < NTHREADS; i++) {
r = pthread_create(&child[i], NULL, child_fn_2, NULL);
assert(!r);
}
fprintf(stderr, "created fn2\n");
// let them all exit
for (i = 0; i < NTHREADS; i++) {
# if !defined(VGO_darwin)
r = sem_post(&sem); assert(!r);
# else
r = sem_post(sem); assert(!r);
# endif
}
fprintf(stderr, "posted sem again\n");
// join
for (i = 0; i < NTHREADS; i++) {
r = pthread_join(child[i], NULL); assert(!r);
}
fprintf(stderr, "waited again\n");
// Print the final error counts. There need to be 498 errors
// in 1 context. Anything else, and something is not right.
int nerrors = VALGRIND_COUNT_ERRORS;
fprintf(stderr, "\n-------- Got %d errors (expected %d ==> %s) ------\n\n",
nerrors, NTHREADS, nerrors == NTHREADS ? "PASS" : "FAIL" );
return 0;
}
|
|
From: John R. <jr...@bi...> - 2020-03-29 15:12:49
|
Post the source code to the test, so that execution on FreeBSD can be compared to execution on other operating systems. This will also clarify the meaning of descriptions such as "creates a load of fn1 threads". Re-post the traceback, without text filling. Each "frame #N" should begin on a new line. On 3/29/20 14:27 UTC, Paul FLOYD wrote: > I'm trying to debug this regtest on FreeBSD. The test has functions main, fn1 anf fn2. > > * main: creates a load of fn1 threads [snip] |
|
From: Paul F. <pj...@wa...> - 2020-03-29 14:27:53
|
Hi I'm trying to debug this regtest on FreeBSD. The test has functions main, fn1 anf fn2. main: creates a load of fn1 threads fn1: function disables error reporting then waits on a semaphore main: post semaphore fn1: drop off end and terminate main: join fn1 threads main: and created another load of threads, fn2 this time fn2: read from feed memory then waits on the semaphore main: semaphore post fn2: drop off end and terminate main: join threads main: status reporting The callstack where memcheck is hanging is * frame #0: 0x0000000038088b92 memcheck-amd64-freebsd`do_syscall_WRK at m_syscall.c:383:1 frame #1: 0x0000000038088c2e memcheck-amd64-freebsd`vgPlain_do_syscall(sysno=3, a1=469737, a2=17224604422, a3=1, a4=0, a5=0, a6=0, a7=0, a8=0) at m_syscall.c:1097:10 frame #2: 0x0000000038070776 memcheck-amd64-freebsd`vgPlain_read(fd=469737, buf=0x0000000402aa9b06, count=1) at m_libcfile.c:247:17 frame #3: 0x0000000038173292 memcheck-amd64-freebsd`vgModuleLocal_sema_down(sema=0x0000000402006640, as_LL='\0') at sema.c:107:10 frame #4: 0x0000000038172f2a memcheck-amd64-freebsd`acquire_sched_lock(p=0x0000000402006640) at sched-lock-generic.c:69:4 frame #5: 0x0000000038172e7b memcheck-amd64-freebsd`vgModuleLocal_acquire_sched_lock(p=0x0000000402006640) at sched-lock.c:88:26 frame #6: 0x00000000380e6a7c memcheck-amd64-freebsd`vgPlain_acquire_BigLock_LL(who=0x0000000000000000) at scheduler.c:425:4 frame #7: 0x00000000380e6659 memcheck-amd64-freebsd`vgPlain_acquire_BigLock(tid=1, who="VG_(vg_yield)") at scheduler.c:349:4 frame #8: 0x00000000380e6e4f memcheck-amd64-freebsd`vgPlain_vg_yield at scheduler.c:530:4 frame #9: 0x00000000380ee88c memcheck-amd64-freebsd`vgPlain_post_syscall(tid=1) at syswrap-main.c:2470:7 frame #10: 0x00000000380ee326 memcheck-amd64-freebsd`vgPlain_client_syscall(tid=1, trc=73) at syswrap-main.c:2351:4 frame #11: 0x00000000380e86e2 memcheck-amd64-freebsd`handle_syscall(tid=1, trc=73) at scheduler.c:1211:4 frame #12: 0x00000000380e90b3 memcheck-amd64-freebsd`vgPlain_scheduler(tid=1) at scheduler.c:1529:3 frame #13: 0x0000000038100540 memcheck-amd64-freebsd`thread_wrapper(tidW=1) at syswrap-freebsd.c:95:10 frame #14: 0x00000000381006bc memcheck-amd64-freebsd`run_a_thread_NORETURN(tidW=1) at syswrap-freebsd.c:133:10 I'm seeing the process waiting in the read syscall to the semaphore pipe. What I don't understand is how the main thread is supposed to execute at this point in order for the semaphore to signal. Does anyone have some light to shine on how this should work? Possibly related, I'm also seeing sanity check errors with this case, after the mmap. A+ Paul |