From: Mark W. <ma...@kl...> - 2025-04-25 22:07:10
Hi all,

On Fri, Apr 18, 2025 at 05:46:54PM +0200, Mark Wielaard wrote:
> On Mon, 2025-04-14 at 14:06 +0200, Mark Wielaard wrote:
> > On Sun, Apr 06, 2025 at 03:23:54PM +0200, Mark Wielaard wrote:
> > > On Mon, Mar 31, 2025 at 11:29:41AM +0200, Mark Wielaard wrote:
> > > > On Fri, Mar 28, 2025 at 07:02:28PM +0100, Mark Wielaard wrote:
> > > > > On Fri, 2025-03-21 at 14:01 +0100, Florian Weimer wrote:
> > > > > > Without this change, the system call wrapper function is not visible
> > > > > > on the stack at the time of the system call, which causes problems
> > > > > > for interception tools such as valgrind.
> > > > > >
> > > > > > Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
> > > > > > Race conditions in pthread cancellation [BZ#12683]").
> > > > > >
> > > > > > Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
> > > > > > (We're still discussing if valgrind needs this, but if it does, here's a
> > > > > > patch.)
> > > > >
> > > > > I implemented the valgrind part of skipping the syscall_cancel frames
> > > > > here: https://bugs.kde.org/show_bug.cgi?id=502126#c2
> > > > > And there is a valgrind package build for fedora rawhide:
> > > > > https://koji.fedoraproject.org/koji/buildinfo?buildID=2687393
> > > > >
> > > > > For ppc64le, s390x and x86_64 that patch seems enough.
> > > > >
> > > > > For i686 and aarch64 there does seem to be an issue with missing the
> > > > > glibc calling function because of a tail call.
> > > > >
> > > > > Also on i686 there is another extra frame on top __libc_do_syscall.
> > > >
> > > > I extended the patch to cover some extra syscall wrapper function
> > > > symbols on i386 and armhf and pushed it to valgrind trunk and
> > > > VALGRIND_3_24_BRANCH. There are builds for fedora rawhide and
> > > > f42. This does seem to show that only on arm64 the tail calls
> > > > obscure observing the full call stack.
> > >
> > > This has now landed in fedora rawhide and f42. Test results look good,
> > > except for some of the arm64 tests where the tail calls obscure
> > > observing the full call stack. Please let me know if you need any more
> > > input from us to get this fix in glibc.
> >
> > Please let me know. Valgrind test results for syscall backtraces on
> > anything except arm64 look good. We are working on valgrind 3.25.0
> > now, to be released around April 24.
>
> valgrind 3.25.0-RC1 has been released and test results look good on
> most arches. arm64 does show the issue described above where the tail
> calls obscure observing the full call stack when doing system calls.

valgrind 3.25.0 has been released and is now in Fedora rawhide and
Fedora 42 with the new glibc syscall_cancel frames. The tail calls on
aarch64 still seem to be a problem for observability of the syscall
call stack.

> Let me know what would be needed to get the above patch reviewed.

Thanks,

Mark