Re: [Valgrind-developers] extra syscall_cancel frames

Hi Florian,

(Adding valgrind-developers to CC to see if someone else has some
smart ideas how to deal with this.)

On Thu, Mar 20, 2025 at 05:58:31PM +0100, Florian Weimer wrote:
> > With latest glibc on fedora rawhide (glibc-2.40.9000-37.fc43.x86_64) I
> > am seeing some extra frames in the call stack that I wonder whether to
> > specially handle in valgrind.
> >
> > Before we would report on some bad syscall argument like:
> >
> > ==1929378== Syscall param sendmsg(msg) points to uninitialised byte(s)
> > ==1929378==    at 0x4971514: sendmsg (sendmsg.c:28)
> > ==1929378==    by 0x40128B: main (sendmsg.c:46)
> > ==1929378==  Address 0x1ffefff640 is on thread 1's stack
> > ==1929378==  in frame #1, created by main (sendmsg.c:13)
> >
> > Now it looks like:
> >
> > ==2670784== Syscall param sendmsg(msg) points to uninitialised byte(s)
> > ==2670784==    at 0x48D9AE6: __internal_syscall_cancel (cancellation.c:64)
> > ==2670784==    by 0x48D9B03: __syscall_cancel (cancellation.c:75)
> > ==2670784==    by 0x49628F0: sendmsg (sendmsg.c:28)
> > ==2670784==    by 0x4005CB: main (sendmsg.c:46)
> > ==2670784==  Address 0x1ffeffff40 is on thread 1's stack
> > ==2670784==  in frame #3, created by main (sendmsg.c:13)
> >
> > Which I think is not as helpful to the user.
> > So I am wondering whether those extra frames should be handled
> > specially in valgrind and filtered out. But were these extra stack
> > frames added explicitly? And are they easily detected (symbol name
> > starting with __ and containing syscall might be a good heuristic)?
> 
> I think __internal_syscall_cancel should get inlined into
> __syscall_cancel.

It isn't, I double checked with gdb and there are always two extra
frames on top of the call stack.

> There is also another out-of-line system call in __syscall_cancel_arch,
> which you probably don't see in your example because the process is
> single-threaded.

I did indeed see that in our gdb_server testsuite, I had to filter
that out of the gdb output to make our vgdb tests pass.

> It is necessary to concentrate all cancelable system calls in one place
> for correctness reasons because we need to know if the cancelling signal
> arrives within the system call or immediately after it.  It's the only
> way to tell whether the effect of the system call has taken place or
> not.  With all system calls in one place, this is a simple address
> check.  With the previous inlining-based approach, we would have to have
> some sort of lookup table to determine whether the cancellation attempt
> happened while the system call was executing or not.
> 
> This is the relevant bug:
> 
>   Race conditions in pthread cancellation
>   <https://sourceware.org/bugzilla/show_bug.cgi?id=12683>
> 
> And this commit fixed it:
> 
> commit 89b53077d2a58f00e7debdfe58afabe953dac60d
> Author: Adhemerval Zanella <adh...@li...>
> Date:   Tue Jun 25 16:17:44 2024 -0300
> 
>     nptl: Fix Race conditions in pthread cancellation [BZ#12683]

Interesting, so this is actually in 2.41? I should try the fedora 42
beta then. Do you happen to know whether people/distros have
backported this to earlier releases?

I think these extra __*syscall*cancel* frames are somewhat confusing
to the user and mess up existing suppressions. They also cause
trouble for the valgrind regtests.

I think the solution for valgrind is to just skip the top (two) frames
if they match the __*syscall*cancel* symbol address ranges. And we
only need to do that when we are creating a backtrace from a valgrind
syscall wrapper.
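
A minimal sketch of that skipping step, in plain C rather than actual
valgrind code. The AddrRange type, the function names and the max_skip
parameter are all hypothetical, and ips[0] is assumed to be the
innermost frame. It only counts how many leading frames fall inside
the known symbol ranges, and always leaves at least one frame:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one function's address range, [start, start + size). */
typedef struct {
    uintptr_t start;
    size_t    size;
} AddrRange;

/* Return 1 if ip lies inside any of the given ranges, else 0. */
static int in_any_range(uintptr_t ip, const AddrRange *ranges, size_t n_ranges)
{
    for (size_t i = 0; i < n_ranges; i++)
        if (ip >= ranges[i].start && ip - ranges[i].start < ranges[i].size)
            return 1;
    return 0;
}

/* Count how many innermost frames (ips[0] first) should be dropped.
   Stops at the first frame outside the ranges, never skips more than
   max_skip frames, and never skips the last remaining frame. */
static size_t count_cancel_frames(const uintptr_t *ips, size_t n_ips,
                                  const AddrRange *ranges, size_t n_ranges,
                                  size_t max_skip)
{
    assert(ips != NULL || n_ips == 0);
    size_t skip = 0;
    while (skip < max_skip && skip + 1 < n_ips
           && in_any_range(ips[skip], ranges, n_ranges))
        skip++;
    return skip;
}
```

The caller would then report the trace starting at ips[skip]. Doing
this only for backtraces taken from a syscall wrapper keeps ordinary
stack traces untouched.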

Looking at the glibc symtab I see four function symbols matching that
pattern:

 2140: 0000000000079840     51 FUNC    LOCAL  DEFAULT        4 __syscall_cancel_arch
 3561: 000000000006daf0     64 FUNC    LOCAL  DEFAULT        4 __syscall_cancel
 3700: 000000000006da60    140 FUNC    LOCAL  DEFAULT        4 __internal_syscall_cancel
 4566: 000000000006da00     87 FUNC    LOCAL  DEFAULT        4 __syscall_do_cancel

Can we rely on those names (and assume there are only 4) or is it
better to be flexible and just create a dynamic array for any glibc
local function that matches the __*syscall*cancel* pattern?
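
For the flexible variant, the name match could look roughly like the
sketch below. It uses POSIX fnmatch() purely as a stand-in, since
valgrind's core does not link against libc and would need its own
pattern matcher. The helper names and the caller-provided index
buffer are hypothetical:

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* The glob discussed above. */
#define CANCEL_GLOB "__*syscall*cancel*"

/* Return 1 if name matches the cancellation-helper glob, else 0. */
static int is_cancel_symbol(const char *name)
{
    return name != NULL && fnmatch(CANCEL_GLOB, name, 0) == 0;
}

/* Scan a list of symbol names and store the indices of matching ones
   in out[] (at most cap entries). Returns the total number of matches,
   which may exceed cap, so the caller can detect truncation. */
static size_t collect_cancel_symbols(const char *const *names, size_t n_names,
                                     size_t *out, size_t cap)
{
    assert(out != NULL || cap == 0);
    size_t found = 0;
    for (size_t i = 0; i < n_names; i++) {
        if (!is_cancel_symbol(names[i]))
            continue;
        if (found < cap)
            out[found] = i;
        found++;
    }
    return found;
}
```

Matching by pattern picks up all four symbols listed above and would
also pick up any similarly named helper added later, at the cost of
possibly matching something unrelated.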

Thanks,

Mark