|
From: Julian S. <js...@ac...> - 2005-11-11 21:23:44
|
Valgrind simulates the kernel's page-fault-based stack growth by
inspecting signals that arrive and silently extending the client's
stack downwards under the right conditions.

The logic for this is at m_signals.c around line 1566. A vki_siginfo_t
structure is supplied, and part of the tests to determine if this
is a stack-extension kind of event is the condition "info->si_code == 1".

On all mainstream Linux distros, including ppc32-linux, this works fine.
However, on MontaVista(R) Linux(R) Professional Edition 3.1 running on
a PPC440GX EVB (Ocotea), info->si_code shows up with not 1 but
0x30001. This messes up the logic and causes V to treat it as a normal
segfault, which kills the app. If I mask it with 0xFFFF before the compare
then everything works fine.

So does anybody have a clue what the significance of 0x30001 vs 0x1 is?
Paul?

J
|
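The workaround Julian describes (mask with 0xFFFF, then compare against 1) can be sketched roughly as below. This is only an illustration: the names `SKETCH_SEGV_MAPERR`, `SKETCH_SI_CODE_MASK` and `sketch_is_maperr` are hypothetical and are not taken from m_signals.c.

```c
/* Hypothetical names for illustration; not Valgrind's identifiers. */
#define SKETCH_SEGV_MAPERR   1       /* the value the existing test compares against */
#define SKETCH_SI_CODE_MASK  0xFFFF  /* keep only the low 16 bits of si_code */

/* Returns non-zero if si_code, with its high half discarded,
 * equals the "address not mapped" code. */
static int sketch_is_maperr(int si_code)
{
    return (si_code & SKETCH_SI_CODE_MASK) == SKETCH_SEGV_MAPERR;
}
```

With this form both 1 and 0x30001 pass the test, while other fault codes such as 2 or 0x30002 still do not.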
|
From: Tom H. <to...@co...> - 2005-11-12 00:04:38
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> The logic for this is at m_signals.c around line 1566. A vki_siginfo_t
> structure is supplied, and part of the tests to determine if this
> is a stack-extension kind of event is the condition "info->si_code == 1".
The 1 is actually SEGV_MAPERR and would appear to be correct. We
should probably define a VKI_SEGV_MAPERR constant properly.
The si_code values are POSIX defined - well the names are. The values
aren't but they are fairly standard across different Unix variants.
> On all mainstream Linux distros, including ppc32-linux, this works fine.
> However, on MontaVista(R) Linux(R) Professional Edition 3.1 running on
> an PPC440GX EVB (Ocotea), info->si_code shows up with not 1 but
> 0x30001. This messes up the logic and causes V to treat it as a normal
> segfault, which kills the app. If I mask it with 0xFFFF before the compare
> then everything works fine.
>
> So does anybody have a clue what the significance of 0x30001 vs 0x1 is?
> Paul?
It's a kernel internal detail that shouldn't be leaking to user
space - if you look at include/asm-generic/siginfo.h in the kernel
source you will see that SEGV_MAPERR is 0x30001 if __KERNEL__ is
defined and 1 if it isn't.
When copy_siginfo_to_user in kernel/signal.c copies the siginfo
structure out to user space it deliberately casts the si_code
value to a short to discard the top half of it.
It sounds like this MontaVista kernel is a bit broken...
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
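To make the arithmetic concrete: 0x30001 is 1 with the value 3 in the upper 16 bits, and narrowing to a short drops that upper half. The sketch below uses hypothetical `SKETCH_` names rather than the kernel's own macros, and it assumes a 16-bit short on a compiler that truncates on narrowing (the conversion is implementation-defined in C, though mainstream compilers behave this way).

```c
/* Hypothetical stand-ins for the kernel-internal and userspace values. */
#define SKETCH_SI_FAULT            (3 << 16)
#define SKETCH_KERNEL_SEGV_MAPERR  (SKETCH_SI_FAULT | 1)   /* 0x30001 */
#define SKETCH_USER_SEGV_MAPERR    1

/* Narrow the way the cast to short does: the high half is discarded
 * and the sign of the low half is preserved. */
static int sketch_narrow_si_code(int si_code)
{
    return (short)si_code;
}

/* Narrow by masking: the high half is discarded, but a negative
 * si_code comes back as a large positive number. */
static int sketch_mask_si_code(int si_code)
{
    return si_code & 0xFFFF;
}
```

Both helpers map 0x30001 to 1. They differ for negative codes: the cast leaves -6 as -6, whereas the mask turns it into 0xFFFA, which matters if the result is later compared against negative si_code values.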
|
|
From: Julian S. <js...@ac...> - 2005-11-12 01:37:24
|
> > So does anybody have a clue what the significance of 0x30001 vs 0x1 is?
> > Paul?
>
> It's a kernel internal detail that shouldn't be leaking to user
> space - if you look at include/asm-generic/siginfo.h in the kernel
> source you will see that SEGV_MAPERR is 0x30001 if __KERNEL__ is
> defined and 1 if it isn't.
>
> When copy_siginfo_to_user in kernel/signal.c copies the siginfo
> structure out to user space it deliberately casts the si_code
> value to a short to discard the top half of it.
>
> It sounds like this MontaVista kernel is a bit broken...

It's based on 2.4.20, not that that means it's not broken.

So the implication is that we should mask si_code ourselves whenever
we use it. Ah well. Ok.

J
|
|
From: Tom H. <to...@co...> - 2005-11-12 09:00:57
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> > It's a kernel internal detail that shouldn't be leaking to user
> > space - if you look at include/asm-generic/siginfo.h in the kernel
> > source you will see that SEGV_MAPERR is 0x30001 if __KERNEL__ is
> > defined and 1 if it isn't.
> >
> > When copy_siginfo_to_user in kernel/signal.c copies the siginfo
> > structure out to user space it deliberately casts the si_code
> > value to a short to discard the top half of it.
> >
> > It sounds like this MontaVista kernel is a bit broken...
>
> It's based on 2.4.20, not that that means it's not broken.
>
> So the implication is that we should mask si_code ourselves whenever
> we use it. Ah well. Ok.
It might be better to fix the siginfo structure at the start of
the signal handler(s) on ppc32 so that we only have to do it in
one or two places.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
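Tom's suggestion of fixing the structure once at handler entry, instead of masking at every use, might look like the sketch below. It uses the POSIX `siginfo_t` from `<signal.h>` rather than Valgrind's `vki_siginfo_t`, and the function names are hypothetical; a real fix would presumably be applied only on ppc32.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Hypothetical helper: discard the kernel-internal high half of
 * si_code in place, so that later code can compare it directly. */
static void sketch_normalise_siginfo(siginfo_t *info)
{
    info->si_code = (short)info->si_code;
}

/* Hypothetical handler showing where the call would go.
 * It has the signature expected of an SA_SIGINFO handler. */
static void sketch_fault_handler(int signo, siginfo_t *info, void *uctx)
{
    (void)signo;
    (void)uctx;
    sketch_normalise_siginfo(info);
    /* From here on, info->si_code == 1 can be tested as before. */
}
```

Doing the fix-up at entry keeps existing comparisons such as `info->si_code == 1` unchanged, at the cost of modifying the structure the kernel handed over.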
|
|
From: Paul M. <pa...@sa...> - 2005-11-12 06:42:55
|
Julian Seward writes:

> So does anybody have a clue what the significance of 0x30001 vs 0x1 is?

There's an error in the code that copies siginfo structs to userspace
in the ppc32 2.4 kernel. The high half of that word is kernel stuff
that should be masked off when copying the siginfo to userspace, but
isn't.

Paul.
|