|
From: Dan K. <da...@ke...> - 2008-02-24 13:46:25
|
I'd been happily using valgrind from svn of Feb 12th
on wine's conformance test suite, and then I upgraded
my OS from Ubuntu Feisty to Ubuntu Gutsy.
Suddenly I began seeing the following kind of error.
Updating to a fresh copy of valgrind from svn didn't help;
the following error is from the fresh build.
--26623-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11
(SIGSEGV) - exiting
--26623-- si_code=2; Faulting address: 0x0; sp: 0x627C1DF8
valgrind: the 'impossible' happened:
Killed by fatal signal
==26623== at 0x38003987: check_mem_is_defined_asciiz (mc_main.c:2445)
==26623== by 0x3803A615: vgSysWrap_linux_sys_utimensat_before
(syswrap-linux.c:2834)
==26623== by 0x38038410: vgPlain_client_syscall (syswrap-main.c:850)
==26623== by 0x38035F27: vgPlain_scheduler (scheduler.c:798)
==26623== by 0x38049EE8: run_a_thread_NORETURN (syswrap-linux.c:89)
That line of mc_main.c is:
/* Ok, a is safe to read. */
if (* ((UChar*)a) == 0) {
The rest of the message is
sched status:
running_tid=1
Thread 1: status = VgTs_Runnable
==26623== at 0x40007F2: (within /lib/ld-2.6.1.so)
==26623== by 0x46F2AA3: NtSetInformationFile (file.c:1621)
==26623== by 0x47F95DC: SetFileTime (file.c:1035)
==26623== by 0x490D14D: fdi_notify_extract (cabinet_main.c:269)
==26623== by 0x491913D: FDICopy (fdi.c:2810)
==26623== by 0x490D021: Extract (cabinet_main.c:364)
==26623== by 0x497EB98: ExtractFilesA (files.c:729)
==26623== by 0x48F49C8: func_files (files.c:439)
==26623== by 0x48F6EE7: run_test (test.h:406)
==26623== by 0x48F765C: main (test.h:455)
|
|
From: Nicholas N. <nj...@cs...> - 2008-02-24 22:40:48
|
On Sun, 24 Feb 2008, Dan Kegel wrote:
> I'd been happily using valgrind from svn of Feb 12th
> on wine's conformance test suite, and then I upgraded
> my OS from Ubuntu Feisty to Ubuntu Gutsy.
> Suddenly I began seeing the following kind of error.
> Updating to a fresh copy of valgrind from svn didn't help;
> the following error is from the fresh build.
>
> --26623-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11
> (SIGSEGV) - exiting
> --26623-- si_code=2; Faulting address: 0x0; sp: 0x627C1DF8
>
> valgrind: the 'impossible' happened:
> Killed by fatal signal
> ==26623== at 0x38003987: check_mem_is_defined_asciiz (mc_main.c:2445)
> ==26623== by 0x3803A615: vgSysWrap_linux_sys_utimensat_before
> (syswrap-linux.c:2834)
> ==26623== by 0x38038410: vgPlain_client_syscall (syswrap-main.c:850)
> ==26623== by 0x38035F27: vgPlain_scheduler (scheduler.c:798)
> ==26623== by 0x38049EE8: run_a_thread_NORETURN (syswrap-linux.c:89)
>
> That line of mc_main.c is:
> /* Ok, a is safe to read. */
> if (* ((UChar*)a) == 0) {
Since Memcheck first checks if the address is accessible before reading it,
it looks like Memcheck's idea of what is accessible has somehow gotten out
of sync with reality. As for what causes that, who knows...
Nick
|
|
From: Julian S. <js...@ac...> - 2008-02-25 03:45:07
|
> > --26623-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11
> > (SIGSEGV) - exiting
> > --26623-- si_code=2; Faulting address: 0x0; sp: 0x627C1DF8
> >
> > valgrind: the 'impossible' happened:
> > Killed by fatal signal
> > ==26623== at 0x38003987: check_mem_is_defined_asciiz (mc_main.c:2445)
> > ==26623== by 0x3803A615: vgSysWrap_linux_sys_utimensat_before
> > (syswrap-linux.c:2834)
> > ==26623== by 0x38038410: vgPlain_client_syscall (syswrap-main.c:850)
> > ==26623== by 0x38035F27: vgPlain_scheduler (scheduler.c:798)
> > ==26623== by 0x38049EE8: run_a_thread_NORETURN (syswrap-linux.c:89)
> >
> > That line of mc_main.c is:
> > /* Ok, a is safe to read. */
> > if (* ((UChar*)a) == 0) {
>
> Since Memcheck first checks if the address is accessible before reading it,
> it looks like Memcheck's idea of what is accessible has somehow gotten out
> of sync with reality. As for what causes that, who knows...
That's potentially serious. Do you have a way we can repro this?
J
|
|
From: Dan K. <da...@ke...> - 2008-02-25 03:57:31
|
On Sun, Feb 24, 2008 at 7:41 PM, Julian Seward <js...@ac...> wrote:
> > Since Memcheck first checks if the address is accessible before reading it,
> > it looks like Memcheck's idea of what is accessible has somehow gotten out
> > of sync with reality. As for what causes that, who knows...
>
> That's potentially serious. Do you have a way we can repro this?
Sure, just build Wine's conformance test suite and run one of the ten or
so tests that fail like that. I'll whip up a script anon.
- Dan
|
|
From: Dan K. <da...@ke...> - 2008-02-25 07:21:23
|
On Sun, Feb 24, 2008 at 7:57 PM, Dan Kegel <da...@ke...> wrote:
> > That's potentially serious. Do you have a way we can repro this?
>
> Sure, just build Wine's conformance test suite and run one of the ten or
> so tests that fail like that. I'll whip up a script anon.
OK. http://kegel.com/wine/valgrind/repro-inner.sh
should reproduce the problem. I'm testing it now by running it
on a fresh gutsy instance in vmware (if you want a quick
fresh gutsy instance, see
http://kegel.com/wine/valgrind/repro-outer.sh).
- Dan
|
|
From: Tom H. <to...@co...> - 2008-02-25 09:17:44
|
In message <a71...@ma...>
Dan Kegel <da...@ke...> wrote:
> I'd been happily using valgrind from svn of Feb 12th
> on wine's conformance test suite, and then I upgraded
> my OS from Ubuntu Feisty to Ubuntu Gutsy.
> Suddenly I began seeing the following kind of error.
> Updating to a fresh copy of valgrind from svn didn't help;
> the following error is from the fresh build.
>
> --26623-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11
> (SIGSEGV) - exiting
> --26623-- si_code=2; Faulting address: 0x0; sp: 0x627C1DF8
>
> valgrind: the 'impossible' happened:
> Killed by fatal signal
> ==26623== at 0x38003987: check_mem_is_defined_asciiz (mc_main.c:2445)
> ==26623== by 0x3803A615: vgSysWrap_linux_sys_utimensat_before
> (syswrap-linux.c:2834)
> ==26623== by 0x38038410: vgPlain_client_syscall (syswrap-main.c:850)
> ==26623== by 0x38035F27: vgPlain_scheduler (scheduler.c:798)
> ==26623== by 0x38049EE8: run_a_thread_NORETURN (syswrap-linux.c:89)
>
> That line of mc_main.c is:
> /* Ok, a is safe to read. */
> if (* ((UChar*)a) == 0) {
>
> The rest of the message is
>
> sched status:
> running_tid=1
>
> Thread 1: status = VgTs_Runnable
> ==26623== at 0x40007F2: (within /lib/ld-2.6.1.so)
> ==26623== by 0x46F2AA3: NtSetInformationFile (file.c:1621)
That line in wine is doing futimes() to set the timestamp on a
file descriptor, and I suspect your OS upgrade got you a new glibc
that uses the new utimensat() system call to implement futimes().
Our wrapper for that system call appears to be wrong as it doesn't
allow a null pointer for ARG2 which the kernel does seem to do - if
the filename is null then it takes the fd as the file to update
rather than the directory to resolve the filename relative to.
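
The kernel behaviour described above can be checked outside valgrind with a small sketch. It assumes a Linux system whose headers define SYS_utimensat and a struct timespec layout that matches what the kernel expects (true on typical 64-bit builds). The helper name set_fd_times_null_path is hypothetical and is not part of wine, glibc or valgrind. It calls the raw system call rather than the glibc utimensat() wrapper, because the wrapper may reject a null filename itself.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: set both atime and mtime of the file behind fd to
   `when` seconds by passing a NULL filename to the raw utimensat syscall.
   With a NULL filename the kernel treats fd as the file to update, not as
   a directory to resolve a name against.
   Returns 0 on success, -1 with errno set on failure. */
int set_fd_times_null_path(int fd, time_t when)
{
    struct timespec ts[2];

    ts[0].tv_sec = when;   /* access time */
    ts[0].tv_nsec = 0;
    ts[1] = ts[0];         /* modification time */

    return (int)syscall(SYS_utimensat, fd, NULL, ts, 0);
}
```

Opening a scratch file, calling the helper on its descriptor and then reading st_mtime back with fstat() should show the new timestamp, which is the case a syscall wrapper has to tolerate.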
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Julian S. <js...@ac...> - 2008-02-25 12:26:57
|
> Our wrapper for that system call appears to be wrong as it doesn't
> allow a null pointer for ARG2 which the kernel does seem to do - if
> the filename is null then it takes the fd as the file to update
> rather than the directory to resolve the filename relative to.
So then a correct fix is merely to change
PRE_MEM_RASCIIZ( "utimensat(filename)", ARG2 );
to
if (ARG2)
PRE_MEM_RASCIIZ( "utimensat(filename)", ARG2 );
in PRE(sys_utimensat) in syswrap-linux.c. Yes? Dan, can you try that?
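
The shape of that guard can be shown with a toy model. This is not valgrind code: check_readable_string stands in for whatever PRE_MEM_RASCIIZ does, and pre_utimensat_model and checks_so_far are hypothetical names used only for illustration. The point is just that a null filename skips the string check entirely.

```c
#include <stddef.h>

/* Counts how many times the stand-in string check was invoked. */
static int checks_performed = 0;

/* Hypothetical stand-in for a "this must be a readable NUL-terminated
   string" check. It only records that it was called. */
static void check_readable_string(const char *s)
{
    (void)s;
    checks_performed++;
}

/* Toy pre-syscall hook: validate the filename only when one was passed,
   since a null filename is a legal argument for this syscall. */
void pre_utimensat_model(const char *filename)
{
    if (filename != NULL)
        check_readable_string(filename);
}

/* Accessor so callers can observe how many checks ran. */
int checks_so_far(void)
{
    return checks_performed;
}
```

Calling pre_utimensat_model(NULL) leaves the counter untouched, while a call with a real string increments it once.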
J
|
|
From: Dan K. <da...@ke...> - 2008-02-25 13:10:20
|
On Mon, Feb 25, 2008 at 4:23 AM, Julian Seward <js...@ac...> wrote:
> So then a correct fix is merely to change
>
> PRE_MEM_RASCIIZ( "utimensat(filename)", ARG2 );
>
> to
>
> if (ARG2)
> PRE_MEM_RASCIIZ( "utimensat(filename)", ARG2 );
>
> in PRE(sys_utimensat) in syswrap-linux.c. Yes? Dan, can you try that?
My glibc evidently uses futimesat, not utimensat.
I tried the one you pointed at first, no joy,
but doing the same fix for futimesat fixed.
Resulting patch at
http://kegel.com/wine/valgrind/futimesat.patch
Thanks!
(And my repro script was in fact correct if not optimal.)
http://sources.redhat.com/ml/libc-hacker/2005-11/msg00015.html
and/or a moment's reflection
suggests that touch x is an easier testcase than the one I found.
http://www.ussg.iu.edu/hypermail/linux/kernel/0707.1/0103.html
suggests yet another syscall, futimensat, to watch
out for; doesn't seem to be mentioned in valgrind yet.
- Dan
|
|
From: Dan K. <da...@ke...> - 2008-02-25 13:15:56
|
On Mon, Feb 25, 2008 at 5:10 AM, Dan Kegel <da...@ke...> wrote:
> http://sources.redhat.com/ml/libc-hacker/2005-11/msg00015.html
> and/or a moment's reflection
> suggests that touch x is an easier testcase than the one I found.
Hrmph. Without the patch,
valgrind touch x
fails with a normal valgrind error, not a crash:
==6938== Syscall param utimensat(filename) points to unaddressable byte(s)
==6938== at 0x40007F2: (within /lib/ld-2.6.1.so)
==6938== by 0x410F4B6: futimesat (in /lib/tls/i686/cmov/libc-2.6.1.so)
==6938== by 0x804D60F: (within /bin/touch)
==6938== by 0x80497E9: (within /bin/touch)
==6938== by 0x405704F: (below main) (in /lib/tls/i686/cmov/libc-2.6.1.so)
==6938== Address 0x0 is not stack'd, malloc'd or (recently) free'd
So both need to be patched, but only the wine case is
debilitating for some reason?
|
|
From: Julian S. <js...@ac...> - 2008-02-25 14:03:43
|
On Monday 25 February 2008 14:16, Dan Kegel wrote:
> On Mon, Feb 25, 2008 at 5:10 AM, Dan Kegel <da...@ke...> wrote:
> > http://sources.redhat.com/ml/libc-hacker/2005-11/msg00015.html
> > and/or a moment's reflection
> > suggests that touch x is an easier testcase than the one I found.
>
> Hrmph. Without the patch,
> valgrind touch x
> fails with a normal valgrind error, not a crash:
> ==6938== Syscall param utimensat(filename) points to unaddressable byte(s)
> ==6938== at 0x40007F2: (within /lib/ld-2.6.1.so)
> ==6938== by 0x410F4B6: futimesat (in /lib/tls/i686/cmov/libc-2.6.1.so)
> ==6938== by 0x804D60F: (within /bin/touch)
> ==6938== by 0x80497E9: (within /bin/touch)
> ==6938== by 0x405704F: (below main) (in /lib/tls/i686/cmov/libc-2.6.1.so)
> ==6938== Address 0x0 is not stack'd, malloc'd or (recently) free'd
I'm confused. What does your PRE(sys_utimensat) routine now look like?
J
|
|
From: Dan K. <da...@ke...> - 2008-02-25 16:28:45
|
On Mon, Feb 25, 2008 at 5:59 AM, Julian Seward <js...@ac...> wrote:
> I'm confused. What does your PRE(sys_utimensat) routine now look like?
With the patch I sent you,
http://kegel.com/wine/valgrind/futimesat.patch
applied, it looks like this:
PRE(sys_utimensat)
{
   PRINT("sys_utimensat ( %d, %p(%s), %p )", ARG1,ARG2,ARG2,ARG3);
   PRE_REG_READ3(long, "utimensat",
                 int, dfd, char *, filename, struct timespec *, tvp);
   if (ARG2 != 0)
      PRE_MEM_RASCIIZ( "utimensat(filename)", ARG2 );
   if (ARG3 != 0)
      PRE_MEM_READ( "utimensat(tvp)", ARG3, sizeof(struct vki_timespec) );
}
and there are no errors from either wine or touch.
- Dan
|