|
From: Michael S. <ms...@xi...> - 2006-03-02 13:04:53
|
Hi all,
I'm seeing a _very_ strange problem here trying to use valgrind-SVN.
I'm on an x86 system running ubuntu dapper - the system installed
valgrind ("valgrind-3.1.0-Debian") starts fine.
However, if I try to start my locally installed version (SVN from
around half an hour ago, built with "./autogen.sh
--prefix=/home/msmith && make && make install" - however, this was
also happening with an older build I had sitting around in my PATH
from - probably - a few weeks ago), I get this:
msmith@freeze:~$ valgrind
Killed
Further investigation with the help of strace shows that valgrind
gets as far as starting the memcheck binary
/home/msmith/lib/valgrind/x86-linux/memcheck (if I use a different
tool, it starts a different binary, but the behaviour is identical),
then I get (at the end):
execve("/home/msmith/lib/valgrind/x86-linux/memcheck", ["valgrind"],
[/* 44 vars */]) = 0
+++ killed by SIGKILL +++
msmith@freeze:~$
If I try to directly run memcheck:
msmith@freeze:~$ strace /home/msmith/lib/valgrind/x86-linux/memcheck
execve("/home/msmith/lib/valgrind/x86-linux/memcheck",
["/home/msmith/lib/valgrind/x86-li"...], [/* 43 vars */]) = 0
+++ killed by SIGKILL +++
msmith@freeze:~$
If I start memcheck in gdb, set breakpoints on a few likely spots
(main, _start, _start_in_C), and run it, I never get to any of the
breakpoints, it looks like the SIGKILL arrives (from wherever - the
kernel?) before anything happens at all.
So, at this point I've exhausted my knowledge of how to debug this
sort of thing - I don't understand process startup in anywhere near
sufficient detail.
Has anyone else seen this? Any ideas at all about what is going on?
Oh, and finally, some versions of things that might be important:
gcc:
gcc (GCC) 4.0.3 20060212 (prerelease) (Ubuntu 4.0.2-9ubuntu1)
ld
GNU ld version 2.16.91 20060118 Debian GNU/Linux
kernel:
Linux version 2.6.15-16-386 (buildd@vernadsky) (gcc version 4.0.3
20060212 (prerelease) (Ubuntu 4.0.2-9ubuntu1)) #1 PREEMPT Mon Feb 20
16:38:26 UTC 2006
Thanks,
Mike
|
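The bare "Killed" from the shell and the "+++ killed by SIGKILL +++" from strace describe the same event: the child process was terminated by signal 9 rather than exiting on its own. A minimal Python sketch of how a wrapper script could tell the two cases apart is shown here. The helper names `describe_exit` and `run_and_describe` are made up for illustration, and the self-killing child is only a stand-in for the memcheck binary from the message above.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_and_describe(argv):
    """Run a command, discard its output, and describe how it ended."""
    proc = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return describe_exit(proc.returncode)


# Stand-in child (an assumption for the demo): a Python process that sends
# itself SIGKILL, mimicking a binary that is killed before it can run.
SELF_KILL = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
]

if __name__ == "__main__":
    print(run_and_describe(SELF_KILL))
```

Replacing `SELF_KILL` with the path of a real tool binary would report whether that binary is killed by a signal or exits normally; it does not say who sent the signal.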
|
From: Julian S. <js...@ac...> - 2006-03-02 13:13:42
|
> On Thursday 02 March 2006 13:04, Michael Smith wrote:
> I'm seeing a _very_ strange problem here trying to use valgrind-SVN.
That's a bit strange. However, none of the 8 or so overnight
builds reported any such problems last night or even in the last
month or so.
Are you sure you have a completely clean tree? The build system
isn't too hot at dependency tracking. What happens if you check out
a new tree and build that?
If that doesn't fix it, try running with -d -d, which shows lots of
startup detail. --trace-flags=10000000 is also worth adding.
J
|
|
From: Michael S. <ms...@xi...> - 2006-03-02 15:10:41
|
On 3/2/06, Julian Seward <js...@ac...> wrote:
>
> > On Thursday 02 March 2006 13:04, Michael Smith wrote:
> > I'm seeing a _very_ strange problem here trying to use valgrind-SVN.
>
> That's a bit strange. However, none of the 8 or so overnight
> builds reported any such problems last night or even in the last
> month or so.
Indeed, very strange!
>
> Are you sure you have a completely clean tree? The build system
> isn't too hot at dependency tracking. What happens if you check out
> a new tree and build that?
I had actually done a 'make clean' before building it, knowing that
sometimes problems like these are just due to dirty builds. However, I
just tried checking out a fresh tree, rebuilding, and get precisely
the same problem. Output shown below is from this fresh build.
>
> If that doesn't fix it, try running with -d -d, which shows lots of
> startup detail. --trace-flags=10000000 is also worth adding.
-d -d shows what I expected: things go fine until valgrind tries to
launch the tool binary, then the tool is immediately killed.
--trace-flags=10000000 produces no output at all (except the 'Killed'
message that I get when running 'valgrind' with no arguments at all).
With -d -d:
msmith@freeze:~$ valgrind -d -d
--28454:1:debuglog DebugLog system started by Stage 1, level 2 logging requested
--28454:1:launcher no tool requested, defaulting to 'memcheck'
--28454:1:launcher no client specified, defaulting platform to 'x86-linux'
--28454:1:launcher launching /home/msmith/lib/valgrind/x86-linux/memcheck
Killed
msmith@freeze:~$
If I start with "valgrind --tool=none" the same happens (though
obviously it says it's launching 'none' rather than memcheck). If I
specify a non-existent tool I get the correct error about the tool not
existing.
So, the problem appears (to me) to be that the kernel doesn't like the
tool binaries. This suggests that perhaps there's something wrong with
how it's linked - but my linker knowledge is far too poor to know what
to look at here.
Happy to try any suggestions.
Mike
|
|
From: Michael S. <ms...@xi...> - 2006-03-02 18:58:38
|
On 3/2/06, Julian Seward <js...@ac...> wrote:
>
> > On Thursday 02 March 2006 13:04, Michael Smith wrote:
> > I'm seeing a _very_ strange problem here trying to use valgrind-SVN.
>
> That's a bit strange. However, none of the 8 or so overnight
> builds reported any such problems last night or even in the last
> month or so.
I have a "fix" for my problem now. Why does this help? I have
absolutely no idea. I don't understand the startup stuff in valgrind
at all:
Index: configure.in
===================================================================
--- configure.in (revision 5707)
+++ configure.in (working copy)
@@ -123,7 +123,7 @@
i?86)
AC_MSG_RESULT([ok (${host_cpu})])
VG_ARCH="x86"
- valt_load_address_normal="0xb0000000"
+ valt_load_address_normal="0x70000000"
valt_load_address_inner="0xa0000000"
;;
I tried 0xa0000000, 0x90000000, and 0x80000000, which didn't work.
This value does.
Some kernel thing on this ubuntu kernel presumably doesn't like
loading the code that high, but as I said, I don't understand any of
this stuff properly.
Mike
|
|
From: Julian S. <js...@ac...> - 2006-03-02 19:06:17
|
> I have a "fix" for my problem now. Why does this help?
It changes where the tool asks to be loaded to be < 2G. Maybe you
have some strange address space layout (dictated by the kernel)
of 2G available and 2G for the kernel? What does 'cat /proc/self/maps'
(natively) show?
J
|
|
From: Tom H. <to...@co...> - 2006-03-02 19:25:15
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> > I have a "fix" for my problem now. Why does this help?
>
> It changes where the tool asks to be loaded to be < 2G. Maybe you
> have some strange address space layout (dictated by the kernel)
> of 2G available and 2G for the kernel? What does 'cat /proc/self/maps'
> (natively) show?
It does sound like he is using a 2+2 kernel yes, which is quite
unusual I think. A 3+1 split is traditional I think, and modern
systems often use a 4+0 split.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
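The explanation offered above is consistent with the values tried earlier in the thread: the default 0xb0000000 and the attempts at 0xa0000000, 0x90000000 and 0x80000000 all lie at or above the 2G mark, while 0x70000000 lies below it. The Python sketch below checks that arithmetic. It is a simplification under stated assumptions: it treats the user/kernel boundary as an exact multiple of 1 GiB, ignores the size of the tool image, and the names `USER_LIMIT`, `loadable` and `usable_addresses` are invented for the example.

```python
GIB = 1 << 30

# User-space size for each split named in the thread (user + kernel, in GiB).
# Assumption: the boundary sits at an exact multiple of 1 GiB.
USER_LIMIT = {"2+2": 2 * GIB, "3+1": 3 * GIB, "4+0": 4 * GIB}

# Load addresses from the thread: the configure.in default, the three
# values that did not work, and the one that did.
CANDIDATES = [0xB0000000, 0xA0000000, 0x90000000, 0x80000000, 0x70000000]


def loadable(load_address, split):
    """True if the requested load address lies below the user-space limit."""
    return load_address < USER_LIMIT[split]


def usable_addresses(split):
    """List the candidate load addresses that fit under the given split."""
    return [hex(a) for a in CANDIDATES if loadable(a, split)]


if __name__ == "__main__":
    for split in USER_LIMIT:
        print(split, usable_addresses(split))
```

Under a 2+2 split only 0x70000000 passes this check, whereas under a 3+1 split all five candidates pass, which matches the default working on other machines.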
|
From: Michael S. <ms...@xi...> - 2006-03-02 19:27:25
|
On 3/2/06, Julian Seward <js...@ac...> wrote:
>
> > I have a "fix" for my problem now. Why does this help?
>
> It changes where the tool asks to be loaded to be < 2G. Maybe you
> have some strange address space layout (dictated by the kernel)
> of 2G available and 2G for the kernel? What does 'cat /proc/self/maps'
> (natively) show?
>
> J
>
It's a standard ubuntu dapper kernel (how 'standard' this actually is
I don't know - but I haven't done anything funny with it)
08048000-0804c000 r-xp 00000000 08:02 4922006 /bin/cat
0804c000-0804d000 rw-p 00003000 08:02 4922006 /bin/cat
0804d000-0806e000 rw-p 0804d000 00:00 0 [heap]
77d33000-77d66000 r--p 00000000 08:02 5121458 /usr/lib/locale/en_AU.utf8/LC_CTYPE
77d66000-77e3d000 r--p 00000000 08:02 5121461 /usr/lib/locale/en_AU.utf8/LC_COLLATE
77e3d000-77e3e000 rw-p 77e3d000 00:00 0
77e3e000-77f67000 r-xp 00000000 08:02 346883 /lib/tls/i686/cmov/libc-2.3.6.so
77f67000-77f6a000 rw-p 00129000 08:02 346883 /lib/tls/i686/cmov/libc-2.3.6.so
77f6a000-77f6d000 rw-p 77f6a000 00:00 0
77f7a000-77f7b000 r--p 00000000 08:02 5121459 /usr/lib/locale/en_AU.utf8/LC_NUMERIC
77f7b000-77f7c000 r--p 00000000 08:02 2012091 /usr/lib/locale/en_AU.utf8/LC_TIME
77f7c000-77f7d000 r--p 00000000 08:02 2012253 /usr/lib/locale/en_AU.utf8/LC_MONETARY
77f7d000-77f7e000 r--p 00000000 08:02 5121464 /usr/lib/locale/en_AU.utf8/LC_MESSAGES/SYS_LC_MESSAGES
77f7e000-77f7f000 r--p 00000000 08:02 5121465 /usr/lib/locale/en_AU.utf8/LC_PAPER
77f7f000-77f80000 r--p 00000000 08:02 5121466 /usr/lib/locale/en_AU.utf8/LC_NAME
77f80000-77f81000 r--p 00000000 08:02 2012254 /usr/lib/locale/en_AU.utf8/LC_ADDRESS
77f81000-77f82000 r--p 00000000 08:02 2012255 /usr/lib/locale/en_AU.utf8/LC_TELEPHONE
77f82000-77f83000 r--p 00000000 08:02 5121469 /usr/lib/locale/en_AU.utf8/LC_MEASUREMENT
77f83000-77f84000 r--p 00000000 08:02 2012256 /usr/lib/locale/en_AU.utf8/LC_IDENTIFICATION
77f84000-77f87000 rw-p 77f84000 00:00 0
77f87000-77f9c000 r-xp 00000000 08:02 343871 /lib/ld-2.3.6.so
77f9c000-77f9d000 rw-p 00014000 08:02 343871 /lib/ld-2.3.6.so
7fa87000-7fa9c000 rw-p 7fa87000 00:00 0 [stack]
ffffe000-fffff000 ---p 00000000 00:00 0 [vdso]
Mike
|
|
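The line to look at in the listing above is the `[stack]` mapping, which ends at 0x7fa9c000, just under 2G. A rough Python sketch of reading that off a maps listing follows. It rests on a heuristic assumption that the initial stack sits near the top of user space on a 32-bit process; the sample text is an excerpt of the output posted above, and `stack_top` and `guess_split` are illustrative names, not part of any tool.

```python
GIB = 1 << 30

# Excerpt of the /proc/self/maps output posted in the thread.
SAMPLE_MAPS = """\
08048000-0804c000 r-xp 00000000 08:02 4922006 /bin/cat
0804d000-0806e000 rw-p 0804d000 00:00 0 [heap]
77f87000-77f9c000 r-xp 00000000 08:02 343871 /lib/ld-2.3.6.so
7fa87000-7fa9c000 rw-p 7fa87000 00:00 0 [stack]
ffffe000-fffff000 ---p 00000000 00:00 0 [vdso]
"""


def stack_top(maps_text):
    """Return the end address of the [stack] mapping, or None if absent."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[stack]":
            _start, end = fields[0].split("-")
            return int(end, 16)
    return None


def guess_split(top):
    """Guess the user+kernel split of a 32-bit address space.

    Heuristic: pick the smallest whole-GiB boundary at or above the
    top of the stack.
    """
    for user_gib in (1, 2, 3, 4):
        if top <= user_gib * GIB:
            return "%d+%d" % (user_gib, 4 - user_gib)
    raise ValueError("address does not fit in a 32-bit address space")


if __name__ == "__main__":
    top = stack_top(SAMPLE_MAPS)
    print(hex(top), guess_split(top))
```

The `[vdso]` page near the top of the 4G range is mapped by the kernel and is deliberately ignored here, since it says nothing about where ordinary user mappings may live.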
From: Julian S. <js...@ac...> - 2006-03-02 19:37:58
|
> 7fa87000-7fa9c000 rw-p 7fa87000 00:00 0 [stack]
> ffffe000-fffff000 ---p 00000000 00:00 0 [vdso]
That does indeed look like a 2+2 setup. As Tom says this is
really quite unusual since it constrains the available address
space for no obvious reason. Can you ask the Ubuntu kernel
people why they are not using a traditional 3+1 rig?
Your fix should work OK, I guess. It might be worth just
sending the result of valgrind -d -d --tool=none date so we
can check there's no bad interaction between V and anything
else, like the stack.
J
|
|
From: Michael S. <ms...@xi...> - 2006-03-03 11:57:47
|
On 3/2/06, Julian Seward <js...@ac...> wrote:
>
> > 7fa87000-7fa9c000 rw-p 7fa87000 00:00 0 [stack]
> > ffffe000-fffff000 ---p 00000000 00:00 0 [vdso]
>
> That does indeed look like a 2+2 setup. As Tom says this is
> really quite unusual since it constrains the available address
> space for no obvious reason. Can you ask the Ubuntu kernel
> people why they are not using a traditional 3+1 rig?
Ok, I'm trying to track down the appropriate person to ask. Might take
a day or two...
>
> Your fix should work OK, I guess. It might be worth just
> sending the result of valgrind -d -d --tool=none date so we
> can check there's no bad interaction between V and anything
> else, like the stack.
This seems to work fine. Output below. I also ran a reasonably large,
heavily threaded, app using this newly-built valgrind yesterday
without any problems (well, there were problems: the app kept
crashing. That's why I wanted to build valgrind! I've now fixed my
bug...).
Thanks Julian, your assistance is, yet again, appreciated.
Mike
msmith@freeze:~$ valgrind -d -d --tool=none date
--28401:1:debuglog DebugLog system started by Stage 1, level 2 logging requested
--28401:1:launcher tool 'none' requested
--28401:1:launcher selected platform 'x86-linux'
--28401:1:launcher launching /home/msmith/lib/valgrind/x86-linux/none
--28401:1:debuglog DebugLog system started by Stage 2 (main), level 2 logging requested
--28401:1:main Welcome to Valgrind version 3.2.0.SVN debug logging
--28401:1:main Checking current stack is plausible
--28401:1:main Checking initial stack was noted
--28401:1:main Starting the address space manager
--28401:2:aspacem sp_at_startup = 0x007F83AB20 (supplied)
--28401:2:aspacem minAddr = 0x0004000000 (computed)
--28401:2:aspacem maxAddr = 0x007F839FFF (computed)
--28401:2:aspacem cStart = 0x0004000000 (computed)
--28401:2:aspacem vStart = 0x0041C1D000 (computed)
--28401:2:aspacem suggested_clstack_top = 0x007E83AFFF (computed)
--28401:2:aspacem <<< SHOW_SEGMENTS: Initial layout (5 segments, 0 segnames)
--28401:2:aspacem 0: RSVN 0000000000-0003FFFFFF 64m ----- SmFixed
--28401:2:aspacem 1: 0004000000-0041C1CFFF 988m
--28401:2:aspacem 2: RSVN 0041C1D000-0041C1DFFF 4096 ----- SmFixed
--28401:2:aspacem 3: 0041C1E000-007F839FFF 988m
--28401:2:aspacem 4: RSVN 007F83A000-00FFFFFFFF 2055m ----- SmFixed
--28401:2:aspacem >>>
--28401:2:aspacem Reading /proc/self/maps
--28401:2:aspacem <<< SHOW_SEGMENTS: With contents of /proc/self/maps (12 segments, 1 segnames)
--28401:2:aspacem ( 0) /home/msmith/lib/valgrind/x86-linux/none
--28401:2:aspacem 0: RSVN 0000000000-0003FFFFFF 64m ----- SmFixed
--28401:2:aspacem 1: 0004000000-0041C1CFFF 988m
--28401:2:aspacem 2: RSVN 0041C1D000-0041C1DFFF 4096 ----- SmFixed
--28401:2:aspacem 3: 0041C1E000-006FFFFFFF 739m
--28401:2:aspacem 4: FILE 0070000000-007013DFFF 1302528 r-x-- d=0x802 i=3041841 o=0 (0)
--28401:2:aspacem 5: FILE 007013E000-007013EFFF 4096 rw--- d=0x802 i=3041841 o=1302528 (0)
--28401:2:aspacem 6: ANON 007013F000-007073FFFF 6295552 rw---
--28401:2:aspacem 7: 0070740000-007F826FFF 240m
--28401:2:aspacem 8: ANON 007F827000-007F83BFFF 86016 rw---
--28401:2:aspacem 9: RSVN 007F83C000-00FFFFDFFF 2055m ----- SmFixed
--28401:2:aspacem 10: ANON 00FFFFE000-00FFFFEFFF 4096 -----
--28401:2:aspacem 11: RSVN 00FFFFF000-00FFFFFFFF 4096 ----- SmFixed
--28401:2:aspacem >>>
--28401:1:main Address space manager is running
--28401:1:main Starting the dynamic memory manager
--28401:1:mallocfr newSuperblock at 0x41C1E000 (pszB 1048560) owner VALGRIND/tool
--28401:1:main Dynamic memory manager is running
--28401:1:main Getting stage1's name
--28401:1:main Get hardware capabilities ...
--28401:1:main ... arch = X86, hwcaps = x86-sse1-sse2
--28401:1:main Split up command line
--28401:1:main Preprocess command line opts
--28401:1:main Loading client
--28401:1:main Setup client env
--28401:2:main preload_string:
--28401:2:main "/home/msmith/lib/valgrind/x86-linux/vgpreload_core.so"
--28401:1:main Setup client stack
--28401:2:main Client info: initial_IP=0x4000790 initial_SP=0x7E83A210 initial_TOC=0x0 brk_base=0x8054000
--28401:1:main Setup client data (brk) segment
--28401:1:main Setup file descriptors
--28401:1:main Create fake /proc/<pid>/cmdline
--28401:1:main Initialise the tool part 1 (pre_clo_init)
--28401:1:main Print help and quit, if requested
--28401:1:main Process Valgrind's command line options, setup logging
--28401:1:main Print the preamble...
==28401== Nulgrind, a binary JIT-compiler.
==28401== Copyright (C) 2002-2005, and GNU GPL'd, by Nicholas Nethercote.
==28401== Using LibVEX rev 1404, a library for dynamic binary translation.
==28401== Copyright (C) 2004-2005, and GNU GPL'd, by OpenWorks LLP.
==28401== Using valgrind-3.2.0.SVN, a dynamic binary instrumentation framework.
==28401== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
==28401== For more details, rerun with: -v
==28401==
--28401:1:main ...finished the preamble
--28401:1:main Initialise the tool part 2 (post_clo_init)
--28401:1:main Initialise TT/TC
--28401:2:transtab cache: 8 sectors of 5870592 bytes each = 46964736 total
--28401:2:transtab table: 524168 total entries, max occupancy 419328 (80%)
--28401:1:main Initialise redirects
--28401:1:mallocfr newSuperblock at 0x41D99000 (pszB 1048560) owner VALGRIND/symtab
--28401:1:main Load initial debug info
--28401:1:mallocfr newSuperblock at 0x41E99000 (pszB 1048560) owner VALGRIND/symtab
--28401:1:mallocfr newSuperblock at 0x41F99000 (pszB 1048560) owner VALGRIND/symtab
--28401:1:redir transfer ownership V -> C of 0x7000D000 .. 0x7000DFFF
--28401:1:main Tell tool about initial permissions
--28401:2:main tell tool about 0004000000-0004014FFF r-x
--28401:2:main tell tool about 0004015000-0004015FFF rw-
--28401:2:main tell tool about 0008048000-0008052FFF r-x
--28401:2:main tell tool about 0008053000-0008053FFF rw-
--28401:2:main tell tool about 0008054000-0008054FFF rwx
--28401:2:main tell tool about 007000D000-007000DFFF r-x
--28401:2:main tell tool about 007E83A000-007E83AFFF rwx
--28401:2:main mark stack inaccessible 007E83A000-007E83A20F
--28401:1:main Initialise scheduler
--28401:1:main Initialise thread 1's state
--28401:1:main Initialise signal management
--28401:1:mallocfr newSuperblock at 0x42099000 (pszB 1048560) owner VALGRIND/core
--28401:2:stacks register 0x7E83A000-0x7E83AFFF as stack 0
--28401:1:main
--28401:1:main
--28401:1:aspacem <<< SHOW_SEGMENTS: Memory layout at client startup (25 segments, 4 segnames)
--28401:1:aspacem ( 0) /home/msmith/lib/valgrind/x86-linux/none
--28401:1:aspacem ( 1) /bin/date
--28401:1:aspacem ( 2) /lib/ld-2.3.6.so
--28401:1:aspacem 0: RSVN 0000000000-0003FFFFFF 64m ----- SmFixed
--28401:1:aspacem 1: file 0004000000-0004014FFF 86016 r-x-- d=0x802 i=343871 o=0 (2)
--28401:1:aspacem 2: file 0004015000-0004015FFF 4096 rw--- d=0x802 i=343871 o=81920 (2)
--28401:1:aspacem 3: 0004016000-0008047FFF 64m
--28401:1:aspacem 4: file 0008048000-0008052FFF 45056 r-x-- d=0x802 i=4922087 o=0 (1)
--28401:1:aspacem 5: file 0008053000-0008053FFF 4096 rw--- d=0x802 i=4922087 o=40960 (1)
--28401:1:aspacem 6: anon 0008054000-0008054FFF 4096 rwx--
--28401:1:aspacem 7: RSVN 0008055000-0008853FFF 8384512 ----- SmLower
--28401:1:aspacem 8: 0008854000-0041C1CFFF 915m
--28401:1:aspacem 9: RSVN 0041C1D000-0041C1DFFF 4096 ----- SmFixed
--28401:1:aspacem 10: ANON 0041C1E000-0042198FFF 5746688 rwx--
--28401:1:aspacem 11: 0042199000-006FFFFFFF 734m
--28401:1:aspacem 12: FILE 0070000000-007000CFFF 53248 r-x-- d=0x802 i=3041841 o=0 (0)
--28401:1:aspacem 13: file 007000D000-007000DFFF 4096 r-x-- d=0x802 i=3041841 o=53248 (0)
--28401:1:aspacem 14: FILE 007000E000-007013DFFF 1245184 r-x-- d=0x802 i=3041841 o=57344 (0)
--28401:1:aspacem 15: FILE 007013E000-007013EFFF 4096 rw--- d=0x802 i=3041841 o=1302528 (0)
--28401:1:aspacem 16: ANON 007013F000-007073FFFF 6295552 rw---
--28401:1:aspacem 17: 0070740000-007E03AFFF 216m
--28401:1:aspacem 18: RSVN 007E03B000-007E839FFF 8384512 ----- SmUpper
--28401:1:aspacem 19: anon 007E83A000-007E83AFFF 4096 rwx--
--28401:1:aspacem 20: 007E83B000-007F826FFF 15m
--28401:1:aspacem 21: ANON 007F827000-007F83BFFF 86016 rw---
--28401:1:aspacem 22: RSVN 007F83C000-00FFFFDFFF 2055m ----- SmFixed
--28401:1:aspacem 23: ANON 00FFFFE000-00FFFFEFFF 4096 -----
--28401:1:aspacem 24: RSVN 00FFFFF000-00FFFFFFFF 4096 ----- SmFixed
--28401:1:aspacem >>>
--28401:1:main
--28401:1:main
--28401:1:main Running thread 1
--28401:1:syswrap- entering VG_(main_thread_wrapper_NORETURN)
--28401:1:aspacem allocated thread stack at 0x42199000 size 81920
--28401:1:syswrap- run_a_thread_NORETURN(tid=1): pre-thread_wrapper
--28401:1:syswrap- thread_wrapper(tid=1): entry
--28401:1:transtab allocate sector 0
--28401:1:mallocfr newSuperblock at 0x42C07000 (pszB 65520) owner VALGRIND/ttaux
--28401:1:signals extending a stack base 0x7E83A000 down by 4096
--28401:2:stacks change stack 0 from 0x7E83A000-0x7E83AFFF to 0x7E839000-0x7E83AFFF
Fri Mar 3 12:55:00 CET 2006
==28401==
msmith@freeze:~$
|
|
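In the aspacem listings of the log above, the size column follows directly from the address range on each line, and the large reserved segment at the top (2055m, from just above the stack to 0xFFFFFFFF) is the part of the 4G range that this process cannot use. The Python sketch below recomputes a few of those figures. It assumes that the "m" sizes are whole mebibytes rounded down, which reproduces the numbers shown but is inferred from the log rather than taken from Valgrind's source; `segment_bytes` and `whole_mib` are names made up for the example.

```python
MIB = 1 << 20


def segment_bytes(start, end):
    """Size in bytes of a segment given its inclusive start and end."""
    return end - start + 1


def whole_mib(start, end):
    """Segment size in whole MiB, rounded down (assumed log convention)."""
    return segment_bytes(start, end) // MIB


# (start, end, size shown in the log) for segments of the initial layout.
INITIAL_LAYOUT = [
    (0x0000000000, 0x0003FFFFFF, "64m"),
    (0x0004000000, 0x0041C1CFFF, "988m"),
    (0x0041C1D000, 0x0041C1DFFF, "4096"),
    (0x007F83A000, 0x00FFFFFFFF, "2055m"),
]

if __name__ == "__main__":
    for start, end, shown in INITIAL_LAYOUT:
        print(
            "%010X-%010X  %10d bytes  %5d MiB  (log shows %s)"
            % (start, end, segment_bytes(start, end), whole_mib(start, end), shown)
        )
```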
From: Julian S. <js...@ac...> - 2006-03-03 14:38:04
|
> msmith@freeze:~$ valgrind -d -d --tool=none date
Looks fine to me. If you discover why you've got a 2+2 setup
I'd be interested to know.
J
|
|
From: Michael S. <ms...@xi...> - 2006-03-10 15:53:04
|
On 3/3/06, Julian Seward <js...@ac...> wrote:
>
> > msmith@freeze:~$ valgrind -d -d --tool=none date
>
> Looks fine to me. If you discover why you've got a 2+2 setup
> I'd be interested to know.
I didn't quite manage to get a clear answer to this apart from some
vague mumbling about suspend on laptops, but ubuntu dapper has now
apparently returned to a 3/1 split as of the most recent kernel
update. I'll check that it works correctly when I next reboot.
Thanks for your help.
Mike
|