|
From: Alex B. <ker...@be...> - 2006-09-06 16:54:20
|
Hi,

I'm trying to get to the bottom of a crash I'm seeing while valgrinding:

==15125== Invalid read of size 8
==15125== at 0x3B731073FE: do_lookup_x (in /lib64/ld-2.3.4.so)
==15125== by 0x3B7310782D: _dl_lookup_symbol_x (in /lib64/ld-2.3.4.so)
==15125== by 0x3B7310A609: fixup (in /lib64/ld-2.3.4.so)
==15125== by 0x3B7310A4D1: _dl_runtime_resolve (in /lib64/ld-2.3.4.so)
==15125== by 0x70258ED7: moveFd(int, bool)
==15125== by 0x700CEFFA: Fuse::openDebugFile()
==15125== by 0x700CF1EB: Fuse::getDebugOutputFD()
==15125== by 0x700C5C40: ErrorReporter::reportError(SeverityLevel,ErrorType, char const*, ...)
==15125== by 0x702B08C4: getLock(unsigned*, ThreadContext*)

Unfortunately this doesn't get hit outside of valgrind. I suspect that
the act of valgrinding makes us hit a deadlock check in getLock which
sends us down the reportError case. However I've been unable to attach
gdb to investigate. I have to Ctrl-C valgrind, at which point I get:

==15125== Invalid read of size 1
==15125== at 0x40016C7: _vgnU_freeres (vg_preloaded.c:56)
==15125== Address 0xFFFFFFFFFFFFFFFC is not stack'd, malloc'd or (recently) free'd
==15125==
==15125== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==15125== Access not within mapped region at address 0xFFFFFFFFFFFFFFFC
==15125== at 0x40016C7: _vgnU_freeres (vg_preloaded.c:56)

And then wait a while as valgrind calculates all the memory which will
obviously be lost due to the early exit. It would be useful to see the
data that reportError is about to spew. Unfortunately the vgcore left
over makes even less sense:

17:49 alexjb@strada [~] >gdb $APP_PATH vgcore.15436
GNU gdb
Core was generated by `'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000000040016c7 in ?? ()
(gdb) info threads
* 1 process 15436 0x00000000040016c7 in ?? ()
(gdb) frame 0
#0 0x00000000040016c7 in ?? ()
(gdb) bt
#0 0x00000000040016c7 in ?? ()
#1 0x0000000000000000 in ?? ()
(gdb)

So I think I have a bug in my code; unfortunately I'm not sure if it has
exposed a bug in valgrind as well. It seems odd that I'm seeing a bad
read in the linker code.

Any ideas?

--
Alex, homepage: http://www.bennee.com/~alex/
Ask not for whom the tolls.
|
|
From: Alex B. <al...@tr...> - 2006-09-06 17:28:37
|
On Wed, 2006-09-06 at 17:54 +0100, Alex Bennee wrote:
> So I think I have a bug in my code, unfortunately I'm not sure if its
> exposed a bug in valgrind as well. It seems odd that I'm seeing a bad
> read in the linker code.

Further to my previous comments I think it may well be a valgrind bug. I
instrumented my code with a few printfs and I can see where the lock
value gets written and what gets read back.

I also added a client side check to the getPid() function to see if it
was getting trampled. The output was something like:

setPid tc=0x424e170 pid=15979
getLock tc=0x424e170, pid=15979
getLock tc=0x424e170, pid=15979
getLock tc=0x424e170, pid=15979
getLock tc=0x424e170, pid=15979
getLock tc=0x424e170, pid=15979
getLock tc=0x424e170, pid=15979
getLock tc=0x424e170, pid=15979
getLock tc=0x424e170, pid=15979
getLock tc=0x424e170, pid=15979
==15979== Warning: set address range perms: large range 198787072 (defined)
getLock tc=0x424e170, pid=0

Before locking up. No client requests fired.

The "set address range perms" is interestingly placed. What does it mean?

>
> Any ideas?
>
> _______________________________________________
> Valgrind-users mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-users

--
Alex Bennee - al...@tr...
Your job is being a professor and researcher: That's one hell of a good
excuse for some of the brain-damages of minix.
(Linus Torvalds to Andrew Tanenbaum)
|
|
From: Tom H. <to...@co...> - 2006-09-06 17:49:27
|
In message <115...@ok...>
Alex Bennee <al...@tr...> wrote:
> I also added a client side check to the getPid() function to see if it
> was getting trampled. The output was something like:
>
> setPid tc=0x424e170 pid=15979
> getLock tc=0x424e170, pid=15979
> getLock tc=0x424e170, pid=15979
> getLock tc=0x424e170, pid=15979
> getLock tc=0x424e170, pid=15979
> getLock tc=0x424e170, pid=15979
> getLock tc=0x424e170, pid=15979
> getLock tc=0x424e170, pid=15979
> getLock tc=0x424e170, pid=15979
> getLock tc=0x424e170, pid=15979
> ==15979== Warning: set address range perms: large range 198787072
> (defined)
> getLock tc=0x424e170, pid=0
I assume you mean getpid() and not some special getPid() function of
your own? There is nothing special about the getpid system call under
valgrind so it is unlikely to be a valgrind bug.
Try adding --trace-syscalls=yes and see if the getpid system call is
really being called at that point.
> The "set address range perms" is interestingly placed. Whats it mean?
It means something has caused 190Mb of memory to be marked as
defined in one fell swoop.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Tom H. <to...@co...> - 2006-09-06 17:51:16
|
In message <115...@ok...>
Alex Bennee <ker...@be...> wrote:
> I have to Ctrl-C valgrind at which point I get:
>
> ==15125== Invalid read of size 1
> ==15125== at 0x40016C7: _vgnU_freeres (vg_preloaded.c:56)
> ==15125== Address 0xFFFFFFFFFFFFFFFC is not stack'd, malloc'd or
> (recently) free'd
> ==15125==
> ==15125== Process terminating with default action of signal 11
> (SIGSEGV): dumping core
> ==15125== Access not within mapped region at address 0xFFFFFFFFFFFFFFFC
> ==15125== at 0x40016C7: _vgnU_freeres (vg_preloaded.c:56)
I'm not sure this is anything to worry about - it has crashed
trying to run the glibc cleanup code but if glibc has already
corrupted its state then it's not entirely surprising if that
fails.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Alex B. <ker...@be...> - 2006-09-07 09:31:51
|
On Wed, 2006-09-06 at 18:49 +0100, Tom Hughes wrote:
> In message <115...@ok...>
> Alex Bennee <al...@tr...> wrote:
>
> > I also added a client side check to the getPid() function to see if it
> > was getting trampled. The output was something like:
> >
> > setPid tc=0x424e170 pid=15979
> > getLock tc=0x424e170, pid=15979
> > getLock tc=0x424e170, pid=15979
> > getLock tc=0x424e170, pid=15979
> > getLock tc=0x424e170, pid=15979
> > getLock tc=0x424e170, pid=15979
> > getLock tc=0x424e170, pid=15979
> > getLock tc=0x424e170, pid=15979
> > getLock tc=0x424e170, pid=15979
> > getLock tc=0x424e170, pid=15979
> > ==15979== Warning: set address range perms: large range 198787072
> > (defined)
> > getLock tc=0x424e170, pid=0
>
> I assume you mean getpid() and not some special getPid() function of
> your own? There is nothing special about the getpid system call under
> valgrind so it is unlikely to be a valgrind bug.

Sorry, the pid mentioned in getLock is a value stored in our task
structure. It is only ever set in the setPid() function and after that
is a read-only value.

One option could be that the variable is being overwritten by an access
to some of the structures before it in our tc class. I'm guessing that,
as Valgrind has no knowledge of the internal structure of the class, it
couldn't tell if being overwritten by a same-sized write was a bad
thing.

I had a quick look at the client request mechanism and I couldn't see
if there is a way to write-protect an area of memory once you know it's
been set and shouldn't change. I shall try adding some overflow
structures in and seeing if that changes anything.

>
> Try adding --trace-syscalls=yes and see if the getpid system call is
> really being called at that point.
>
> > The "set address range perms" is interestingly placed. Whats it mean?
>
> It means something has caused 190Mb of memory to be marked as
> defined in one fell swoop.
>
> Tom

--
Alex, homepage: http://www.bennee.com/~alex/
The hearing ear is always found close to the speaking tongue, a custom
whereof the memory of man runneth not howsomever to the contrary, nohow.
|
|
From: Alex B. <ker...@be...> - 2006-09-07 14:05:17
|
On Thu, 2006-09-07 at 10:31 +0100, Alex Bennee wrote:
> On Wed, 2006-09-06 at 18:49 +0100, Tom Hughes wrote:
> > In message <115...@ok...>
> > Alex Bennee <al...@tr...> wrote:
> Sorry the pid mentioned in getLock is a value stored in our task
> structure. It is only ever set in the setPid(0 function and after that
> is a read only value.

Additionally, running the program natively with the instrumentation
never clobbers the pid variable to 0. I think this points more towards a
bug in Valgrind.

Is there a way to log the loads and stores valgrind makes?

--
Alex, homepage: http://www.bennee.com/~alex/
Conversation, n.: A vocal competition in which the one who is catching
his breath is called the listener.
|
|
From: Tom H. <to...@co...> - 2006-09-07 14:10:04
|
In message <115...@ok...>
Alex Bennee <ker...@be...> wrote:
> On Thu, 2006-09-07 at 10:31 +0100, Alex Bennee wrote:
>> On Wed, 2006-09-06 at 18:49 +0100, Tom Hughes wrote:
>> > In message <115...@ok...>
>> > Alex Bennee <al...@tr...> wrote:
>> Sorry the pid mentioned in getLock is a value stored in our task
>> structure. It is only ever set in the setPid(0 function and after that
>> is a read only value.
>
> Additionally running the program native with the instrumentation never
> clobbers the pid variable to 0. I think this points more towards a a bug
> with Valgrind.
I think it's very unlikely. I can't think of a single case of such a
bug in valgrind where it was wrongly writing to client memory.
Or are you suggesting a translation error so that valgrind is
producing incorrect code? Those are pretty rare as well...
Far more likely is that running under valgrind has changed the
layout of memory in your program so that an existing problem in
your program is manifesting itself by overwriting something more
important.
> Is there a way to log the loads and stores valgrind makes?
No. I'm not even entirely sure what loads and stores you are talking
about, but the answer is no anyway.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Alex B. <ker...@be...> - 2006-09-07 14:26:52
|
On Thu, 2006-09-07 at 15:09 +0100, Tom Hughes wrote:
> In message <115...@ok...>
> Alex Bennee <ker...@be...> wrote:
> > Additionally running the program native with the instrumentation never
> > clobbers the pid variable to 0. I think this points more towards a a bug
> > with Valgrind.
>
> I think it's very unlikely. I can't think of a single case of such a
> bug in valgrind where it was wrongly writing to client memory.
>
> Or are you suggesting a translation error so that valgrind is
> producing incorrect code? Those are pretty rare as well...
>
> Far more likely is that running under valgrind has changed the
> layout of memory in your program so that an existing problem in
> your program is manifesting itself by overwriting something more
> important.

Could be, I guess. But that still leads to the initial problem of how to
identify the thing that is clobbering the variable. The value itself is
stored in a uint32_t so I tried adding a uint16_t in front of it to see
if it was getting lucky doing a 32 bit write into that address. However
nothing came up from that. Setting the variable to a 64 bit one didn't
pick up a rogue 32 bit write either (if valgrind would pick that up?).

>
> > Is there a way to log the loads and stores valgrind makes?
>
> No. I'm not even entirely sure what loads and stores you are talking
> about, but the answer is no anyway.

Well, something is writing the memory. As we know the address of the
memory in question, is there a way to hook into valgrind and ask it "let
me know the 2nd time a value is written to this address"?

>
> Tom
>

--
Alex, homepage: http://www.bennee.com/~alex/
WARNING TO ALL PERSONNEL: Firings will continue until morale improves.
|
|
From: Tom H. <to...@co...> - 2006-09-07 14:32:14
|
In message <115...@ok...>
Alex Bennee <ker...@be...> wrote:
>> > Is there a way to log the loads and stores valgrind makes?
>>
>> No. I'm not even entirely sure what loads and stores you are talking
>> about, but the answer is no anyway.
>
> Well something is writing the memory. As we known the address of the
> memory in question is there a way to hook into valgrind and ask it "let
> me know the 2nd time a value is written to this address"?
There's the old watchpoint patch somewhere but I have no idea if
it still applies... That will only catch writes made by your
code though.
There is no way to spot writes made by valgrind itself other than
by running it under gdb and using the normal gdb watchpoint support.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Alex B. <ker...@be...> - 2006-09-07 15:00:52
|
On Thu, 2006-09-07 at 15:32 +0100, Tom Hughes wrote:
> In message <115...@ok...>
> Alex Bennee <ker...@be...> wrote:
>
> >> > Is there a way to log the loads and stores valgrind makes?
> >>
> >> No. I'm not even entirely sure what loads and stores you are talking
> >> about, but the answer is no anyway.
> >
> > Well something is writing the memory. As we known the address of the
> > memory in question is there a way to hook into valgrind and ask it "let
> > me know the 2nd time a value is written to this address"?
>
> There's the old watchpoint patch somewhere but I have no idea if
> it still applies... That will only catches writes made by your
> code though.

I'll give that a try and see if I can get it to apply. How come it never
got merged in the first place? It does seem like quite a useful
facility.

> There is no way to spot writes made by valgrind itself other than
> by running it under gdb and using the normal gdb watchpoint support.

(gdb) r --smc-check=all dynamite -z config applu < applu.test.in
Starting program: /usr/local/bin/valgrind --smc-check=all dynamite -z config applu < applu.test.in
==6286== Memcheck, a memory error detector.
==6286== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==6286== Using LibVEX rev 1426, a library for dynamic binary translation.
==6286== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==6286== Using valgrind-3.3.0.SVN, a dynamic binary instrumentation framework.
==6286== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==6286== For more details, rerun with: -v
==6286==

Program received signal SIGSEGV, Segmentation fault.
0x00000004033d2f66 in ?? ()
(gdb) x/i $pc
0x4033d2f66: mov %ebx,0x0(%r12)
(gdb) p/x $r12
$1 = 0x7feffef7c
(gdb) x/w $1
0x7feffef7c: Cannot access memory at address 0x7feffef7c
(gdb)

Hmmm, not a good sign that it doesn't get as far in gdb as running from
the command line. Now our app does move around in memory; I wonder if
the combined memory demands of Valgrind, our app and gdb are just too
much to handle?

> Tom
>

--
Alex, homepage: http://www.bennee.com/~alex/
Kansas state law requires pedestrians crossing the highways at night to
wear tail lights.
|
|
From: Tom H. <to...@co...> - 2006-09-07 15:16:40
|
In message <115...@ok...>
Alex Bennee <ker...@be...> wrote:
>> There is no way to spot writes made by valgrind itself other than
>> by running it under gdb and using the normal gdb watchpoint support.
>
> (gdb) r --smc-check=all dynamite -z config applu < applu.test.in
> Starting program: /usr/local/bin/valgrind --smc-check=all dynamite -z
> config applu < applu.test.in
> ==6286== Memcheck, a memory error detector.
> ==6286== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
> ==6286== Using LibVEX rev 1426, a library for dynamic binary
> translation.
> ==6286== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
> ==6286== Using valgrind-3.3.0.SVN, a dynamic binary instrumentation
> framework.
> ==6286== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
> ==6286== For more details, rerun with: -v
> ==6286==
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000004033d2f66 in ?? ()
> (gdb) x/i $pc
> 0x4033d2f66: mov %ebx,0x0(%r12)
> (gdb) p/x $r12
> $1 = 0x7feffef7c
> (gdb) x/w $1
> 0x7feffef7c: Cannot access memory at address 0x7feffef7c
> (gdb)
>
> Hmmm not a good sign that it doesn't get as far in gdb as running from
> the command line. Now our app does move around in memory, I wonder if
> the combination of Valgrind, Our app and gdb's memory demands are just
> too much to handle?
It is normal for some SEGVs to occur under valgrind - they are caused
by the client program's stack growing beyond its current limit. They
are caught and fixed up by valgrind though.
See README_DEVELOPERS for instructions on running under gdb.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Olly B. <ol...@su...> - 2006-09-07 15:55:23
|
On 2006-09-07, Alex Bennee <ker...@be...> wrote:
> Could be I guess. But that still leads to the initial problem of how do
> identify the thing that is clobbering the variable. The value itself is
> stored in a uint32_t so I tried adding a uint16_t in front of it to see
> if it was getting lucky doing a 32 bit write into that address. However
> nothing came up from that. Setting the variable to a 64 bit one didn't
> pick up a rouge 32 bit write either (if valgrind would pick that up?).
Bear in mind that valgrind works at a lower level than C, so it is
unlikely to be able to trap "wrong-sized" writes in the way you seem to
hope.
For an example of why this can't really be done, the C compiler could
optimise the following code so that f actually performs a single 32-bit
write:
    struct {
        short a, b;
    } foo;

    void f(short x, short y) {
        foo.a = x;
        foo.b = y;
    }
(although for this particular example, GCC doesn't seem to actually do
this currently).
You could try inserting the padding into your structure and marking it
"no access" using a valgrind client request. Or just run under gdb and
stick a watchpoint on the struct member which gets changed, as I think
Tom suggested elsewhere.
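
A minimal sketch of the padding idea, under assumptions: the struct
layout, field names and guard size below are hypothetical stand-ins for
the real task structure, and VALGRIND_MAKE_MEM_NOACCESS is the memcheck.h
client request in Valgrind 3.2 and later (a no-op fallback is defined so
the sketch still builds without the Valgrind headers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
/* Fallback: without the Valgrind headers the request does nothing. */
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len) ((void)(addr), (void)(len), 0)
#endif

/* Hypothetical stand-in for the task structure discussed in the thread. */
struct task_context {
    uint32_t lock;
    uint8_t  guard_before[16];  /* padding in front of pid */
    uint32_t pid;
    uint8_t  guard_after[16];   /* padding behind pid */
};

/* Store the pid once, then tell memcheck the padding is off limits. */
static void set_pid(struct task_context *tc, uint32_t pid)
{
    assert(tc != NULL);
    tc->pid = pid;
    (void)VALGRIND_MAKE_MEM_NOACCESS(tc->guard_before, sizeof tc->guard_before);
    (void)VALGRIND_MAKE_MEM_NOACCESS(tc->guard_after, sizeof tc->guard_after);
}

static uint32_t get_pid(const struct task_context *tc)
{
    assert(tc != NULL);
    return tc->pid;
}
```

Under memcheck, any later read or write that strays into the padding
should be reported as an invalid access. A store aimed directly at pid
is not caught this way, and neither is anything that replaces the whole
page underneath the structure.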
Cheers,
Olly
|
|
From: Alex B. <ker...@be...> - 2006-09-07 17:17:23
|
On Thu, 2006-09-07 at 15:09 +0100, Tom Hughes wrote:
> In message <115...@ok...>
> Alex Bennee <ker...@be...> wrote:
>
> <snip>
> > Additionally running the program native with the instrumentation never
> > clobbers the pid variable to 0. I think this points more towards a a bug
> > with Valgrind.
> <snip>
> Far more likely is that running under valgrind has changed the
> layout of memory in your program so that an existing problem in
> your program is manifesting itself by overwriting something more
> important.

Doh!

I should have looked more closely at the syscalls:

setPid 15817 (@0x424f5f8)
..
..
getLock tc=0x424e208, pid=15817
SYSCALL[15817,1]( 1) ... [async] --> Success(0x20)
SYSCALL[15817,1]( 9) sys_mmap ( 0x6C000, 198787072, 7, 50, -1, 0 )==15817== Warning: set address range perms: large range 198787072 (defined) --> [pre-success] Success(0x6C000)
SYSCALL[15817,1]( 9) sys_mmap ( 0x2A96000000, 4096, 1, 34, -1, 0 ) --> [pre-success] Success(0xBE00000)
SYSCALL[15817,1]( 11) sys_munmap ( 0xBE00000, 4096 )[sync] --> Success(0x0)
SYSCALL[15817,1]( 1) sys_write ( 2, 0x7FEFFC070, 28 ) --> [async] ...
getLock tc=0x424e208, pid=0

We let our subject application mmap right over our heap area. I tried
messing around with valgrind's configure to leave enough space for us,
however:

cat /proc/15817/maps
<snip>
6ff00000-70377000 r-xp 00000000 00:18 1552482 dynamite
70477000-7048a000 rw-p 00477000 00:18 1552482 dynamite
7048a000-704d3000 rw-p 7048a000 00:00 0
704d3000-704d4000 rwxp 704d3000 00:00 0
78000000-78169000 r-xp 00000000 08:03 35037 /usr/local/lib/valgrind/amd64-linux/memcheck
78269000-7826a000 rw-p 00169000 08:03 35037 /usr/local/lib/valgrind/amd64-linux/memcheck
<snip>

It looks as though our binary is where it wants to be but the heap
itself has moved down into memory. There are two options I can think of:

1. Stop Valgrind relocating the heap
2. Tweak our program to discover the heap and then disallow maps in the
region.

We currently assume that the heap follows straight after bss but I don't
think it does when we are being Valgrinded. Is there any way to force
Valgrind to use a particular location for the heap?

--
Alex, homepage: http://www.bennee.com/~alex/
Trying to be happy is like trying to build a machine for which the only
specification is that it should run noiselessly.
|
|
From: Tom H. <to...@co...> - 2006-09-08 18:58:27
|
In message <115...@ok...>
Alex Bennee <ker...@be...> wrote:
> I should of looked more closely at the syscalls
>
> setPid 15817 (@0x424f5f8)
> ..
> ..
> getLock tc=0x424e208, pid=15817
> SYSCALL[15817,1]( 1) ... [async] --> Success(0x20)
> SYSCALL[15817,1]( 9) sys_mmap ( 0x6C000, 198787072, 7, 50, -1, 0 )==15817== Warning: set address range perms: large range 198787072 (defined) --> [pre-success] Success(0x6C000)
> SYSCALL[15817,1]( 9) sys_mmap ( 0x2A96000000, 4096, 1, 34, -1, 0 ) --> [pre-success] Success(0xBE00000)
> SYSCALL[15817,1]( 11) sys_munmap ( 0xBE00000, 4096 )[sync] --> Success(0x0)
> SYSCALL[15817,1]( 1) sys_write ( 2, 0x7FEFFC070, 28 ) --> [async] ...
> getLock tc=0x424e208, pid=0
>
> We let our subject application mmap right over our heap area.
Using a fixed mmap is a fairly dangerous thing to do...
You are mmapping 0x6c000 to 0xbe00000 as a fixed mapping which makes me
surprised that anything ever works as even without valgrind that is
likely to wipe out the executable and the heap - this is the map for
a basic process on one of my 64 bit machines without valgrind:
00400000-00405000 r-xp 00000000 fd:00 18350129 /bin/cat
00504000-00506000 rw-p 00004000 fd:00 18350129 /bin/cat
00506000-00527000 rw-p 00506000 00:00 0 [heap]
347f100000-347f11a000 r-xp 00000000 fd:00 6144028 /lib64/ld-2.4.so
347f219000-347f21a000 r--p 00019000 fd:00 6144028 /lib64/ld-2.4.so
347f21a000-347f21b000 rw-p 0001a000 fd:00 6144028 /lib64/ld-2.4.so
347f300000-347f43f000 r-xp 00000000 fd:00 6144043 /lib64/libc-2.4.so
347f43f000-347f53f000 ---p 0013f000 fd:00 6144043 /lib64/libc-2.4.so
347f53f000-347f543000 r--p 0013f000 fd:00 6144043 /lib64/libc-2.4.so
347f543000-347f544000 rw-p 00143000 fd:00 6144043 /lib64/libc-2.4.so
347f544000-347f549000 rw-p 347f544000 00:00 0
2aaaaaaab000-2aaaaaaac000 rw-p 2aaaaaaab000 00:00 0
2aaaaaad5000-2aaaaaad7000 rw-p 2aaaaaad5000 00:00 0
7fff1e76f000-7fff1e785000 rw-p 7fff1e76f000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]
Your fixed mapping would wipe out the executable and heap on that...
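
The arithmetic behind that can be checked with a short sketch. The
helper names below are made up for illustration; the numbers are the
mmap arguments from the syscall trace and two ranges from the /bin/cat
map above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Range covered by a mapping of len bytes placed at addr. */
static struct range mapping(uint64_t addr, uint64_t len)
{
    struct range r = { addr, addr + len };
    return r;
}

/* Two half-open ranges overlap if each starts before the other ends. */
static bool overlaps(struct range a, struct range b)
{
    return a.start < b.end && b.start < a.end;
}
```

mapping(0x6C000, 198787072) ends at 0xBE00000 (198787072 is 0xBD94000),
and overlaps() is true for it against both 00400000-00405000 (the
executable) and 00506000-00527000 (the heap).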
> I tried messing around with valgrind's configure to leave enough space for
> us, however:
>
> cat /proc/15817/maps
> <snip>
> 6ff00000-70377000 r-xp 00000000 00:18 1552482 dynamite
> 70477000-7048a000 rw-p 00477000 00:18 1552482 dynamite
> 7048a000-704d3000 rw-p 7048a000 00:00 0
> 704d3000-704d4000 rwxp 704d3000 00:00 0
> 78000000-78169000 r-xp 00000000 08:03 35037 /usr/local/lib/valgrind/amd64-linux/memcheck
> 78269000-7826a000 rw-p 00169000 08:03 35037 /usr/local/lib/valgrind/amd64-linux/memcheck
> <snip>
>
> It looks as though our binary is where it wants to be the heap itself
> has moved down into memory. There are two options I think of:
>
> 1. Stop Valgrind relocating the heap
Valgrind doesn't relocate the heap as such - it creates an emulated
heap for the client. On x86 it winds up immediately after the BSS of
the main executable, but on amd64 there is a gap as the executable
is loaded very low and valgrind only uses memory below 64Mb when
specifically asked so the heap will go in the first available space
which is at 64Mb:
--11248:1:aspacem <<< SHOW_SEGMENTS: Memory layout at client startup (33 segments, 4 segnames)
--11248:1:aspacem ( 0) /usr/lib64/valgrind/amd64-linux/memcheck
--11248:1:aspacem ( 1) /bin/bash
--11248:1:aspacem ( 2) /lib64/ld-2.4.so
--11248:1:aspacem 0: RSVN 0000000000-00003FFFFF 4194304 ----- SmFixed
--11248:1:aspacem 1: file 0000400000-00004AEFFF 716800 r-x-- d=0xFD00 i=17858591 o=0 (1)
--11248:1:aspacem 2: RSVN 00004AF000-00005ADFFF 1044480 ----- SmFixed
--11248:1:aspacem 3: file 00005AE000-00005B7FFF 40960 rw--- d=0xFD00 i=17858591 o=712704 (1)
--11248:1:aspacem 4: anon 00005B8000-00005BCFFF 20480 rw---
--11248:1:aspacem 5: RSVN 00005BD000-00006B6FFF 1024000 ----- SmFixed
--11248:1:aspacem 6: file 00006B7000-00006BEFFF 32768 rw--- d=0xFD00 i=17858591 o=749568 (1)
--11248:1:aspacem 7: RSVN 00006BF000-0003FFFFFF 57m ----- SmFixed
--11248:1:aspacem 8: anon 0004000000-0004000FFF 4096 rwx--
--11248:1:aspacem 9: RSVN 0004001000-00047FFFFF 8384512 ----- SmLower
--11248:1:aspacem 10: 0004800000-0037FFFFFF 824m
--11248:1:aspacem 11: FILE 0038000000-0038026FFF 159744 r-x-- d=0xFD00 i=6209928 o=0 (0)
--11248:1:aspacem 12: file 0038027000-0038027FFF 4096 r-x-- d=0xFD00 i=6209928 o=159744 (0)
--11248:1:aspacem 13: FILE 0038028000-003815BFFF 1261568 r-x-- d=0xFD00 i=6209928 o=163840 (0)
--11248:1:aspacem 14: 003815C000-003825AFFF 1044480
--11248:1:aspacem 15: FILE 003825B000-003825BFFF 4096 rw--- d=0xFD00 i=6209928 o=1421312 (0)
--11248:1:aspacem 16: ANON 003825C000-0038DFAFFF 11m rw---
--11248:1:aspacem 17: 0038DFB000-0401FFFFFF 15506m
--11248:1:aspacem 18: RSVN 0402000000-0402000FFF 4096 ----- SmFixed
--11248:1:aspacem 19: ANON 0402001000-0402798FFF 7962624 rwx--
--11248:1:aspacem 20: 0402799000-07FE600FFF 16318m
--11248:1:aspacem 21: RSVN 07FE601000-07FEFFDFFF 9m ----- SmUpper
--11248:1:aspacem 22: anon 07FEFFE000-07FF000FFF 12288 rwx--
--11248:1:aspacem 23: 07FF001000-07FFFFFFFF 15m
--11248:1:aspacem 24: RSVN 0800000000-347F0FFFFF 182257m ----- SmFixed
--11248:1:aspacem 25: file 347F100000-347F119FFF 106496 r-x-- d=0xFD00 i=6144028 o=0 (2)
--11248:1:aspacem 26: RSVN 347F11A000-347F218FFF 1044480 ----- SmFixed
--11248:1:aspacem 27: file 347F219000-347F21AFFF 8192 rw--- d=0xFD00 i=6144028 o=102400 (2)
--11248:1:aspacem 28: RSVN 347F21B000-7FFF7A296FFF 130859g ----- SmFixed
--11248:1:aspacem 29: ANON 7FFF7A297000-7FFF7A2ACFFF 90112 rw---
--11248:1:aspacem 30: RSVN 7FFF7A2AD000-FFFFFFFFFF5FFFFF 16383e ----- SmFixed
--11248:1:aspacem 31: ANON FFFFFFFFFF600000-FFFFFFFFFFDFFFFF 8388608 -----
--11248:1:aspacem 32: RSVN FFFFFFFFFFE00000-FFFFFFFFFFFFFFFF 2097152 ----- SmFixed
--11248:1:aspacem >>>
Segments 8 and 9 are the heap (segment 8 is the allocated space and
segment 9 is the reserved space that it will grow into). As you can
see there is a 57Mb gap between the executable and the heap.
> 2. Tweak our program to discover the heap and then disallow maps in the
> region.
>
> We currently assume that the heap follows straight after bss but I don't
> think it does when we are being Valgrinded. Is there anyway to force
> Valgrind to use a particular location for heap?
There is no way to force the heap location except by changing the
source - reducing aspacem_minAddr will cause it to be allocated lower
down, so if it was low enough then the heap would wind up immediately
after the program I guess.
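
For the "discover the heap" option quoted above, one possible sketch is
to ask for the program break with sbrk(0) instead of assuming where it
sits relative to bss. This is an illustration only: the function names
are hypothetical, it assumes the heap in question is the brk heap, and
it knows nothing about memory handed out through mmap or by a
replacement allocator.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Program break recorded at startup, taken as the bottom of the brk heap. */
static uintptr_t heap_bottom;

/* Call once, early, before the heap has grown. */
static void record_heap_bottom(void)
{
    heap_bottom = (uintptr_t)sbrk(0);
}

/* True if [addr, addr + len) reaches into the brk heap as it stands now. */
static bool hits_brk_heap(uintptr_t addr, size_t len)
{
    uintptr_t heap_top = (uintptr_t)sbrk(0);  /* current program break */
    return addr < heap_top && heap_bottom < addr + len;
}
```

A request for a fixed mapping could then be refused, or moved, whenever
hits_brk_heap() returns true for the requested range.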
What you're doing sounds horribly reliant on current kernel/glibc
behaviour though, even without valgrind... Why do you need to map
at a fixed location rather than just letting the kernel choose an
address for you?
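
As a sketch of the alternative: on x86-64 Linux the flags value 50 in
the trace decodes to MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS. Dropping
MAP_FIXED turns the address into a hint the kernel may ignore, so an
existing mapping is never replaced. The function names here are
hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes of anonymous memory wherever the kernel finds room. */
static void *map_anywhere(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Ask for memory near hint. Without MAP_FIXED the kernel is free to
 * pick another address, so the caller must use the returned pointer. */
static void *map_near(void *hint, size_t len)
{
    void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

If the program really does need a particular address, comparing the
result of map_near() with the hint shows whether it got it, without
anything having been overwritten in the attempt.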
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Dirk M. <dm...@gm...> - 2006-09-08 18:18:07
|
On Wednesday 06 September 2006 19:51, Tom Hughes wrote:
> I'm not sure this is anything to worry about - it has crashed
> trying to run the glibc cleanup code but if glibc has already
> corrupted it's state then it's not entirely surprising if that
> fails.
I don't think so - I see it too everywhere, even for clean applications,
and looking at the code it's not a surprise:

   VALGRIND_DO_CLIENT_REQUEST(res, 0 /* default */,
                              VG_USERREQ__LIBC_FREERES_DONE,
                              0, 0, 0, 0, 0);
   /*NOTREACHED*/
   *(int *)0 = 'x';
}

The NOTREACHED part is executed for me. Julian said that it might be
because the compiler optimizes the DO_CLIENT_REQUEST away, but I haven't
checked that.
Dirk
|
|
From: Julian S. <js...@ac...> - 2006-09-10 08:23:19
|
> I don't think so - I see it too everywhere even for clean applications. and
> looking at the code its not a surprise:
>
> VALGRIND_DO_CLIENT_REQUEST(res, 0 /* default */,
> VG_USERREQ__LIBC_FREERES_DONE,
> 0, 0, 0, 0, 0);
> /*NOTREACHED*/
> *(int *)0 = 'x';
> }
>
> The NOTREACHED part is executed for me. Julian said that it might be
> because the compiler optimizes the DO_CLIENT_REQUEST away, but I haven't
> checked that.

Ah, you mentioned this a couple of weeks ago. I installed openSuse 10.2
alpha 3 for x86 on vmware, but couldn't reproduce this problem with the
svn trunk - all the regression tests looked ok. Are you sure you don't
have some confusion with header files, such as accidentally picking up
the header files (valgrind.h, memcheck.h) from a 3.1.X install?

The client request implementation changed from 3.1.X to 3.2.X in such a
way that code compiled against a 3.1.X header will not be recognised at
run time by 3.2.X as a client request.

J
|
|
From: Dirk M. <dm...@gm...> - 2006-09-10 18:24:52
|
On Sunday, 10. September 2006 10:23, Julian Seward wrote:
> Ah, you mentioned this a couple of weeks ago. I installed openSuse 10.2
> alpha 3 for x86 on vmware, but couldn't reproduce this problem with the
> svn trunk - all the regression tests looked ok. Are you sure you don't
> have some confusion with header files, such as accidentally picking up
> the header files (valgrind.h, memcheck.h) from a 3.1.X install?

I found out that it was caused by the DRD patch. No client request
besides the drd ones seemed to work anymore.

Dirk
|