|
From: Nicholas N. <nj...@cs...> - 2005-02-13 18:42:23
|
Hi,
I'm trying to get self-hosting to work on my machine. I just switched to
recent versions of gcc (3.4.3) and ld (binutils 2.15) which support PIE.
But once I built Valgrind, I get a seg fault at startup. Diagnostic printfs
tell me that things are ok until at least just before the call to
jmp_with_stack() at the end of stage1.c:main2(). But the seg fault
manifests before the first statement in vg_main.c:main() executes.
The distro on the machine is an oldish Debian one; in particular the glibc
version is 2.2.5:
[~/grind/head2] /lib/libc.so.6
GNU C Library stable release version 2.2.5, by Roland McGrath et al.
Copyright (C) 1992-2001, 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 2.95.4 20011002 (Debian prerelease).
Compiled on a Linux 2.4.18 system on 2005-01-07.
Available extensions:
GNU libio by Per Bothner
crypt add-on version 2.1 by Michael Glad and others
linuxthreads-0.9 by Xavier Leroy
BIND-8.2.3-T5B
libthread_db work sponsored by Alpha Processor Inc
NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
Report bugs using the `glibcbug' script to <bu...@gn...>.
Is it possible that the dynamic linker is too old to handle PIE executables?
But a toy program compiled with "gcc -fpie a.c" worked ok.
Kernel version is 2.4.29, in case that's relevant:
[~/grind/head2] uname -a
Linux charco.cs.utexas.edu 2.4.29 #1 SMP Mon Jan 24 09:20:36 CST 2005 i686
unknown
Valgrind was working fine with older, non-PIE versions of gcc and ld.
Any suggestions what the problem might be, or how to hunt it down? Thanks.
N
|
|
From: <je...@go...> - 2005-02-14 00:13:31
|
Quoting Nicholas Nethercote <nj...@cs...>:
> I'm trying to get self-hosting to work on my machine. I just switched to
> recent versions of gcc (3.4.3) and ld (binutils 2.15) which support PIE.
> But once I built Valgrind, I get a seg fault at startup. Diagnostic printfs
> tell me that things are ok until at least just before the call to
> jmp_with_stack() at the end of stage1.c:main2(). But the seg fault
> manifests before the first statement in vg_main.c:main() executes.
What's the fault address? What's the faulting instruction?
> The distro on the machine is an oldish Debian one; in particular the glibc
> version is 2.2.5:
In principle there should be no problem, but it wouldn't surprise me if there
were some bug which prevents PIE from working. Or something we haven't taken
into account.
J
|
|
From: Nicholas N. <nj...@cs...> - 2005-02-16 03:40:15
Attachments:
x
|
On Sun, 13 Feb 2005 je...@go... wrote:
>> I'm trying to get self-hosting to work on my machine. I just switched to
>> recent versions of gcc (3.4.3) and ld (binutils 2.15) which support PIE.
>> But once I built Valgrind, I get a seg fault at startup. Diagnostic printfs
>> tell me that things are ok until at least just before the call to
>> jmp_with_stack() at the end of stage1.c:main2(). But the seg fault
>> manifests before the first statement in vg_main.c:main() executes.
>
> What's the fault address? What's the faulting instruction?
gdb tells me:
Program terminated with signal 11, Segmentation fault.
#0 0xb805ab88 in ?? ()
if that's any help. I've attached the strace output, too.
How do I get the faulting address?
N
|
|
From: Jeremy F. <je...@go...> - 2005-02-16 07:45:30
|
On Tue, 2005-02-15 at 21:40 -0600, Nicholas Nethercote wrote:
> gdb tells me:
>
> Program terminated with signal 11, Segmentation fault.
> #0 0xb805ab88 in ?? ()
>
> if that's any help. I've attached the strace output, too.
>
> How do I get the faulting address?
"x/i $eip" will show you the faulting instruction; from that you can
determine the effective address it was trying to access. You can get
register values with "print $<reg>".
J
|
|
From: Nicholas N. <nj...@cs...> - 2005-02-17 01:00:14
|
On Tue, 15 Feb 2005, Jeremy Fitzhardinge wrote:
> "x/i $eip" will show you the faulting instruction; from that you can
> determine the effective address it was trying to access. You can get
> register values with "print $<reg>".
I get the following:
[~/grind/head2] gdb coregrind/valgrind core
GNU gdb 5.3
[...]
This GDB was configured as "i686-pc-linux-gnu"...
Core was generated by `/u/njn/grind/head2/coregrind/valgrind date'.
Program terminated with signal 11, Segmentation fault.
#0 0xb805ab88 in ?? ()
(gdb) x/i $eip
0xb805ab88: Cannot access memory at address 0xb805ab88
Does that mean it jumped to an address that had no underlying mapping?
N
|
|
From: Jeremy F. <je...@go...> - 2005-02-17 01:34:40
|
Nicholas Nethercote wrote:
> I get the following
>
> [~/grind/head2] gdb coregrind/valgrind core
> GNU gdb 5.3
> [...]
> This GDB was configured as "i686-pc-linux-gnu"...
> Core was generated by `/u/njn/grind/head2/coregrind/valgrind date'.
> Program terminated with signal 11, Segmentation fault.
> #0 0xb805ab88 in ?? ()
> (gdb) x/i $eip
> 0xb805ab88: Cannot access memory at address 0xb805ab88
>
> Does that mean it jumped to an address that had no underlying mapping?
Yep, that means there was nothing at that address; the SEGV was from the
instruction fetch rather than from something the instruction did.
While it is sitting crashed in GDB, you can look at /proc/<pid>/maps to
see where 0xb805ab88 lies in relationship to everything else in the
address space. That might give you a clue about what's going wrong.
What happens if you try running the PIE valgrind under a non-PIE
valgrind with (mem|addr)check; does that give any more information? The
trouble with jumps into the void is that there's very little information
about where it came from (hence bug #98993). "x/x $esp" will show you
the top of the stack, which might be a return address.
J
|
|
From: Nicholas N. <nj...@cs...> - 2005-02-17 02:17:06
|
On Wed, 16 Feb 2005, Jeremy Fitzhardinge wrote:
> While it is sitting crashed in GDB, you can look at /proc/<pid>/maps to
> see where 0xb805ab88 lies in relationship to everything else in the
> address space. That might give you a clue about what's going wrong.
What <pid> do I use?
> What happens if you try running the PIE valgrind under a non-PIE
> valgrind with (mem|addr)check; does that give any more information?
Doesn't work:
Executable range (nil)-0x203f00 is outside the
acceptable range 0x50000000-0x52bfe000
valgrind: failed to load /u/njn/grind/head4/.in_place/stage2: Cannot allocate memory
ie. the outer (non-PIE) Valgrind can't even load the PIE one. Now I'm
confused.
N
|
|
From: Jeremy F. <je...@go...> - 2005-02-17 02:50:40
|
Nicholas Nethercote wrote:
> What <pid> do I use?
The pid of your valgrind which is under the control of gdb. Ie, look at
its maps while it is stopped in gdb.
>> What happens if you try running the PIE valgrind under a non-PIE
>> valgrind with (mem|addr)check; does that give any more information?
>
>
> Doesn't work:
>
> Executable range (nil)-0x203f00 is outside the acceptable range
> 0x50000000-0x52bfe000
> valgrind: failed to load /u/njn/grind/head4/.in_place/stage2: Cannot
> allocate memory
>
> ie. the outer (non-PIE) Valgrind can't even load the PIE one. Now I'm
> confused.
What does "readelf -hl .in_place/stage2" say?
J
|
|
From: Nicholas N. <nj...@cs...> - 2005-02-17 03:12:27
Attachments:
x
|
On Wed, 16 Feb 2005, Jeremy Fitzhardinge wrote:
>> What <pid> do I use?
>
> The pid of your valgrind which is under the control of gdb. Ie, look at
> its maps while it is stopped in gdb.
I wasn't running Valgrind under gdb... I was just using gdb to analyse the
core file. Anyway, I ran Valgrind under GDB and the maps are:
00000000-08048000 ---p 00000000 03:06 16 /tmp/.pad.11393.1 (deleted)
08048000-080a3000 r-xp 00000000 00:12 8428197 /v/filer3/v1q009/njn/grind/head2/inst/bin/valgrind
080a3000-080a6000 rw-p 0005a000 00:12 8428197 /v/filer3/v1q009/njn/grind/head2/inst/bin/valgrind
080a6000-080ca000 rwxp 00000000 00:00 0
080ca000-b1000000 ---p 00000000 03:06 16 /tmp/.pad.11393.1 (deleted)
b1000000-b1013000 r-xp 00000000 03:01 26832 /lib/ld-2.2.5.so
b1013000-b1014000 rw-p 00013000 03:01 26832 /lib/ld-2.2.5.so
b1030000-b1032000 r-xp 00000000 03:01 26903 /lib/libdl-2.2.5.so
b1032000-b1033000 rw-p 00001000 03:01 26903 /lib/libdl-2.2.5.so
b1033000-b1146000 r-xp 00000000 03:01 26889 /lib/libc-2.2.5.so
b1146000-b114c000 rw-p 00113000 03:01 26889 /lib/libc-2.2.5.so
b114c000-b1151000 rw-p 00000000 00:00 0
b8048000-b8103000 r-xp 00000000 00:12 8428199 /v/filer3/v1q009/njn/grind/head2/inst/lib/valgrind/stage2
b8103000-b8105000 rw-p 000ba000 00:12 8428199 /v/filer3/v1q009/njn/grind/head2/inst/lib/valgrind/stage2
b8105000-b825b000 rw-p 00000000 00:00 0
bfffe000-c0000000 rwxp fffff000 00:00 0
The bad address was 0xb805ab88, ie. within stage2, which doesn't seem
unreasonable.
>> Executable range (nil)-0x203f00 is outside the acceptable range
>> 0x50000000-0x52bfe000
>> valgrind: failed to load /u/njn/grind/head4/.in_place/stage2: Cannot
>> allocate memory
>>
>> ie. the outer (non-PIE) Valgrind can't even load the PIE one. Now I'm
>> confused.
>
> What does "readelf -hl .in_place/stage2" say?
See attachment.
N
|
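[Editorial sketch, not part of the original message.] The lookup described above (finding which mapping contains 0xb805ab88) can be done mechanically. The Python below is a minimal illustration that assumes the usual /proc/<pid>/maps line layout; find_mapping() is a hypothetical helper, not part of Valgrind or gdb, and MAPS_TEXT is only a subset of the listing quoted in this thread.

```python
# Sketch: locate the /proc/<pid>/maps entry that contains an address.
# MAPS_TEXT is a subset of the listing quoted in this thread;
# find_mapping() is a hypothetical helper written for illustration.

MAPS_TEXT = """\
08048000-080a3000 r-xp 00000000 00:12 8428197 /v/filer3/v1q009/njn/grind/head2/inst/bin/valgrind
b1000000-b1013000 r-xp 00000000 03:01 26832 /lib/ld-2.2.5.so
b8048000-b8103000 r-xp 00000000 00:12 8428199 /v/filer3/v1q009/njn/grind/head2/inst/lib/valgrind/stage2
b8103000-b8105000 rw-p 000ba000 00:12 8428199 /v/filer3/v1q009/njn/grind/head2/inst/lib/valgrind/stage2
bfffe000-c0000000 rwxp fffff000 00:00 0
"""


def find_mapping(maps_text, addr):
    """Return (start, end, perms, path, file_offset) for addr, or None."""
    for line in maps_text.splitlines():
        # Fields: range, perms, offset, dev, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= addr < end:  # the end address is exclusive
            path = fields[5] if len(fields) > 5 else ""
            # Offset of addr within the backing file.
            file_offset = int(fields[2], 16) + (addr - start)
            return start, end, fields[1], path, file_offset
    return None


hit = find_mapping(MAPS_TEXT, 0xB805AB88)
if hit is not None:
    start, end, perms, path, file_offset = hit
    print(f"{start:08x}-{end:08x} {perms} {path} (file offset {file_offset:#x})")
```

On a live process the same function could be fed the contents of /proc/<pid>/maps read from disk. The computed file offset is what a disassembler or symbol tool would need in order to identify the code at that address.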
|
From: Jeremy F. <je...@go...> - 2005-02-17 05:55:32
|
Nicholas Nethercote wrote:
> The bad address was 0xb805ab88, ie. within stage2, which doesn't seem
> unreasonable.
It looks like it has been linked as a normal executable, which we're
relocating to somewhere it doesn't expect.
>ELF Header:
> Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
> Class: ELF32
> Data: 2's complement, little endian
> Version: 1 (current)
> OS/ABI: UNIX - System V
> ABI Version: 0
> Type: EXEC (Executable file)
>
>
Hm, this is a normal executable, not an -fpie executable (which would be
DYN)...
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> PHDR 0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
> INTERP 0x000114 0x08048114 0x08048114 0x00013 0x00013 R 0x1
> [Requesting program interpreter: /lib/ld-linux.so.2]
> LOAD 0x000000 0x08048000 0x08048000 0xba440 0xba440 R E 0x1000
>
>
...loaded at the normal executable address.
J
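[Editorial sketch, not part of the original message.] The EXEC versus DYN distinction that readelf reports comes from the e_type field, a 16-bit value at byte offset 16 of the ELF header. The Python below is a minimal check under that assumption; elf_type() and elf_type_of_file() are hypothetical helper names, and the sample header reuses the Magic bytes quoted above.

```python
import struct

ET_EXEC = 2  # fixed-address executable ("EXEC (Executable file)" in readelf)
ET_DYN = 3   # shared object; PIE executables are also reported as DYN


def elf_type(header):
    """Return e_type from the first 18 (or more) bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[5] (EI_DATA): 1 = little endian, 2 = big endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type


def elf_type_of_file(path):
    """Read just enough of the file at path to classify it."""
    with open(path, "rb") as f:
        return elf_type(f.read(18))


# e_ident bytes as quoted in the readelf output in this thread,
# followed by an e_type of ET_EXEC (little endian).
E_IDENT = bytes.fromhex("7f454c46010101000000000000000000")
sample_exec = E_IDENT + struct.pack("<H", ET_EXEC)
sample_dyn = E_IDENT + struct.pack("<H", ET_DYN)

print("EXEC" if elf_type(sample_exec) == ET_EXEC else "not EXEC")
print("DYN" if elf_type(sample_dyn) == ET_DYN else "not DYN")
```

A call such as elf_type_of_file(".in_place/stage2") returning ET_EXEC would correspond to the readelf result above, i.e. the binary was not linked as PIE.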
|
|
From: Nicholas N. <nj...@cs...> - 2005-02-19 01:23:42
|
On Wed, 16 Feb 2005, Jeremy Fitzhardinge wrote:
>> The bad address was 0xb805ab88, ie. within stage2, which doesn't seem
>> unreasonable.
>
> It looks like it has been linked as a normal executable, which we're
> relocating to somewhere it doesn't expect.
>
>> ELF Header:
>> Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
>> Class: ELF32
>> Data: 2's complement, little endian
>> Version: 1 (current)
>> OS/ABI: UNIX - System V
>> ABI Version: 0
>> Type: EXEC (Executable file)
>>
>>
> Hm, this is a normal executable, not an -fpie executable (which would be
> DYN)...
>
>> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
>> PHDR 0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
>> INTERP 0x000114 0x08048114 0x08048114 0x00013 0x00013 R 0x1
>> [Requesting program interpreter: /lib/ld-linux.so.2]
>> LOAD 0x000000 0x08048000 0x08048000 0xba440 0xba440 R E 0x1000
>>
>>
> ...loaded at the normal executable address.
That's strange. For non-PIE builds, that LOAD VirtAddr should be
0xb0000000, right?
I think the problem is that my GCC seems to be using binutils 2.13.2.1
(which doesn't recognise --pie) not the one in my PATH, which is 2.15. I
guess GCC was configured to use that binutils upon installation? Any idea
how to change it? Maybe I'll have to install a new GCC, and point it to a
more recent binutils.
N
|
|
From: Jeremy F. <je...@go...> - 2005-02-19 02:28:18
|
Nicholas Nethercote wrote:
> That's strange. For non-PIE builds, that LOAD VirtAddr should be
> 0xb0000000, right?
Yep, but for PIE it doesn't matter, so we don't set anything, and ld is
using its defaults.
> I think the problem is that my GCC seems to be using binutils 2.13.2.1
> (which doesn't recognise --pie) not the one in my PATH, which is
> 2.15. I guess GCC was configured to use that binutils upon
> installation? Any idea how to change it? Maybe I'll have to install
> a new GCC, and point it to a more recent binutils.
Maybe. Sounds like a bit of a mess. I'm not sure if PIE is a
RedHat/Fedora local change, or something which is in the baseline
tools. I notice that SuSE 9.2 doesn't support PIE at all.
J
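[Editorial sketch, not part of the original message.] One way to test the suspicion that GCC is invoking a different ld from the one found first in PATH is to ask GCC directly with its -print-prog-name=ld option and compare the answer with the PATH lookup. The Python below is a rough illustration; gcc_linker() and same_program() are hypothetical helper names, and nothing in the thread says this is how the mismatch was actually diagnosed.

```python
import os
import shutil
import subprocess


def gcc_linker(gcc="gcc"):
    """Return the path of the ld that gcc reports, or None if unavailable."""
    try:
        out = subprocess.run(
            [gcc, "-print-prog-name=ld"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None  # gcc missing or not runnable
    if not out:
        return None
    # gcc may print a bare name, meaning it would search PATH itself.
    return out if os.path.isabs(out) else shutil.which(out)


def same_program(a, b):
    """True if both paths resolve (through symlinks) to the same file."""
    if a is None or b is None:
        return False
    return os.path.realpath(a) == os.path.realpath(b)


gcc_ld = gcc_linker()
path_ld = shutil.which("ld")
print("ld used by gcc:  ", gcc_ld)
print("first ld in PATH:", path_ld)
print("same program:    ", same_program(gcc_ld, path_ld))
```

If the two differ, running each with --version would show which binutils release each one belongs to.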
|
|
From: Naveen K. <g_n...@ya...> - 2005-02-17 15:02:19
|
Well, what you could do is find the process mapping just before it dumps
core. I usually insert a { while(1) sleep(5); } at that point. You can then
look at /proc/[pid]/maps to find out which object is loaded around that
address. Once you find that, you can use elfdump or some other tool to find
out which function/instruction it failed at. I do this if loading the
symbol file for the object doesn't work with gdb (no debugging symbols
found).
Naveen
--- Nicholas Nethercote <nj...@cs...> wrote:
> On Tue, 15 Feb 2005, Jeremy Fitzhardinge wrote:
>
> > "x/i $eip" will show you the faulting instruction; from that you can
> > determine the effective address it was trying to access. You can get
> > register values with "print $<reg>".
>
> I get the following
>
> [~/grind/head2] gdb coregrind/valgrind core
> GNU gdb 5.3
> [...]
> This GDB was configured as "i686-pc-linux-gnu"...
> Core was generated by `/u/njn/grind/head2/coregrind/valgrind date'.
> Program terminated with signal 11, Segmentation fault.
> #0 0xb805ab88 in ?? ()
> (gdb) x/i $eip
> 0xb805ab88: Cannot access memory at address 0xb805ab88
>
> Does that mean it jumped to an address that had no underlying mapping?
>
> N
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
>
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
>
|