From: Blaisorblade <bla...@ya...> - 2007-01-23 08:10:15
On Monday 22 January 2007 21:59, Jeff Dike wrote:
> On Sat, Jan 20, 2007 at 12:18:30AM +0100, Blaisorblade wrote:
> > Ok, I hope I remembered correctly how to debug such faults (I'm posting
> > the full procedure so you can give a look)
>
> Correct.
>
> > 0x00000000619a592f: mov %edx,%fs:(%rcx) #faulting instruction.
>
> This and the registers involved are usually all you need.
>
> > RCX there is (long)regs->skas.regs[11] = -64, and for FS, since HOST_FS =
> > 25, I get:
> >
> > print/x regs->skas.regs[25]
> > $45 = 0x63

Since include/asm-x86_64/segment.h has:

#define FS_TLS_SEL ((GDT_ENTRY_TLS_MIN+FS_TLS)*8 + 3)

(which gives 0x63), this is the default value, ok.

> The presence of %fs in the instruction immediately suggests a TLS problem.

Yes, but the problem depends on miscompilation of UML (or rather, on using a
different compiler), and arch_prctl_skas does not do a lot of work on x86_64
(from what I can see).

I'm astonished by the fact that fewer users complain about TLS on amd64 than
on x86 (there are far fewer users, ok, but the difference is too high).

> Also, the trap number in cases like this should be 13, rather than the
> 14 you get with a normal page fault.

I remember that I saw 14 there, indeed. And indeed, it is so.
See below:

(gdb) print regs->skas
$5 = {regs = {1, 547608288328, 547608288344, 4294967295, 547608288096,
    1077341640, 514, 0, 3399988123389603631, 18374403900871474943,
    18446744073709551614, 18446744073709551552, 2, 0, 1076283584,
    18446744073709551615, 1078130991, 51, 66067, 547608288008, 43, 0, 0, 0,
    0, 99, 0}, fp = {895, 0, 0, 281470681751424, 0 <repeats 60 times>,
    140733672859384}, faultinfo = {error_code = 6, cr2 = 1613887512,
    trap_no = 14}, syscall = -1, is_user = 1}
(gdb) print/x regs->skas
$7 = {regs = {0x1, 0x7f7fff5c48, 0x7f7fff5c58, 0xffffffff, 0x7f7fff5b60,
    0x4036edc8, 0x202, 0x0, 0x2f2f2f2f2f2f2f2f, 0xfefefefefefefeff,
    0xfffffffffffffffe, 0xffffffffffffffc0, 0x2, 0x0, 0x4026c8c0,
    0xffffffffffffffff, 0x4042f92f, 0x33, 0x10213, 0x7f7fff5b08, 0x2b, 0x0,
    0x0, 0x0, 0x0, 0x63, 0x0}, fp = {0x37f, 0x0, 0x0, 0xffff00001f80,
    0x0 <repeats 60 times>, 0x7fff1c9426f8}, faultinfo = {error_code = 0x6,
    cr2 = 0x6031f818, trap_no = 0xe}, syscall = 0xffffffffffffffff,
  is_user = 0x1}

Then, new things:

(gdb) print $rsp
$9 = (void *) 0x60a23da0

> %fs isn't 0, so that's one thing that's not wrong. What's in the
> corresponding segment?

regs[FS_BASE,GS_BASE] are both 0. Instead, from strace I can see that

brk(0) = 0x6031f000
brk(0x6031f880) = 0x6031f880
arch_prctl(ARCH_SET_FS, 0x6031f858) = 0

is called at UML boot, very early, by glibc. All of 4 runs have the same
address in this call (no randomization on this? Strange!).

On the working binary, instead, this is what I get:

brk(0) = 0x6031b000
brk(0x6031b880) = 0x6031b880 # malloc?
arch_prctl(ARCH_SET_FS, 0x6031b858) = 0 #use the result, with some offset.

I do not think the exact address makes a difference (it will likely depend on
the binary size).

I'll give a better look at some later time.

Bye!
-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
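
As a cross-check of the numbers quoted in this message, here is a small Python sketch that decodes the %fs selector and the page-fault error code, and compares cr2 with the arch_prctl address plus RCX. The bit layouts are the standard x86 ones. The helper names are made up for illustration, and treating the arch_prctl(ARCH_SET_FS) argument from strace as the FS base in effect at the fault is an assumption, not something the thread establishes (regs[FS_BASE] reads 0 in the dump). This is arithmetic on the quoted values, not a diagnosis.

```python
# Values copied from the gdb and strace output quoted in this message.
FS_SELECTOR = 0x63              # regs[25] (HOST_FS)
RCX = 0xFFFFFFFFFFFFFFC0        # regs[11]
ERROR_CODE = 0x6                # faultinfo.error_code
CR2 = 0x6031F818                # faultinfo.cr2
TRAP_NO = 14                    # faultinfo.trap_no
# ASSUMPTION: the address passed to arch_prctl is the live FS base.
ARCH_SET_FS_ADDR = 0x6031F858
BRK_START, BRK_END = 0x6031F000, 0x6031F880

TRAP_NAMES = {13: "general protection fault", 14: "page fault"}


def decode_selector(sel):
    """Split an x86 segment selector into (index, table indicator, RPL)."""
    return sel >> 3, (sel >> 2) & 1, sel & 3


def decode_pf_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 1),  # False: the page was not present
        "write": bool(code & 2),    # True: the access was a write
        "user": bool(code & 4),     # True: the access came from user mode
    }


def to_signed64(value):
    """Interpret a raw 64-bit register value as a signed integer."""
    return value - (1 << 64) if value & (1 << 63) else value


index, ti, rpl = decode_selector(FS_SELECTOR)
fault = decode_pf_error(ERROR_CODE)
# Effective address of %fs:(%rcx) under the assumption stated above.
effective = (ARCH_SET_FS_ADDR + to_signed64(RCX)) & 0xFFFFFFFFFFFFFFFF

print(f"selector: index={index} table={'LDT' if ti else 'GDT'} rpl={rpl}")
print(f"trap {TRAP_NO}: {TRAP_NAMES.get(TRAP_NO, 'other')}, flags={fault}")
print(f"rcx as signed: {to_signed64(RCX)}")
print(f"fs_base + rcx = {effective:#x}, cr2 = {CR2:#x}, match={effective == CR2}")
print(f"inside brk area: {BRK_START <= effective < BRK_END}")
```

With the values above, the selector decodes to GDT index 12 at RPL 3, the error code to a user-mode write to a non-present page, and the computed address equals cr2 and falls inside the brk range reported by strace.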