On Tue, 1 Mar 2005, Tim Fletcher wrote:
> > I'd recommend bochs for testing this sort of thing. Can you send me
> > the file /openbsd/pxeboot? Also, if you happen to have a direct link
> > to the source code for this file, it would save me tracking it down.
>
> I'll try and give it a go with bochs later on today but here are links
> to the various bits of openbsd pxeboot:
Thanks for the links. The problem was remarkably straightforward to track down.
bochs is a piece of software that I just keep getting more impressed with,
not least because of the accuracy of its x86 emulation even under obscure
error conditions.
Enabling -DTRACE_PXE gave me:
Me: 10.254.254.1, DHCP: 10.254.254.2, TFTP: 10.254.254.2, Gateway 10.0.0.6
Loading 10.254.254.2:pxeboot ...(PXE).....................................done
probing: pc0 com0 pci pxe![2.1][PXENV_GET_CACHED_INFO 3]
and an immediate lockup. bochs showed the CPU quite literally stuck at
the real-mode address 4012:403c; attempting to single-step to the next
instruction would not advance the program counter.
I added a BOCHSBP instruction in _pxe_in_call_far in realmode_asm.S and
examined the CPU state on entry to the first PXE API call
(PXENV_GET_CACHED_INFO), then stepped over the API call and started
single-stepping back into pxeboot. Five instructions later, I hit the
lockup point at 4012:403c. The instruction causing the problem is:
addrsize opsize lgdt [ds:0x45e80]
which is the line marked "Load the GDT" in the following code from
pxe_call.S in the OpenBSD source:
/*
 * real_to_prot()
 *
 * Switch the processor back into protected mode.
 */
        .globl  real_to_prot
real_to_prot:
        .code16
        xorw    %ax, %ax
        movw    %ax, %ds                /* Load %ds so we can get at Gdtr */
        data32 addr32 lgdt Gdtr         /* Load the GDT */
        ...
Note the address [ds:0x45e80] that this resolves to in the pxeboot binary.
In particular, note that the offset contains five hexadecimal digits.
We're allegedly in real mode at this point. We can't access more than 64kB
in each segment, yet this instruction is trying to access data at an
offset of approximately 279kB. The CPU doesn't like this.
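
To make the arithmetic concrete, here is a minimal Python sketch of the limit check that the lgdt trips over. The helper names are hypothetical (they come from neither Etherboot nor OpenBSD), and it assumes the 6-byte operand that lgdt reads; the offset and the 4012:403c address are the ones from the bochs trace above.

```python
REAL_MODE_LIMIT = 0xFFFF  # highest valid offset in a genuine real-mode segment


def linear_address(segment, offset):
    """Real-mode linear address: segment * 16 + offset."""
    return (segment << 4) + offset


def access_within_limit(offset, size, limit=REAL_MODE_LIMIT):
    """True if every byte of a size-byte access at offset is within the limit."""
    return offset + size - 1 <= limit


gdtr_offset = 0x45e80   # operand offset of the failing lgdt
LGDT_OPERAND_SIZE = 6   # 2-byte limit + 4-byte base (assumption stated above)

print(gdtr_offset)                                          # offset in bytes
print(gdtr_offset / 1024)                                   # offset in kB
print(access_within_limit(gdtr_offset, LGDT_OPERAND_SIZE))  # fits in 64kB?
print(hex(linear_address(0x4012, 0x403c)))                  # where bochs got stuck
```

The offset works out to 286336 bytes (about 279.6kB), well beyond the 0xffff limit, which is why the access faults in genuine real mode.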
I suspect that OpenBSD's pxeboot cheats and switches into flat real mode
(with 4GB limits) rather than genuine real mode (with 64kB limits). It
probably assumes that the PXE stack will not perform any mode switching,
and so assumes that when the PXE API call returns the CPU will still be in
flat real mode, and that it can get away with using addresses like
%ds:0x45e80. It shouldn't make this assumption.
Etherboot transitions into protected mode and back again whenever a PXE
API call is made. It exits in genuine real mode (64kB limits). Callers
should assume that this is the mode of the CPU upon exit from the PXE API
call (since callers are supposed to make the PXE API call in genuine real
mode).
The following patch to arch/i386/core/realmode_asm.S works around the
problem by changing Etherboot's real-mode GDT to be a flat real-mode GDT
instead of a genuine real-mode GDT:
--- arch/i386/core/realmode_asm.S 3 Feb 2005 08:34:57 -0000 1.12
+++ arch/i386/core/realmode_asm.S 1 Mar 2005 19:55:15 -0000
@@ -371,11 +371,11 @@
p2r_rmcs:
/* 16 bit real mode code segment */
.word 0xffff,(0&0xffff)
- .byte (0>>16),0x9b,0x00,(0>>24)
+ .byte (0>>16),0x9b,0x80,(0>>24)
p2r_rmds:
/* 16 bit real mode data segment */
.word 0xffff,(0&0xffff)
- .byte (0>>16),0x93,0x00,(0>>24)
+ .byte (0>>16),0x93,0x80,(0>>24)
p2r_gdt_end:
/* This is the end of the trampoline prefix code. When used
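
As a rough illustration of what the changed byte does, the sketch below decodes the segment limit from the two descriptor fields the patch touches. It assumes the standard x86 descriptor layout (low nibble of the flags byte holds limit bits 19:16, bit 7 is the granularity bit G); the function name is hypothetical and not part of Etherboot.

```python
def effective_limit(limit_low, flags_byte):
    """Decode the effective segment limit from a GDT descriptor.

    limit_low  -- the first .word of the descriptor (limit bits 15:0)
    flags_byte -- the byte changed by the patch (limit bits 19:16 in the
                  low nibble, granularity bit G in bit 7)
    """
    limit = ((flags_byte & 0x0F) << 16) | limit_low  # 20-bit limit field
    if flags_byte & 0x80:                            # G=1: limit counts 4kB pages
        limit = (limit << 12) | 0xFFF
    return limit


before = effective_limit(0xFFFF, 0x00)  # original descriptor
after = effective_limit(0xFFFF, 0x80)   # patched descriptor

print(hex(before), hex(after))
print(0x45e80 <= before, 0x45e80 <= after)
```

Under that reading, the original descriptors give a limit of 0xffff (64kB) and the patched ones 0xfffffff (256MB, since limit bits 19:16 are still zero), which is comfortably enough to cover the offset 0x45e80. A full 4GB limit would need 0x8f in that byte instead of 0x80.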
With this patch, I get as far as:
Me: 10.254.254.1, DHCP: 10.254.254.2, TFTP: 10.254.254.2, Gateway 10.0.0.6
Loading 10.254.254.2:pxeboot ...(PXE).....................................done
probing: pc0 com0 pci pxe![2.1] mem[639K 29M a20=on]
disk: fd0 fd1
net: mac fe:fd:00:00:00:01, ip 10.254.254.1, server 10.254.254.2
>> OpenBSD/i386 PXEBOOT 1.00
boot>
booting tftp:/bsd: open tftp:/bsd: No such file or directory
failed(2). will try /obsd
boot>
booting tftp:/obsd: open tftp:/obsd: No such file or directory
failed(2). will try /bsd.old
boot>
booting tftp:/bsd.old: open tftp:/bsd.old: No such file or directory
failed(2). will try /bsd
boot>
booting tftp:/bsd: open tftp:/bsd: No such file or directory
failed(2). will try /obsd
boot>
booting tftp:/obsd: open tftp:/obsd: No such file or directory
failed(2). will try /bsd.old
boot>
booting tftp:/bsd.old: open tftp:/bsd.old: No such file or directory
failed(2). will try /bsd
Turning timeout off.
boot>
i.e. it now seems to be working (except that I don't actually have a BSD
kernel for it to boot, only the pxeboot bootloader itself).
I don't think we should add this patch to Etherboot, because it's OpenBSD
that's broken and the proper solution is to fix OpenBSD. Do you want to
contact the people responsible for pxeboot and get them involved?
Michael