Thread: Re: [Etherboot-developers] Q: does xstart work?
From: <ke...@us...> - 2002-07-23 05:43:50
>Staring at the assembly in xstart there is the following code
>snippet:
>        pushl   %eax
>        ADDR32 LJMPI(_execaddr-_start16)
>1:
>...
>_execaddr:
>        .long   0
>
>What I don't get is how this snippet of code successfully jumps
>to an application.  ADDR32 should ljmpi should take a m16:32.
>Where the offset which comes first is 32 bits, and the segment
>that comes after is 16bits.  I just don't see how that works when
>we pass it a m16:16 in execaddr.

_execaddr is filled in with a linear address.

From: <ke...@us...> - 2002-07-23 07:02:55
>> > _execaddr is filled in with a linear address.
>
>My best guess is that the ADDR32 prefix is just ignored somewhere.
>As a linear address doesn't make any sense in 16bit real mode.

But it's in 32-bit PM at that point.

From: Eric W B. <ebi...@ln...> - 2002-07-23 07:07:05
ke...@us... (Ken Yap) writes:

> >> > _execaddr is filled in with a linear address.
> >
> >My best guess is that the ADDR32 prefix is just ignored somewhere.
> >As a linear address doesn't make any sense in 16bit real mode.
>
> But it's in 32-bit PM at that point.

Hmm.  The LJMPI sure looks like it is in 32PM to me....

xstart:
        pushl   %ebp
        movl    %esp,%ebp
        pushl   %ebx
        pushl   %esi
        pushl   %edi
        movl    8(%ebp),%eax
        movl    %eax,_execaddr
        movl    12(%ebp),%ebx
        movl    16(%ebp),%ecx   /* bootp record (32bit pointer) */
        shll    $12,%ecx        /* convert to segment:offset form */
        shrw    $12,%cx
        call    _prot_to_real
        .code16
        pushl   %ecx            /* bootp record */
        pushl   %ebx            /* file header */
        movl    $((RELOC<<12)+(1f-RELOC)),%eax
        pushl   %eax
        ADDR32 LJMPI(_execaddr-_start)
1:
        addw    $8,%sp  /* XXX or is this 10 in case of a 16bit "ret" */
        DATA32 call _real_to_prot
        .code32
        popl    %edi
        popl    %esi
        popl    %ebx
        popl    %ebp
        ret

_execaddr:
        .long   0

Eric

From: <ke...@us...> - 2002-07-23 07:22:55
>> But it's in 32-bit PM at that point.
>
>Hmm.  The LJMPI sure looks like it is in 32PM to me....
>xstart:
>        pushl   %ebp
>        movl    %esp,%ebp
>        pushl   %ebx
>        pushl   %esi
>        pushl   %edi
>        movl    8(%ebp),%eax
>        movl    %eax,_execaddr
>        movl    12(%ebp),%ebx
>        movl    16(%ebp),%ecx   /* bootp record (32bit pointer) */
>        shll    $12,%ecx        /* convert to segment:offset form */
>        shrw    $12,%cx
>        call    _prot_to_real
>        .code16
>        pushl   %ecx            /* bootp record */
>        pushl   %ebx            /* file header */
>        movl    $((RELOC<<12)+(1f-RELOC)),%eax
>        pushl   %eax
>        ADDR32 LJMPI(_execaddr-_start)
>1:
>        addw    $8,%sp  /* XXX or is this 10 in case of a 16bit "ret" */
>        DATA32 call _real_to_prot
>        .code32

No, you were right, it's in 16-bit RM at that point and so interprets
the 32-bit word as segment:offset. For a moment I was worried that I was
generating non-compliant tagged images in mknbi.

I don't know what ADDR32 does in .code16 but nothing harmful apparently.
Maybe it's needed for the indirection address.

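For readers following the address arithmetic, the `shll $12,%ecx` / `shrw $12,%cx` pair in the quoted xstart can be mirrored in C. This is only a sketch: the function names are hypothetical and not part of Etherboot, and it assumes the pointer is a linear address below 1MB, which is what the real-mode segment:offset form can represent without wrapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring "shll $12,%ecx ; shrw $12,%cx".
 * Packs a linear address below 1MB into one 32-bit word with the
 * segment in the high 16 bits and the offset in the low 16 bits. */
uint32_t linear_to_segoff(uint32_t linear)
{
    assert(linear < 0x100000u);            /* assumption: address is below 1MB */

    uint32_t v = linear << 12;             /* shll $12,%ecx */
    uint16_t low = (uint16_t)v;            /* %cx is the low half of %ecx */
    low = (uint16_t)(low >> 12);           /* shrw $12,%cx */
    return (v & 0xFFFF0000u) | low;
}

/* Real-mode address translation: segment * 16 + offset. */
uint32_t segoff_to_linear(uint32_t segoff)
{
    return ((segoff >> 16) << 4) + (segoff & 0xFFFFu);
}
```

With this packing the offset is always in the range 0..15, so the segment carries nearly all of the address; `segoff_to_linear(linear_to_segoff(x))` gives back `x` for any `x` below 1MB.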
From: <ke...@us...> - 2002-07-23 08:12:10
>>        ADDR32 LJMPI(_execaddr-_start)

Incidentally if you feel motivated to convert this bit of data
segment-modifying code to a push onto the stack followed by a ret,
please do.

From: Eric W B. <ebi...@ln...> - 2002-07-23 08:56:42
ke...@us... (Ken Yap) writes:

> >>        ADDR32 LJMPI(_execaddr-_start)
>
> Incidentally if you feel motivated to convert this bit of data
> segment-modifying code to a push onto the stack followed by a ret,
> please do.

I will or something similar, accessing the 32bit data segment
from 16bit is a fairly bad thing to do.

After staring at the problem of relocating the part of etherboot that
must run in 16bit real mode I have finally come up with a clean design
that will be easy to convert to.

- The 16bit code will normally live above 1MB.
- The 16bit code will be pushed on the 16bit stack just before use.
- _prot_to_real && _real_to_prot will be replaced by _real_call,
  which preserves all registers except %esp and %eax.
  _real_call pushes the 16bit code onto the real mode stack.

An example from my test code.

        pushl   $ 1f
        pushl   $ 2f - 1f
        call    _real_call
1:      .code16
        movl    $0x88880000, %ebp
        shll    $16, %ebx
        shll    $16, %ecx
        shll    $16, %edx
        shll    $16, %esi
        shll    $16, %edi
        ret
        .code32
2:

Eric

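The "push the code onto the stack just before use" idea can be modelled in C as copying a fragment to just below the current top of a downward-growing buffer. This is an illustration of the concept only, not how _real_call is written: `push_fragment` and its parameters are hypothetical names, and the real routine also has to switch modes and transfer control to the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of pushing a code fragment onto a stack.
 * `stack` is the base of the buffer, `top` is the index of the current
 * top (the stack grows toward index 0).  The two arguments `code` and
 * `len` play the role of the "pushl $ 1f" and "pushl $ 2f - 1f" pair.
 * Returns the new top, which is also where the copy starts. */
size_t push_fragment(uint8_t *stack, size_t top,
                     const uint8_t *code, size_t len)
{
    assert(len <= top);                /* refuse to overflow the stack */
    top -= len;                        /* reserve room below the old top */
    memcpy(stack + top, code, len);    /* copy the fragment into place */
    return top;
}
```

Because the fragment only lives on the stack for the duration of the call, nothing has to stay resident below 1MB between calls; the cost is that each call needs `len` bytes of free stack.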
From: Eric W B. <ebi...@ln...> - 2002-07-23 11:40:10
Eric W Biederman <ebi...@ln...> writes:
> ke...@us... (Ken Yap) writes:
>
> > >> ADDR32 LJMPI(_execaddr-_start)
> >
> > Incidentally if you feel motivated to convert this bit of data
> > segment-modifying code to a push onto the stack followed by a ret,
> > please do.
>
> I will or something similar, accessing the 32bit data segment
> from 16bit is a fairly bad thing to do.

O.k. I have all of the routines converted.

xstart16 still needs a little work to get the bootp data structure below
1MB.  But otherwise it should work fine.

The open issues are:
- How to copy/move bootp low for 16bit code.
- How to properly allocate the real mode stack.
  I am thinking I can just use the initial stack and push
  the code on there.

For reference the trampoline _real_call pushes on the real mode stack
is just 80 bytes.  80 bytes shouldn't overflow even a small stack.
The worst case is meme820, which allocates a 280 byte buffer for the
memory map.

Eric

        .globl xstart16
xstart16:
        pushl   %ebp
        movl    %esp,%ebp
        pushl   %ebx
        pushl   %esi
        pushl   %edi
        movl    8(%ebp),%edx
        movl    12(%ebp),%ebx
        /* FIXME handle the bootp record */
        movl    16(%ebp),%ecx   /* bootp record (32bit pointer) */
        shll    $12,%ecx        /* convert to segment:offset form */
        shrw    $12,%cx
        pushl   $ 10f
        pushl   $ 20f - 10f
        call    _real_call
        .section ".text16"
10:     .code16
        popw    %ax             /* get the return ip addr */
        pushl   %ecx            /* bootp record */
        pushl   %ebx            /* file header */
        pushw   %cs             /* Setup the far return address */
        pushw   %ax
        pushl   %edx            /* Setup the far address to call */
        lret                    /* Back into the routine I'm calling */
20:     .code32
        .previous
        popl    %edi
        popl    %esi
        popl    %ebx
        popl    %ebp
        ret
From: <ke...@us...> - 2002-07-23 11:55:21
>xstart16 still needs a little work to get the bootp data structure below
>1MB.  But otherwise it should work fine.
>
>The open issues are:
>- How to copy/move bootp low for 16bit code.

Also the segment descriptor array.

Maybe this ties in with the memory allocator routines. Ask it for a
chunk of memory below 1MB and memcpy the bootp and segment descriptor
structures before calling xstart. You could also ask the memory
allocator for a chunk for use as stack.

I'm not so concerned about the text segment size of Etherboot when
Etherboot is converted to run high. In the current version the main
concern is the 48kB limitation. I think there are few machines that
can't spare a couple of hundred kB from the top of memory during the
image loading phase. One is not supposed to load memory to the brim
anyway, Linux will need room to expand when it gains control.

From: Eric W B. <ebi...@ln...> - 2002-07-23 12:30:36
ke...@us... (Ken Yap) writes:

> >xstart16 still needs a little work to get the bootp data structure below
> >1MB.  But otherwise it should work fine.
> >
> >The open issues are:
> >- How to copy/move bootp low for 16bit code.
>
> Also the segment descriptor array.

Hmm.  I don't quite follow you there.  I only need 6 bytes for the
global descriptor pointer, and then I can use a global descriptor
table > 1MB.  That part is working fine.

> Maybe this ties in with the memory allocator routines. Ask it for a
> chunk of memory below 1MB and memcpy the bootp and segment descriptor
> structures before calling xstart. You could also ask the memory
> allocator for a chunk for use as stack.

Probably the fun part is keeping track of which areas have been allocated
to the partially loaded image.

> I'm not so concerned about the text segment size of Etherboot when
> Etherboot is converted to run high. In the current version the main
> concern is the 48kB limitation. I think there are few machines that
> can't spare a couple of hundred kB from the top of memory during the
> image loading phase. One is not supposed to load memory to the brim
> anyway, Linux will need room to expand when it gains control.

Except when you have a ramdisk.  But a sane implementation will load
the ramdisk low, and then push it up higher if required.  Though I seem
to like the insane varieties.

I agree the uncompressed size is not a big deal.  The compressed size
continues to concern me.  Even with LinuxBIOS I am loading it in
a rom chip.  A much bigger one than etherboot traditionally lives in,
true, but still a rom.  And the primary reason I am working with
etherboot is that it is small, and my other alternatives are not.  So
I don't want to mess that up.

As far as stack size goes, I only care because memory < 1MB is a
limited resource in high demand.  What clicked for me a little while ago
is that I don't have to keep anything permanently below 1MB.  I can just
push all of my code on the real mode stack.

And I worry a little that if I am using someone else's stack I will
overflow it.  The code is tight enough now that overflowing stacks is
no longer an issue.  Though it was quite fun reducing the footprint from
300 bytes to 80 :)  Mostly I did it to combat my small observation that
whenever I touch etherboot in a nontrivial way the core gets bigger.

O.k. I am so tired I am rambling.  After I get some sleep I will do a
little cleanup and see what parts of my code I can check in.

Eric

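The "6 bytes for the global descriptor pointer" is the operand that the x86 `lgdt` instruction reads outside 64-bit mode: a 16-bit limit followed by a 32-bit linear base. A minimal C sketch follows, assuming a GCC-compatible compiler for the `packed` attribute; the struct and function names are hypothetical, not Etherboot's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of the lgdt operand.  Only this small structure
 * has to be reachable from 16-bit code; `base` is a linear address, so
 * the descriptor table itself can sit above 1MB. */
struct gdt_pointer {
    uint16_t limit;   /* size of the table in bytes, minus one */
    uint32_t base;    /* linear address of the first descriptor */
} __attribute__((packed));

static_assert(sizeof(struct gdt_pointer) == 6,
              "lgdt operand must be exactly 6 bytes");

/* Builds the pointer for a table of `entries` 8-byte descriptors. */
struct gdt_pointer make_gdt_pointer(uint32_t base, unsigned entries)
{
    assert(entries >= 1 && entries <= 8192);   /* limit must fit in 16 bits */

    struct gdt_pointer p;
    p.limit = (uint16_t)(entries * 8u - 1u);
    p.base = base;
    return p;
}
```

Without the `packed` attribute a compiler would normally pad the structure to 8 bytes, which is why the size is checked at compile time.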
From: Anselm M. H. <an...@ho...> - 2002-07-24 21:18:55
Hello Eric, list,

EWB> Probably the fun part is keeping track of which areas have been allocated
EWB> to the partially loaded image.

>> I'm not so concerned about the text segment size of Etherboot when
>> Etherboot is converted to run high. In the current version the main
>> concern is the 48kB limitation. I think there are few machines that
>> can't spare a couple of hundred kB from the top of memory during the
>> image loading phase. One is not supposed to load memory to the brim
>> anyway, Linux will need room to expand when it gains control.

EWB> Except when you have a ramdisk.  But a sane implementation will load
EWB> the ramdisk low, and then push it up higher if required.  Though I seem
EWB> to like the insane varieties.

As I see it, Eric, you used the last room for just beyond land's end
for SLAM downloaded data stuff. BTW, I copied that idea, so the last
[image-block-num]*[image-block-size] bytes of memory are used for that
purpose in TFTM too. No problem to spare the last say 300kB, but one
must not forget!

Best regards,
Anselm
mailto:an...@ho...

From: Eric W B. <ebi...@ln...> - 2002-07-25 03:56:35
Anselm Martin Hoffmeister <an...@ho...> writes:

> Hello Eric, list,
>
> EWB> Probably the fun part is keeping track of which areas have been allocated
> EWB> to the partially loaded image.
>
> >> I'm not so concerned about the text segment size of Etherboot when
> >> Etherboot is converted to run high. In the current version the main
> >> concern is the 48kB limitation. I think there are few machines that
> >> can't spare a couple of hundred kB from the top of memory during the
> >> image loading phase. One is not supposed to load memory to the brim
> >> anyway, Linux will need room to expand when it gains control.
>
> EWB> Except when you have a ramdisk.  But a sane implementation will load
> EWB> the ramdisk low, and then push it up higher if required.  Though I seem
> EWB> to like the insane varieties.
>
> As I see it, Eric, you used the last room for just beyond land's end
> for SLAM downloaded data stuff. BTW, I copied that idea, so the last
> [image-block-num]*[image-block-size] bytes of memory are used for that
> purpose in TFTM too. No problem to spare the last say 300kB, but one
> must not forget!

Agreed.  That has been a FIXME in my code for a while.  Getting
etherboot to relocate itself was the interesting part of this adventure.
When I get the code checked in there will be a lot of small fixes
that need to propagate through etherboot to fix this.

Unless the multicast file always loads into a fixed offset in ram
there is a tradeoff between scalability and memory usage.  And I
really don't see a practical way to fix that.

In the normal case of increasing offsets going to increasingly higher
addresses in ram the memory overhead actually becomes a constant.
This is because you can overwrite memory that you have already loaded
to its final location.

Eric

From: <ke...@us...> - 2002-07-23 12:47:46
>> Also the segment descriptor array.
>
>Hmm.  I don't quite follow you there.  I only need 6 bytes for the
>global descriptor pointer, and then I can use a global descriptor
>table > 1MB.  That part is working fine.

The image is passed two arguments, a pointer to the tagged image header,
and a pointer to the bootp structure. The image header is needed for the
trampoline segment first-{linux,*dos}.S (in mknbi) to locate the
parameter string, among other things. You could drop support for
first-linux.S, but it would still be needed for first-*dos.S.

>I agree the uncompressed size is not a big deal.  The compressed size
>continues to concern me.  Even with LinuxBIOS I am loading it in
>a rom chip.  A much bigger one than etherboot traditionally lives in
>true, but still a rom.  And the primary reason I am working with
>etherboot is that it is small, and my other alternatives are not.  So
>I don't want to mess that up.

Don't worry, I'm a memory skinflint too. :-)

From: Eric W B. <ebi...@ln...> - 2002-07-23 05:49:40
ke...@us... (Ken Yap) writes:
> >Staring at the assembly in xstart there is the following code
> >snippet:
> > pushl %eax
> > ADDR32 LJMPI(_execaddr-_start16)
> >1:
> >...
> >_execaddr:
> > .long 0
> >
> >What I don't get is how this snippet of code successfully jumps
> >to an application. ADDR32 should ljmpi should take a m16:32.
> >Where the offset which comes first is 32 bits, and the segment
> >that comes after is 16bits. I just don't see how that works when
> >we pass it a m16:16 in execaddr.
>
> _execaddr is filled in with a linear address.

For 32 bit code yes.  But for 16bit code it is clearly filled with
a segment:offset to jump to.

From the nbi spec:

           |                     |
           | Initial Magic No.   | 4 bytes
           +---------------------+
           |                     |
           | Flags and length    | double word
           +---------------------+
           |                     |
           | Location Address    | double word in ds:bx format
           +---------------------+
           |                     |
           | Execute Address     | double word in cs:ip format
           +---------------------+
______________________________________________________________________

And Execute Address is the ultimate source of the address.

Eric
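
As a rough illustration of the header excerpt above, the Execute Address can be pulled out of the first 16 bytes of an image and split into its cs and ip halves. The sketch below makes several assumptions that the excerpt does not state outright: fields are little-endian, they sit back to back so that Execute Address starts at byte 12, and the double word has ip in its low 16 bits and cs in its high 16 bits (the same layout a real-mode far jump through memory expects). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed byte offset of Execute Address: three 4-byte fields precede it. */
enum { NBI_EXECUTE_OFFSET = 12 };

/* A 16:16 far pointer as stored in memory: offset first, then segment. */
struct far_ptr16 {
    uint16_t offset;    /* ip */
    uint16_t segment;   /* cs */
};

/* Reads a little-endian 32-bit value independent of host byte order. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Extracts cs:ip from the first 16 bytes of a tagged image header. */
struct far_ptr16 nbi_execute_address(const uint8_t *header)
{
    assert(header != NULL);

    uint32_t raw = read_le32(header + NBI_EXECUTE_OFFSET);
    struct far_ptr16 entry;
    entry.offset = (uint16_t)(raw & 0xFFFFu);
    entry.segment = (uint16_t)(raw >> 16);
    return entry;
}
```

Seen this way, copying the double word unchanged into _execaddr already yields a valid m16:16 operand for a far jump in 16-bit code, with no conversion step needed.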
From: Eric W B. <ebi...@ln...> - 2002-07-23 06:11:11
Eric W Biederman <ebi...@ln...> writes:

> ke...@us... (Ken Yap) writes:
>
> > >Staring at the assembly in xstart there is the following code
> > >snippet:
> > >        pushl   %eax
> > >        ADDR32 LJMPI(_execaddr-_start16)
> > >1:
> > >...
> > >_execaddr:
> > >        .long   0
> > >
> > >What I don't get is how this snippet of code successfully jumps
> > >to an application.  ADDR32 should ljmpi should take a m16:32.
> > >Where the offset which comes first is 32 bits, and the segment
> > >that comes after is 16bits.  I just don't see how that works when
> > >we pass it a m16:16 in execaddr.
> >
> > _execaddr is filled in with a linear address.

My best guess is that the ADDR32 prefix is just ignored somewhere.
As a linear address doesn't make any sense in 16bit real mode.

Eric