Thread: Re: [Etherboot-developers] Separating out the x86 BIOS calls...
From: <ke...@us...> - 2002-07-17 05:43:39
|
> Regarding the loader, I am starting to separate out the decompressor
> from the rest of the code, so I can use the decompressor when booted
> under LinuxBIOS. Saving 6 out of 17k is pretty significant in my
> book. What I am aiming at is an inplace (or at least in bss)
> decompressor that runs in 32bit mode.

There are some limits of the de/compression code; Markus can explain those.

> If that works we can permanently move the decompressor out of loader.S
> and possibly make that file readable, instead of #ifdef spaghetti.

Bear in mind that the decompressor isn't really part of the running Etherboot image. It's the first part of the ROM image, and the Etherboot image is appended onto that. When the ROM gains control, it does a "self-extract", so to speak.
|
From: Markus G. <ma...@gu...> - 2002-07-17 06:33:55
|
Ken Yap wrote:
> There are some limits of the de/compression code, Markus can explain
> those.

The decompressor is based on the code in contrib/compressor/lzhuf.c, but has been aggressively optimized for size. If you want to make any changes to the assembly code, you should probably first read the C source and try to understand it.

The decompressor currently runs in real mode only. As such, it uses segment registers and cannot extract more than 64kB (there probably is a limit even before 64kB is reached, but I don't remember all the details). As we know that we are never going to deal with more than 64kB of data, we can also tell that the pattern frequency can never get bigger than 2^16 (in fact, it'll stay considerably smaller than that). Thus, I stripped out the reconst() function, which would never get called. If you want to deal with larger images, you probably have to implement it, though.

Also, I managed to shrink the code size by compressing all the preinitialized tables (using some kind of run length encoding), and I stripped off a few more bytes by ordering these tables so that once I have initialized one table, my registers already hold the values that I need for initializing the next one.

Due to the small register file, there always is considerable register pressure when writing code for the x86. I managed to keep almost all data in registers, but this required me to split some of the registers into 4bit nybbles. You might need to keep this in mind if you have to make changes to the code.

Markus

--
Markus Gutschke
3637 Fillmore Street #106
San Francisco, CA 94123-1600
+1-415-567-8449
ma...@gu...
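For readers who have not looked at loader.S, the two size tricks Markus describes can be sketched in C. This is only an illustration under stated assumptions: the message says "some kind of run length encoding" without giving the format, so the (count, value) pair layout below is hypothetical, and the names rle_expand, pack_nybbles, high_nybble and low_nybble are invented for this sketch rather than taken from Etherboot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expand a run-length encoded table.
 * ASSUMPTION: runs are stored as (count, value) byte pairs. The real
 * encoding used in loader.S is not described in the thread.
 * Returns the number of bytes written, or 0 if `out` is too small.
 */
size_t rle_expand(const uint8_t *runs, size_t n_pairs,
                  uint8_t *out, size_t out_cap)
{
    size_t written = 0;

    for (size_t i = 0; i < n_pairs; i++) {
        uint8_t count = runs[2 * i];
        uint8_t value = runs[2 * i + 1];

        if (count > out_cap - written)
            return 0; /* the run would overflow the output buffer */

        for (uint8_t k = 0; k < count; k++)
            out[written++] = value;
    }
    return written;
}

/*
 * Keep two 4-bit values in one byte, the way a single 8-bit register
 * can be shared between two small counters.
 */
uint8_t pack_nybbles(uint8_t hi, uint8_t lo)
{
    assert(hi <= 0xF && lo <= 0xF);
    return (uint8_t)((hi << 4) | lo);
}

uint8_t high_nybble(uint8_t packed)
{
    return (uint8_t)(packed >> 4);
}

uint8_t low_nybble(uint8_t packed)
{
    return (uint8_t)(packed & 0x0F);
}
```

With this hypothetical format, the four bytes {3, 7, 2, 9} expand to the five-byte table 7 7 7 9 9. The nybble helpers show why changes to the assembly need care: updating one half of a shared register must leave the other half intact.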
|
From: Eric W B. <ebi...@ln...> - 2002-07-17 07:25:24
|
Markus Gutschke <ma...@gu...> writes:

> Ken Yap wrote:
> > There are some limits of the de/compression code, Markus can explain
> > those.
>
> The decompressor is based on the code in contrib/compressor/lzhuf.c,
> but has been aggressively optimized for size. If you want to make any
> changes to the assembly code, you should probably first read the C
> source and try to understand it.

That is where I started, though I was after more of a feel for the code than actual understanding.

> The decompressor currently runs in real mode only. As such, it uses
> segment registers and cannot extract more than at most 64kB (there
> probably is a limit even before 64kB is reached, but I don't remember
> all the details). As we know that we are never going to deal with more
> than 64kB of data, we can also tell that the pattern frequency can
> never get bigger than 2^16 (in fact, it'll stay considerably smaller
> than that). Thus, I stripped out the reconst() function which would
> never get called. If you want to deal with larger images, you probably
> have to implement it, though.

When I start dealing with larger images I will look. At present my initial goal is to get it working for the same set of images as the existing code, just running in protected mode. Since the addresses are flat, I don't need the segment registers.

> Also, I managed to shrink the code size, by compressing all the
> preinitialized tables (using some kind of run length encoding); and I
> stripped off a few more bytes by ordering these tables in a way so
> that once I have initialized one table my registers already have the
> right values that I will need for initializing the next one.
>
> Due to the small register file, there always is considerable register
> pressure when writing code for the x86. I managed to keep almost all
> data in registers, but this required me to split some of the registers
> into 4bit nybbles. You might need to keep this in mind if you have to
> make changes to the code.

I will. On a first pass, the only changes needed were to ensure the upper halves of the 32bit registers were 0 in the appropriate locations, because in protected mode you can't index by (%si) or (%bx)...

As for the register pressure on x86, I am very familiar with it, having written several thousand lines of assembly with the purpose of enabling RAM on various chipsets. And there I don't have memory, so life is much more interesting.

Thanks for the pointers. They ring true with my reading of the code, so I can't be too far off. The challenging part is that I have to redo how the BSS section is allocated. Addresses simply starting at 0 aren't very useful.

It is refreshing to know that some of the more gratuitous changes from the C code were size optimizations, both because that has been done and because there was a reason for them :)

Eric
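The two porting issues Eric mentions (stale upper register halves, and table addresses that used to start at offset 0 within a segment) can be modelled in a few lines of C. This is a rough sketch, not his actual change: flat_address and bss_base are hypothetical names, and masking with 0xFFFF stands in for clearing the upper 16 bits of a 32bit register before using it as an index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of a flat-mode table access.
 *
 * In real mode the code could index with a 16bit register relative to a
 * segment, so table offsets could simply start at 0. In flat 32bit mode
 * the full register takes part in the address, so its upper half must
 * be zero, and the workspace needs an explicit base address.
 *
 * `reg` models a 32bit register whose low 16 bits hold the index and
 * whose upper 16 bits may contain stale data. `bss_base` is the
 * (hypothetical) start of the decompressor workspace.
 */
uint8_t *flat_address(uint8_t *bss_base, uint32_t reg)
{
    assert(bss_base != NULL);

    /* Equivalent of zeroing the upper half of the register. */
    uint32_t index = reg & 0xFFFFu;

    return bss_base + index;
}
```

Without the mask, a register holding 0xDEAD0010 would address a location far outside a 64kB workspace instead of offset 0x10 within it.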
|
From: Eric W B. <ebi...@ln...> - 2002-07-19 17:37:38
|
Gak. I've been doing this too long. I converted the code to 32bit assembly and it worked on my very first try.

Eric
|
From: Markus G. <ma...@gu...> - 2002-07-19 18:53:01
|
Eric W Biederman wrote:
> Gak. I've been doing this to long. I converted the code to 32bit assembly
> and it worked on my very first try.

I guess it is time for you to take some vacation ;-) How about coming for LinuxExpo...

Markus
|
From: Eric W B. <ebi...@ln...> - 2002-07-19 19:14:45
|
Markus Gutschke <ma...@gu...> writes:

> Eric W Biederman wrote:
> > Gak. I've been doing this to long. I converted the code to 32bit assembly
> > and it worked on my very first try.
>
> I guess, it is time for you to take some vacation ;-) How about coming for
> LinuxExpo...

Unfortunately I believe I have a scheduling conflict, something about testing Etherboot on 1000 nodes simultaneously. But other than that it would be fun.

Eric