etherboot-developers Mailing List for Etherboot (Page 234)
Brought to you by:
marty_connor,
stefanhajnoczi
|
From: <ke...@us...> - 2002-08-31 00:34:18
|
>The final issue is how to handle uninitialized areas of an image, the
>bss parts of a segment. For ELF it is required that they be zeroed,
>and the zeroing is cheap and it increases reproducibility of behavior.
>For tagged images we do all of the zeroing before we load anything to
>ram so the zeroing will not cause problems. If it is felt that the
>zeroing is a problem or that the uninitialized memory length check is
>a problem we can simply not pass that information to prep_segment.
As I said, what to do with the gap between the image length and the
memory length wasn't considered in the NBI spec. You are placing an
interpretation on it that wasn't in its design. I may simply make them
the same in future versions of mknbi, but Etherboot has to work with
existing NBI images, which are compliant.
If you feel that strongly about zeroing RAM for NBI you can make it a
compile option. If a loaded program depends on undefined areas being
zero, I consider that a bug, and zeroing it masks the bug.
|
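Editor's sketch of the "compile option" suggested above. This is not Etherboot's prep_segment(); the function name, the NBI_ZERO_GAP macro and the byte-array memory model are all assumptions made for illustration. The range check always runs, and zeroing the gap between image length and memory length only happens when the hypothetical flag is set at build time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical build flag (not a real Etherboot option): 1 zeroes the gap. */
#ifndef NBI_ZERO_GAP
#define NBI_ZERO_GAP 0
#endif

/*
 * Illustrative stand-in for a segment preparation step.
 * mem/memsize describe a window of RAM, off is the segment's offset in it.
 * Returns 0 if the segment fits, -1 otherwise.
 */
static int prep_segment_sketch(unsigned char *mem, size_t memsize,
                               size_t off, size_t imglen, size_t memlen)
{
    assert(mem != NULL);

    /* Reject segments that claim less memory than they load, or that
     * do not fit inside the available window. */
    if (imglen > memlen || off > memsize || memlen > memsize - off)
        return -1;

#if NBI_ZERO_GAP
    /* Optional: clear the part of the segment not loaded from the image. */
    memset(mem + off + imglen, 0, memlen - imglen);
#endif
    return 0;
}
```

Built without -DNBI_ZERO_GAP=1, the gap is left untouched, which matches the position that existing NBI images must keep working; built with it, the gap is cleared after the bounds check.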
|
From: Eric W B. <ebi...@ln...> - 2002-08-30 23:21:41
|
ke...@us... (Ken Yap) writes:
> >In this case the first 512 bytes was being placed at 0x92200. One segment
> >of the NBI file had image from 0x90200 to 0x91600 and BSS from 91600 to
> >9A000, so the BSS overlapped the 0x92200 region reserved for the
> >directory.
>
> I think I understand what happened. There is a misapprehension here
> that the difference between the image length and the memory length is
> the BSS. There is no preassigned meaning such as BSS to this gap or to
> any area.
The meaning of BSS is data in an area of the image that is not loaded
from a file.
> The loaded areas are just areas, not necessarily code, could
> be data, disk images, ramdisk, whatever. Etherboot is just a memory
> loader + a start address, not a program loader like an OS. It's not even
> a Linux specific loader. Etherboot does not zero any areas. If in the
> rare case that a segment is to be zeroed before use, it should be arranged
> by running code. Disk loaders don't zero memory before loading and
> neither should Etherboot.
Disk loaders for kernels do not represent the industry best practices.
Loaders for executables in operating systems represent the industry best
practices. And under an OS memory is zeroed.
The practical problem we have run into with etherboot is people
attempting to load images into areas where there is no ram, or the ram
is reserved by a disk on chip. I have seen multiple occasions where bugs
have popped up on the etherboot-users list where checking the memory
areas a tagged image uses would have caught that. So I am convinced
calling prep_segment on tagged images is the right thing to do.
I do not really care about checking for overlapping areas (besides
etherboot). And as we have a way to make all existing images work there
is no reason to implement checking at this time. Having overlapping
images feels like playing with fire to me. But I see no need to force
this view on others.
The final issue is how to handle uninitialized areas of an image, the
bss parts of a segment. For ELF it is required that they be zeroed, and
the zeroing is cheap and it increases reproducibility of behavior. For
tagged images we do all of the zeroing before we load anything to ram so
the zeroing will not cause problems. If it is felt that the zeroing is a
problem or that the uninitialized memory length check is a problem we
can simply not pass that information to prep_segment.
Eric
|
|
From: Michael B. <mb...@fe...> - 2002-08-30 21:58:54
|
On Sat, 31 Aug 2002, Ken Yap wrote:
> >In this case the first 512 bytes was being placed at 0x92200. One segment
> >of the NBI file had image from 0x90200 to 0x91600 and BSS from 91600 to
> >9A000, so the BSS overlapped the 0x92200 region reserved for the
> >directory.
> I think I understand what happened. There is a misapprehension here
> that the difference between the image length and the memory length is
> the BSS. There is no preassigned meaning such as BSS to this gap or to
> any area. The loaded areas are just areas, not necessarily code, could
> be data, disk images, ramdisk, whatever. Etherboot is just a memory
> loader + a start address, not a program loader like an OS. It's not even
> a Linux specific loader. Etherboot does not zero any areas. If in the
> rare case that a segment is to be zeroed before use, it should be arranged
> by running code. Disk loaders don't zero memory before loading and
> neither should Etherboot.
OK, so is the consensus that the gap area should never be zeroed?
Michael Brown
http://www.fensystems.co.uk
|
|
From: Timothy L. <tl...@ro...> - 2002-08-30 21:06:09
|
> Having a simple assembly routine that dumps to the video display works
> as well. The advantage of a null modem cable is that you capture
> all of the output on another machine, with scroll back and other
> nice amenities. And if you already have 2 machines it is quite
> handy.
I have the cable. I will start trying to figure out the linux side.
When I know what I am doing, I will let you know.
Tim
|
|
From: <ke...@us...> - 2002-08-30 17:39:19
|
>In this case the first 512 bytes was being placed at 0x92200. One segment
>of the NBI file had image from 0x90200 to 0x91600 and BSS from 91600 to
>9A000, so the BSS overlapped the 0x92200 region reserved for the
>directory.
I think I understand what happened. There is a misapprehension here
that the difference between the image length and the memory length is
the BSS. There is no preassigned meaning such as BSS to this gap or to
any area. The loaded areas are just areas, not necessarily code, could
be data, disk images, ramdisk, whatever. Etherboot is just a memory
loader + a start address, not a program loader like an OS. It's not even
a Linux specific loader. Etherboot does not zero any areas. If in the
rare case that a segment is to be zeroed before use, it should be arranged
by running code. Disk loaders don't zero memory before loading and
neither should Etherboot.
True, the declared memory areas in the NBI image overlap, but this is a
potential overlap; there is no actual overlap in loading. In the next
version of mknbi I shall restrict the memory length of the setup segment
to 8kB. The figure of 40 or so k comes from the maximum allowed for
setup.S in the Linux kernel. There's also an "overlap" between the
kernel and the initrd because I arbitrarily assigned a memory length of
16MB to the kernel, but the initrd is loaded right after it.
The memory length is used in exactly one place in 5.0.7's osloader.c, to
compute the start address where the segment is specified to follow the
previous one. In practice, only absolute segments have been used in
mknbi and the other three types have not turned out to be of any use
yet. So only absolute segments were supported in ELF.
I think that osloader should do overlap checking only to protect
Etherboot itself. There may be legitimate reasons why someone may want
overlapping segments (real overlap, from the image length). Etherboot
should butt out and let the user hang himself if he wants to.
It's also good to exit with an error rather than loop on a truncated
image, but beyond protecting itself, Etherboot should let the user do
silly things to memory if the user wishes to.
I know Jamie Honan personally and I am sure that he was not thinking of
BSS or anything like that when he wrote the NBI spec for the first
netboot (written in MASM and supporting only the WD8003). Most likely he
had a vague notion that the memory length would serve to delimit areas
without having to load all of it from the wire.
|
|
From: Timothy L. <tl...@ro...> - 2002-08-30 16:10:02
|
> OK, it works. I can now boot a full Linux kernel from EB 5.1.x, using the
> same image as for EB 5.0.x. Have checked in two changes to osloader.c,
> both of which I'm not 100% sure about:
I can confirm that the current cvs as of about 1300 Atlantic time fixed
one of my problems. The Pentium is now able to boot a ltsp kernel, start
X, and seems to work fine. I have not tried the 486 yet, but I am
assuming that I will need to address that with the serial debugging.
Tim
|
|
From: <ke...@us...> - 2002-08-30 14:25:34
|
>so the problem looks to be that segment 4 (Kernel setup) is given a memory
>length of 40448 when it should be 8192.
Error in 5.1.2 interpreting the NBI spec. The original intention was
that the memory length would specify how large the segment could grow
to, but currently and for the foreseeable future it conveys no useful
information and should be ignored. What should be loaded is just the
image length. No need to prezero the area either, the spec doesn't
require that.
|
|
From: Michael B. <mb...@fe...> - 2002-08-30 14:15:12
|
On Fri, 30 Aug 2002, Ken Yap wrote:
> >In this case the first 512 bytes was being placed at 0x92200. One segment
> >of the NBI file had image from 0x90200 to 0x91600 and BSS from 91600 to
> >9A000, so the BSS overlapped the 0x92200 region reserved for the
> >directory.
> >This was generated with mknbi-1.2.7.
> That's not where mknbi places them. The Etherboot image goes to 0x94000
> and the bss follows the text and data.
> Please do a mknbi on the image to show what addresses and sizes the
> segments have.
> The load map is:
> 0x07C00-0x07FFF 0.5 kB floppy boot sector if loaded from floppy
> 0x0F???-0x0FFFF ? kB large Etherboot data buffers (deprecated)
> 0x10000-0x8FFFF 512.0 kB kernel (from tagged image)
> 0x90000-0x901FF 0.5 kB Linux floppy boot sector (from Linux image)
> 0x90200-0x921FF 8.0 kB kernel setup (from Linux image)
> 0x92200-0x923FF 0.5 kB tagged image header ("directory")
> 0x92400-0x927FF 1.0 kB kernel parameters (generated by mknbi)
> 0x92800-0x93FFF 6.0 kB this program (generated by mknbi)
> 0x94000-0x9FFFF 48.0 kB Etherboot (top few kB may be used by BIOS)
> Normally Etherboot starts at 0x94000
> 0x100000- kernel (if bzImage) (from tagged image)
> after bzImage kernel ramdisk (optional) (from tagged image)
> moved to below top of memory by this program
> but not higher than 896kB or what the
> limit in setup.S says
disnbi.pl reports:
Type: NBI
Header location: 9220:0000
Start address: 9280:0000
Flags:
Vendor data: mknbi-linux-1.2-7
Segment number 1
Load address: 00092800
Image length: 4608
Memory length: 6144
Position: Absolute
Vendor tag: 16
Segment number 2
Load address: 00092400
Image length: 512
Memory length: 2048
Position: Absolute
Vendor tag: 17
Segment number 3
Load address: 00090000
Image length: 512
Memory length: 512
Position: Absolute
Vendor tag: 18
Segment number 4
Load address: 00090200
Image length: 5120
Memory length: 40448
Position: Absolute
Vendor tag: 19
Segment number 5
Load address: 00100000
Image length: 882176
Memory length: 15728640
Position: Absolute
Vendor tag: 20
Segment number 6
Load address: 001d8000
Image length: 348160
Memory length: 348160
Position: Absolute
Vendor tag: 21
Vendor data: "
Vendor data in hex: 00 00 00 00
which represents, errors in arithmetic notwithstanding:
0x090000-0x0901ff Segment 3 (Linux floppy boot sector)
0x090200-0x0915ff Segment 4 (Kernel setup)
0x091600-0x099fff Segment 4 (empty)
********
0x092400-0x0925ff Segment 2 (Kernel parameters)
0x092600-0x0927ff Segment 2 (empty)
0x092800-0x0939ff Segment 1 ("this program")
0x093a00-0x093fff Segment 1 (empty)
0x100000- Segments 5 and 6
so the problem looks to be that segment 4 (Kernel setup) is given a memory
length of 40448 when it should be 8192.
Michael Brown
http://www.fensystems.co.uk
|
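Editor's sketch of the overlap arithmetic in the message above, using the six segments from the disnbi report and the 512-byte header region at 0x92200 from the load map. The struct and function names are illustrative assumptions, not Etherboot code. It shows that whether segment 4 collides with the header depends entirely on which length is taken as the segment's extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One NBI segment as printed by disnbi (illustrative struct, not Etherboot's). */
struct nbi_seg {
    uint32_t load;    /* load address */
    uint32_t imglen;  /* image length: bytes carried in the file */
    uint32_t memlen;  /* memory length: bytes the segment claims in RAM */
};

/* The six segments from the disnbi report quoted above. */
static const struct nbi_seg disnbi_segs[] = {
    { 0x00092800,   4608,     6144 },
    { 0x00092400,    512,     2048 },
    { 0x00090000,    512,      512 },
    { 0x00090200,   5120,    40448 },
    { 0x00100000, 882176, 15728640 },
    { 0x001d8000, 348160,   348160 },
};

/* Do the half-open ranges [a, a+alen) and [b, b+blen) intersect? */
static int ranges_overlap(uint32_t a, uint32_t alen, uint32_t b, uint32_t blen)
{
    return a < b + blen && b < a + alen;
}

/*
 * Return the 1-based number of the first segment that overlaps the region
 * [hdr, hdr+hdrlen), or 0 if none does. use_memlen selects which length
 * defines the extent of a segment.
 */
static int seg_hitting_header(const struct nbi_seg *segs, size_t n,
                              uint32_t hdr, uint32_t hdrlen, int use_memlen)
{
    size_t i;

    assert(segs != NULL);
    for (i = 0; i < n; i++) {
        uint32_t len = use_memlen ? segs[i].memlen : segs[i].imglen;
        if (ranges_overlap(segs[i].load, len, hdr, hdrlen))
            return (int)i + 1;
    }
    return 0;
}
```

Calling seg_hitting_header(disnbi_segs, 6, 0x92200, 512, 1) reports segment 4, since 0x90200 + 40448 = 0x9A000 lies past the header; with use_memlen set to 0 no segment touches the header, because segment 4's image ends at 0x91600. The same helper shows the kernel/ramdisk pair (segments 5 and 6) overlapping only by memory length.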
|
From: <ke...@us...> - 2002-08-30 13:35:03
|
>In this case the first 512 bytes was being placed at 0x92200. One segment
>of the NBI file had image from 0x90200 to 0x91600 and BSS from 91600 to
>9A000, so the BSS overlapped the 0x92200 region reserved for the
>directory.
>
>This was generated with mknbi-1.2.7.
That's not where mknbi places them. The Etherboot image goes to 0x94000
and the bss follows the text and data.
Please do a mknbi on the image to show what addresses and sizes the
segments have.
The load map is:
0x07C00-0x07FFF 0.5 kB floppy boot sector if loaded from floppy
0x0F???-0x0FFFF ? kB large Etherboot data buffers (deprecated)
0x10000-0x8FFFF 512.0 kB kernel (from tagged image)
0x90000-0x901FF 0.5 kB Linux floppy boot sector (from Linux image)
0x90200-0x921FF 8.0 kB kernel setup (from Linux image)
0x92200-0x923FF 0.5 kB tagged image header ("directory")
0x92400-0x927FF 1.0 kB kernel parameters (generated by mknbi)
0x92800-0x93FFF 6.0 kB this program (generated by mknbi)
0x94000-0x9FFFF 48.0 kB Etherboot (top few kB may be used by BIOS)
Normally Etherboot starts at 0x94000
0x100000- kernel (if bzImage) (from tagged image)
after bzImage kernel ramdisk (optional) (from tagged image)
moved to below top of memory by this program
but not higher than 896kB or what the
limit in setup.S says
|
|
From: Michael B. <mb...@fe...> - 2002-08-30 13:25:38
|
On Fri, 30 Aug 2002, Ken Yap wrote:
> >In particular the first 512 bytes were overlapped by an uninitialized
> >memory region.
> The convention is the first 512 bytes is the directory and must either
> live inside Etherboot's bss or in the area reserved for it at 0x92200.
> It cannot be deposited in low memory as the kernel could go there.
> As long as this directory doesn't get overwritten by following segments
> that's ok.
In this case the first 512 bytes was being placed at 0x92200. One segment
of the NBI file had image from 0x90200 to 0x91600 and BSS from 91600 to
9A000, so the BSS overlapped the 0x92200 region reserved for the
directory.
This was generated with mknbi-1.2.7.
Michael Brown
http://www.fensystems.co.uk
|
|
From: Ronald G M. <rmi...@la...> - 2002-08-30 13:08:21
|
On 29 Aug 2002, Eric W. Biederman wrote:
> > > Does Linux need a special patch to run on the smartcore? Or how does
> > > it calibrate the tsc?
> >
> > You know, I don't know. We just diagnosed the problem and applied this
> > fix, and linux seemed to be happy. I never looked at this. Now you've made
> > me curious :-)
>
> If Linux has something fairly universal for x86 we can use that,
> otherwise I want a config option.
If you mean for etherboot, that timer.c code I sent in has two modes of
operation:
- Plain 'ol TIMER2 mode, which is the default
- The other mode, creating timing by POST to port 0x80, enabled by:
CONFIG_NO_TIMER2
Does that fit the requirement?
ron
|
|
From: <ke...@us...> - 2002-08-30 12:57:02
|
>In particular the first 512 bytes were overlapped by an uninitialized
>memory region.
The convention is the first 512 bytes is the directory and must either
live inside Etherboot's bss or in the area reserved for it at 0x92200.
It cannot be deposited in low memory as the kernel could go there.
As long as this directory doesn't get overwritten by following segments
that's ok.
|
|
From: <ke...@us...> - 2002-08-30 12:45:44
|
>O.k. I will look, but we did have an overlap, and that was the problem.
>So some version of mknbi generated it.
>
>In particular the first 512 bytes were overlapped by an uninitialized
>memory region.
mknbi knows the size of each of the segments and how much space they are
allowed to have; it doesn't blindly copy everything. The only
possibility for overlap is if the kernel is > 576kB, which may not have
been checked in the past, but in that case it won't load at all even in
5.0.
mknbi also assumes that Etherboot is not using the first 576kB. If
Etherboot is run from 0x20000, then a zImage kernel would obviously
overwrite it. A bzImage kernel should be ok.
This is the Perl mknbi I'm talking about. I take no responsibility for
the old mknbi from netboot.
|
|
From: <ebi...@ln...> - 2002-08-30 12:36:00
|
ke...@us... (Ken Yap) writes:
> >O.k. so we have an image with overlapping segments. The tagged image
> >specification does not mention them. As the order of processing
> >the segments is not clearly defined we have an undefined case in
> >the tagged image specification.
>
> No, mknbi doesn't generate overlapping segments. Have a look at the load
> map in first32.c to see where everything should go. Also disnbi will show
> you where the segments want to be.
O.k. I will look, but we did have an overlap, and that was the problem.
So some version of mknbi generated it.
In particular the first 512 bytes were overlapped by an uninitialized
memory region.
Eric
|
|
From: <ebi...@ln...> - 2002-08-30 12:32:18
|
Michael Brown <mb...@fe...> writes:
> OK - I can see only one potential problem with that: if an image is
> corrupted by being incomplete (missing bytes from the end) then Etherboot
> will (I think) lock up in an endless loop with len=0, eof=1 and
> tctx.seglen != 0.
That sounds right. So the question becomes how do we handle a truncated
file cleanly. The cleanest solution looks like clearing eof if at the
top of the file len == 0. (committed).
I'd love to do it the way we handle it for elf and a.out files, except
those are prone to actually executing the truncated files, which is
another kind of problem.
Eric
|
|
From: <ke...@us...> - 2002-08-30 12:27:28
|
>O.k. so we have an image with overlapping segments. The tagged image
>specification does not mention them. As the order of processing
>the segments is not clearly defined we have an undefined case in
>the tagged image specification.
No, mknbi doesn't generate overlapping segments. Have a look at the load
map in first32.c to see where everything should go. Also disnbi will show
you where the segments want to be.
|
|
From: Michael B. <mb...@fe...> - 2002-08-30 11:54:58
|
On 30 Aug 2002, Eric W. Biederman wrote:
> > OK, it works. I can now boot a full Linux kernel from EB 5.1.x, using the
> > same image as for EB 5.0.x. Have checked in two changes to osloader.c,
> > both of which I'm not 100% sure about:
> I have just put in some updated fixes, that solve the problems more
> cleanly (assuming they work.)
Sorry - it broke. I now get
"segment [00000054, 92800054) overlaps etherboot [00020000, 0002FA30)"
Found the problem - sh was being initialised to the start of the NBI
header rather than the start of the segment table. Fixed, checked in,
now works.
> > 2. When load_block() is called with eof=1, os_download() gets called
> > twice: once with the normal data block and once more as
> > os_download(NULL,0,eof). This is how tagged_download() expects to be
> > used.
> > Q: Will this break the other *_download() routines?
> It won't break them because they honor the eof flag, and so the
> code will never be called. So mostly the second call is bloat.
> I have instead changed the while loop in tagged download:
> data += i;
> - } while (len > 0)
> + } while ((len > 0) || eof);
> return 0;
OK - I can see only one potential problem with that: if an image is
corrupted by being incomplete (missing bytes from the end) then Etherboot
will (I think) lock up in an endless loop with len=0, eof=1 and
tctx.seglen != 0.
Michael Brown
http://www.fensystems.co.uk
|
|
From: <ebi...@ln...> - 2002-08-30 11:30:08
|
Michael Brown <mb...@fe...> writes:
> OK, it works. I can now boot a full Linux kernel from EB 5.1.x, using the
> same image as for EB 5.0.x. Have checked in two changes to osloader.c,
> both of which I'm not 100% sure about:
I have just put in some updated fixes, that solve the problems more
cleanly (assuming they work.)
> 1. The BSS is no longer zeroed, in order to avoid vapourising the NBI
> header.
>
> Q: How important is it to zero the BSS? Linux seems to boot fine
> without it, haven't tried anything else.
Zeroing the BSS isn't terribly important (though it is speced for ELF
images). The nice attribute is that it gives predictable behavior.
Bugs that depend on the contents of memory are nasty to track.
The alternative solution is to call prep_segment before ``loading''
the first 512 bytes. In that case the zeroing does not matter. Plus
since we know all of the segments must fit in 512 bytes a more lock
resistant sanity check can be implemented.
> 2. When load_block() is called with eof=1, os_download() gets called
> twice: once with the normal data block and once more as
> os_download(NULL,0,eof). This is how tagged_download() expects to be
> used.
>
> Q: Will this break the other *_download() routines?
It won't break them because they honor the eof flag, and so the
code will never be called. So mostly the second call is bloat.
I have instead changed the while loop in tagged download:
data += i;
- } while (len > 0)
+ } while ((len > 0) || eof);
return 0;
And the tagged_download should honor the eof flag.
The real potential for problems come from other callers (like
the disk code), and differences in requirements between the probe
routines.
This is good debugging by the way, thanks.
Eric
|
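Editor's sketch of the loop condition discussed above, with one way to keep `(len > 0) || eof` from spinning on a truncated image. This is a simplified model, not tagged_download(): the struct, the function names and the return codes are assumptions, and it reports truncation as an error instead of clearing eof as the committed fix is described as doing elsewhere in the thread.

```c
#include <assert.h>
#include <stddef.h>

enum { LOAD_MORE = 0, LOAD_DONE = 1, LOAD_TRUNCATED = -1 };

/* Hypothetical loader state: which segment is being filled and how much
 * of it is still missing. Only byte counts are tracked, no data is copied. */
struct loader {
    const size_t *seglens;  /* image length of each segment */
    size_t nsegs;
    size_t cur;             /* index of the segment being filled */
    size_t remaining;       /* bytes still missing from segment cur */
};

static void loader_init(struct loader *st, const size_t *seglens, size_t nsegs)
{
    st->seglens = seglens;
    st->nsegs = nsegs;
    st->cur = 0;
    st->remaining = nsegs ? seglens[0] : 0;
}

/* Account for one received block of len bytes; eof marks the last block. */
static int loader_feed(struct loader *st, size_t len, int eof)
{
    assert(st != NULL);
    do {
        size_t i;

        /* Step past every segment that is already complete. */
        while (st->cur < st->nsegs && st->remaining == 0) {
            st->cur++;
            if (st->cur < st->nsegs)
                st->remaining = st->seglens[st->cur];
        }
        if (st->cur == st->nsegs)
            return LOAD_DONE;

        if (len == 0) {
            /* Nothing left to consume: at eof this means bytes are
             * missing, so bail out instead of looping forever. */
            if (eof)
                return LOAD_TRUNCATED;
            break;
        }

        i = len < st->remaining ? len : st->remaining;
        st->remaining -= i;
        len -= i;
    } while (len > 0 || eof);

    return LOAD_MORE;
}
```

With segment lengths {5, 3}, feeding 4 bytes and then 4 bytes with eof set completes the image; feeding 6 bytes and then an empty eof block returns LOAD_TRUNCATED rather than looping with len=0 and eof=1.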
|
From: Michael B. <mb...@fe...> - 2002-08-30 10:02:05
|
On Fri, 30 Aug 2002, Michael Brown wrote:
> > O.k. so we have an image with overlapping segments. The tagged image
> > specification does not mention them. As the order of processing
> > the segments is not clearly defined we have an undefined case in
> > the tagged image specification.
> > It is quite reasonable to forbid overlapping segments, in etherboot
> > because it makes no sense to put complicated logic into a network
> > bootloader, and it makes even less sense to rely on complicated
> > logic being present. The ELF spec even forbids it for normal
> > executables.
> > Given that there are bad images out there, putting detection
> > of them into prep_segment sounds like the only reasonable option.
> > I only did not do it the first time because the logic seemed
> > complicated, and unneeded. If the logic gets hairy we can compile it
> > out.
> > Also mknbi needs to be fixed to generate valid images.
> These are bad images that work with Etherboot 5.0.x, however. I think it
> would be best to try to ensure that any image that worked with Etherboot
> 5.0.x also works with 5.1.x.
> > A quick fix that we can use to verify that everything else
> > is working is to comment out the memset in prep_segment.
> Already tried this and it gets one stage further. There's still one
> (hopefully final) obstacle: although the image is now loaded correctly, it
> never gets executed because tagged_download() never gets called after EOF.
> Under 5.0.x, os_download() got called one final time after EOF. Have
> re-added this code to osloader.c and am in process of testing and checking
> in...
OK, it works. I can now boot a full Linux kernel from EB 5.1.x, using the
same image as for EB 5.0.x. Have checked in two changes to osloader.c,
both of which I'm not 100% sure about:
1. The BSS is no longer zeroed, in order to avoid vapourising the NBI
header.
Q: How important is it to zero the BSS? Linux seems to boot fine
without it, haven't tried anything else.
2. When load_block() is called with eof=1, os_download() gets called
twice: once with the normal data block and once more as
os_download(NULL,0,eof). This is how tagged_download() expects to be
used.
Q: Will this break the other *_download() routines?
Michael Brown
http://www.fensystems.co.uk
|
|
From: Michael B. <mb...@fe...> - 2002-08-30 09:54:39
|
On 30 Aug 2002, Eric W. Biederman wrote:
> > > Then the problem is in tagged_probe? The final segment marker does not
> > > appear, and it must be in that first segment.
> > > Is it simply the algorithm for incrementing sh that is incorrect?
> > > Does tagged probe generate incorrect code?
> > > Is the memcpy bad?
> > > Sorry I am quite mystified why this doesn't work.
> > OK - tracked down the problem but not sure yet how to solve it.
> > Problem occurs when sh=0x92254, with a segment that has
> > loadaddress = 0x90200, imagelength = 0x1400, memlength = 0x9e00
> > After prep_segment() runs on this segment, the memory at sh has been
> > zeroed, hence the rest of the segment list is missing and the routine
> > aborts because sh->length == 0.
> > It looks as though the initial calculation of tctx.segaddr must be
> > incorrect; a segment wants to be loaded at a point in virtual memory that
> > is already allocated. Should prep_segment() catch this and report an
> > error?
> Yes. Thinking and justifications follow.
> O.k. so we have an image with overlapping segments. The tagged image
> specification does not mention them. As the order of processing
> the segments is not clearly defined we have an undefined case in
> the tagged image specification.
> It is quite reasonable to forbid overlapping segments, in etherboot
> because it makes no sense to put complicated logic into a network
> bootloader, and it makes even less sense to rely on complicated
> logic being present. The ELF spec even forbids it for normal
> executables.
> Given that there are bad images out there, putting detection
> of them into prep_segment sounds like the only reasonable option.
> I only did not do it the first time because the logic seemed
> complicated, and unneeded. If the logic gets hairy we can compile it
> out.
> Also mknbi needs to be fixed to generate valid images.
These are bad images that work with Etherboot 5.0.x, however. I think it
would be best to try to ensure that any image that worked with Etherboot
5.0.x also works with 5.1.x.
> A quick fix that we can use to verify that everything else
> is working is to comment out the memset in prep_segment.
Already tried this and it gets one stage further. There's still one
(hopefully final) obstacle: although the image is now loaded correctly, it
never gets executed because tagged_download() never gets called after EOF.
Under 5.0.x, os_download() got called one final time after EOF. Have
re-added this code to osloader.c and am in process of testing and checking
in...
Michael Brown
http://www.fensystems.co.uk
|
|
From: <ebi...@ln...> - 2002-08-30 09:25:24
|
Michael Brown <mb...@fe...> writes:
> On 30 Aug 2002, Eric W. Biederman wrote:
> > > > >And then the big mystery. NBI's appear to work for you, and not for
> > > > >others. Why and where their first packet is getting corrupted is a
> > > > >mystery.
> > > > I think it's because I'm using mkelf-linux and they are using
> > > > mknbi-linux. If that's the case then I think the FreeDOS image will crash
> > > > the same way as theirs (DOS has to be a real mode entry image).
> > > OK, there's no corruption in the first packet. I've added a little hex
> > > dump utility to misc.c - call as hex_dump(data,len) and it'll produce a
> > > nicely formatted hex dump of an area of memory, complete with a more-style
> > > pager. It's #ifdeffed on DEBUG_UTILS, so no worries about code bloat.
> > > Hex dump of the 512-byte data block passed to tagged_probe() shows that it
> > > is identical to the first 512 bytes of the .nbi file, so the drivers are
> > > totally ruled out as a source of the problem.
> > Then the problem is in tagged_probe? The final segment marker does not
> > appear, and it must be in that first segment.
> > Is it simply the algorithm for incrementing sh that is incorrect?
> > Does tagged probe generate incorrect code?
> > Is the memcpy bad?
> > Sorry I am quite mystified why this doesn't work.
>
> OK - tracked down the problem but not sure yet how to solve it.
>
> Problem occurs when sh=0x92254, with a segment that has
> loadaddress = 0x90200, imagelength = 0x1400, memlength = 0x9e00
>
> After prep_segment() runs on this segment, the memory at sh has been
> zeroed, hence the rest of the segment list is missing and the routine
> aborts because sh->length == 0.
>
> It looks as though the initial calculation of tctx.segaddr must be
> incorrect; a segment wants to be loaded at a point in virtual memory that
> is already allocated. Should prep_segment() catch this and report an
> error?
Yes. Thinking and justifications follow.
O.k. so we have an image with overlapping segments. The tagged image
specification does not mention them. As the order of processing
the segments is not clearly defined we have an undefined case in
the tagged image specification.
It is quite reasonable to forbid overlapping segments, in etherboot
because it makes no sense to put complicated logic into a network
bootloader, and it makes even less sense to rely on complicated
logic being present. The ELF spec even forbids it for normal
executables.
Given that there are bad images out there, putting detection
of them into prep_segment sounds like the only reasonable option.
I only did not do it the first time because the logic seemed
complicated, and unneeded. If the logic gets hairy we can compile it
out.
Also mknbi needs to be fixed to generate valid images.
A quick fix that we can use to verify that everything else
is working is to comment out the memset in prep_segment.
Eric
|
|
From: Michael B. <mb...@fe...> - 2002-08-30 08:23:12
|
On 30 Aug 2002, Eric W. Biederman wrote:
> > > >And then the big mystery. NBI's appear to work for you, and not for
> > > >others. Why and where their first packet is getting corrupted is a
> > > >mystery.
> > > I think it's because I'm using mkelf-linux and they are using
> > > mknbi-linux. If that's the case then I think the FreeDOS image will crash
> > > the same way as theirs (DOS has to be a real mode entry image).
> > OK, there's no corruption in the first packet. I've added a little hex
> > dump utility to misc.c - call as hex_dump(data,len) and it'll produce a
> > nicely formatted hex dump of an area of memory, complete with a more-style
> > pager. It's #ifdeffed on DEBUG_UTILS, so no worries about code bloat.
> > Hex dump of the 512-byte data block passed to tagged_probe() shows that it
> > is identical to the first 512 bytes of the .nbi file, so the drivers are
> > totally ruled out as a source of the problem.
> Then the problem is in tagged_probe? The final segment marker does not
> appear, and it must be in that first segment.
> Is it simply the algorithm for incrementing sh that is incorrect?
> Does tagged probe generate incorrect code?
> Is the memcpy bad?
> Sorry I am quite mystified why this doesn't work.
OK - tracked down the problem but not sure yet how to solve it.
Problem occurs when sh=0x92254, with a segment that has
loadaddress = 0x90200, imagelength = 0x1400, memlength = 0x9e00
After prep_segment() runs on this segment, the memory at sh has been
zeroed, hence the rest of the segment list is missing and the routine
aborts because sh->length == 0.
It looks as though the initial calculation of tctx.segaddr must be
incorrect; a segment wants to be loaded at a point in virtual memory that
is already allocated. Should prep_segment() catch this and report an
error?
Michael Brown
http://www.fensystems.co.uk
|
|
From: <ebi...@ln...> - 2002-08-30 08:16:47
|
Michael Brown <mb...@fe...> writes:
> On Fri, 30 Aug 2002, Ken Yap wrote:
> > >And then the big mystery. NBI's appear to work for you, and not for
> > >others. Why and where their first packet is getting corrupted is a
> > >mystery.
> > I think it's because I'm using mkelf-linux and they are using
> > mknbi-linux. If that's the case then I think the FreeDOS image will crash
> > the same way as theirs (DOS has to be a real mode entry image).
>
> OK, there's no corruption in the first packet. I've added a little hex
> dump utility to misc.c - call as hex_dump(data,len) and it'll produce a
> nicely formatted hex dump of an area of memory, complete with a more-style
> pager. It's #ifdeffed on DEBUG_UTILS, so no worries about code bloat.
>
> Hex dump of the 512-byte data block passed to tagged_probe() shows that it
> is identical to the first 512 bytes of the .nbi file, so the drivers are
> totally ruled out as a source of the problem.
Then the problem is in tagged_probe? The final segment marker does not
appear, and it must be in that first segment.
Is it simply the algorithm for incrementing sh that is incorrect?
Does tagged probe generate incorrect code?
Is the memcpy bad?
Sorry I am quite mystified why this doesn't work.
Also we should have:
    if (!(sh->flags & 0x04))
        return 0;
After the for loop to not even attempt to load an image that we cannot
execute.
Eric
|
|
From: Michael B. <mb...@fe...> - 2002-08-30 08:02:08
|
On Fri, 30 Aug 2002, Ken Yap wrote:
> >And then the big mystery. NBI's appear to work for you, and not for
> >others. Why and where their first packet is getting corrupted is a
> >mystery.
> I think it's because I'm using mkelf-linux and they are using
> mknbi-linux. If that's the case then I think the FreeDOS image will crash
> the same way as theirs (DOS has to be a real mode entry image).
OK, there's no corruption in the first packet. I've added a little hex
dump utility to misc.c - call as hex_dump(data,len) and it'll produce a
nicely formatted hex dump of an area of memory, complete with a more-style
pager. It's #ifdeffed on DEBUG_UTILS, so no worries about code bloat.
Hex dump of the 512-byte data block passed to tagged_probe() shows that it
is identical to the first 512 bytes of the .nbi file, so the drivers are
totally ruled out as a source of the problem.
Michael Brown
http://www.fensystems.co.uk
|
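Editor's sketch of what a hex_dump(data,len) helper of the kind described above might look like. The actual routine in misc.c is not shown in this thread, so everything here is an assumption: the row layout, the hex_row() helper, and the use of standard C stdio instead of Etherboot's own console routines. The more-style pager and the DEBUG_UTILS guard are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format up to 16 bytes as "OFFSET: xx xx ...  ascii" into out.
 * Returns the number of characters written, excluding the terminating NUL.
 * (Illustrative helper, not part of Etherboot.)
 */
static int hex_row(char *out, size_t outsz, size_t offset,
                   const unsigned char *p, size_t n)
{
    int pos;
    size_t i;

    assert(out != NULL && n <= 16 && outsz >= 80);

    pos = snprintf(out, outsz, "%08lx:", (unsigned long)offset);
    for (i = 0; i < 16; i++) {
        if (i < n)
            pos += snprintf(out + pos, outsz - (size_t)pos, " %02x",
                            (unsigned)p[i]);
        else
            pos += snprintf(out + pos, outsz - (size_t)pos, "   ");
    }
    pos += snprintf(out + pos, outsz - (size_t)pos, "  ");

    /* Printable ASCII is shown as-is, everything else as '.'. */
    for (i = 0; i < n; i++)
        out[pos++] = (p[i] >= 0x20 && p[i] < 0x7f) ? (char)p[i] : '.';
    out[pos] = '\0';
    return pos;
}

/* Dump len bytes starting at data, 16 bytes per output line. */
static void hex_dump(const void *data, size_t len)
{
    const unsigned char *p = data;
    char line[80];
    size_t off;

    for (off = 0; off < len; off += 16) {
        size_t n = len - off < 16 ? len - off : 16;
        hex_row(line, sizeof line, off, p + off, n);
        puts(line);
    }
}
```

Dumping the first 512-byte block handed to the probe routine and comparing it by eye with a dump of the .nbi file is the kind of check the message describes; hex_dump(block, 512) would print 32 rows here.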
|
From: <ebi...@ln...> - 2002-08-30 06:06:18
|
ke...@us... (Ken Yap) writes:
> >Cool. Next round since the relocation basically works for you, as you
> >can use a compressed etherboot on your machine, could you add
> >-DRELOCATION in the config file. That should fix the clash in low
> >memory.
>
> Will do.
>
> >And then the big mystery. NBI's appear to work for you, and not for
> >others. Why and where their first packet is getting corrupted is a
> >mystery.
>
> I think it's because I'm using mkelf-linux and they are using
> mknbi-linux. If that's the case then I think the FreeDOS image will crash
> the same way as theirs (DOS has to be a real mode entry image).
O.k. That makes sense.
A big thanks to you, and everyone else helping with 5.1.2+
It is very nice to get good feedback on a developer codebase.
Eric
|