Bug? with e820

  • davenportc

    davenportc - 2007-02-13

    I'm trying to use gujin's tiny.exe in a DOS environment to boot a sles9sp3 kernel. When I boot the system from PXE or a CD, the kernel is able to use the e820 memory map, but when I use gujin/tiny, it fails and falls back to e801. Debug kernel output from a gujin boot is below. Notice that it used e801 to get memory, and that it reports only 256MB of memory (the machine has 8GB).

    I think the reduced memory report from e801 is a ROM/BIOS issue, but the failure of the e820 map is only occurring when I use gujin to load the kernel/initrd.

    Any ideas?

    2.6.5-7.244-e820 (geeko@buildhost) (gcc version 3.3.3 (SuSE Linux)) #18 SMP Mon Feb 12 12:50:20 CST 2007
    naga: machine_specific_memory_setup: E820_MAP is 0xc048ed10, E820_MAP_NR is 0x0, ALT_MEM_K is 0x3fc00, EXT_MEM_K is 0x3c00
    naga: sanitize_e820: *pnr_map is: 0
    naga: copy_e820_map: nr_map is 0
    naga: copy_e820_map: nr_map is 0, so we return
    naga: machine_specific_memory_setup: copy failed. choosing e801. mem_size is 0x3fc00
    naga: machine_specific: e820.nr_map is now: 2
    BIOS-provided physical RAM map:
    BIOS-e801: 0000000000000000 - 000000000009f000 (usable)
    BIOS-e801: 0000000000100000 - 0000000010000000 (usable)
    naga: find_max_pfn: should traverse the E820 map since e820.nr_map is 2
    naga: find_max_pfn: should traverse the E820 map since e820.nr_map is 2
    naga: find_max_pfn: so max_pfn is 0x10000
    naga: setup_memory: start_pfn is 0x520, max_pfn is 0x10000
    naga: vmalloc reserve default is: 0x8000000, vmalloc reserve min is: 0x2000000, vmalloc reserve max is: 0x32000000
    naga: setup_memory: max pfn is lower, so vmalloc reserve is lower than kernel_maxmem
    naga: setup.c: 750MB vmalloc/ioremap area available.
    naga: find_max_low_pfn: too bad that max_low_pfn: 0x10000 is < MAXMEM_PFN: 0x111cb. Also, highmem_pages is 0xffffffff
    naga: find_max_low_pfn: so we return with max_low_pfn: 0x10000
    naga: setup_memory: start_pfn is 0x520, max_pfn is 0x10000, max_low_pfn is 0x10000
    naga: setup.c: 0MB HIGHMEM available.
    naga: setup.c: 256MB LOWMEM available.

    • Etienne LORRAIN

      Etienne LORRAIN - 2007-02-13

      Yes, I have seen that one - it is not really a bug of Gujin but a feature
      of the DOS driver HIMEM.SYS.
      The E820 interrupt is intercepted by HIMEM.SYS and returns an error to Gujin,
      so Gujin uses the working interrupt E801.
      If you can (i.e. do not use DOS=HIGH or some drivers needing HIMEM support),
      then you should remove the line "DEVICE=HIMEM.SYS" in config.sys.
      Some DOS environments provide HIMEM services even without the driver, I think
      FreeDOS does it - I am not sure whether it disables the E820 interrupt. You can check
      by using dbgload.exe instead of tiny.exe and opening the DBG file created
      after having booted a kernel (if you can recompile, you can also do "make lotiny.exe"
      to get the DBG file using lotiny.exe instead of dbgload.exe).


    • davenportc

      davenportc - 2007-02-13

      I played with the memory managers; my current config.sys doesn't load any of them. I do remember that when I tried to load himem.sys, it complained that an Extended Memory Manager was already present, perhaps built into the 98SE version of MSDOS.

      REM DEVICE=A:\NET\EMM386.EXE  noems i=b000-b7ff /y=a:\net\emm386.exe

      DOS command "ver" reports:
      Windows Millennium [Version 4.90.3000]

      I've seen the issue under other versions of DOS as well, though I don't know what versions those were offhand.

      I'm willing to try a different DOS for testing purposes, but I think I'm ultimately stuck with the one I've got because it's part of a prepackaged solution.

      Some A20/e820 information from dbgload.exe:

      set_A20 (enabled = 1): PS2_enable_A20: check support: returns 0x8601, bitfield 0x0 error.
      BIOS_enable_A20: get A20 handler: no handler.
      (note NVRAM@0x2D = 0x18 bit 2 set when Fast gate A20 operation enabled) initial inb(0x64) = 0x3C
      get_A20:  {flush_8042: [inb (0x64) = 0x3C] done in 0 loops.} timeout while reading I/O port.
      Read hard A20 1: 0xFFFFFFFF
      Checking Virtual86 mode, getsw: 0x10, First disabling fast A20 using outb (0x92, 0x2 & ~3)
      ========= later =============
      [HIMEM: present, using XMS] menu_load_system: using HIMEM.SYS freemem: 2094336 Kb, BIOS totalmem: 262144 Kb
      ========= later =============
      [EMM*X abscent] set_A20 (enabled = 1): PS2_enable_A20: check support: returns 0x8601, bitfield 0x0 error.
      BIOS_enable_A20: get A20 handler: no handler.
      (note NVRAM@0x2D = 0x18 bit 2 set when Fast gate A20 operation enabled) initial inb(0x64) = 0x1C
      get_A20:  {flush_8042: [inb (0x64) = 0x1C] done in 0 loops.} timeout while reading I/O port.
      Read hard A20 1: 0xFFFFFFFF
      Checking Virtual86 mode, getsw: 0x10, First disabling fast A20 using outb (0x92, 0x2 & ~3)
      ========= later =============
      vmlinuz_E820: try _BIOS_QueryMemoryMap (bufsize 18 at 0xE838):
      end with ret 2989, cont_val 0, bufsize 18
      0 E820_entries:
      end E820_entries.

    • Etienne LORRAIN

      Etienne LORRAIN - 2007-02-14

        The Windows Millennium DOS floppy disk seems to have simply incorporated HIMEM.SYS into DOS,
      and this driver hides some memory for its own use, so it does not want anybody else to access
      the real memory size - Gujin is no different from any other driver here.
        For the DBG extracts, the A20 stuff is the usual one (Gujin tries once to open A20 for the VESA2
      video system, and then simply retries when loading the kernel - it is quick enough not to need optimising).
      Sometimes, with new PCs, there is a BIOS handler installed so the motherboard + BIOS can handle
      either not having a way to close A20 or doing it in an unusual way - that is not your case.

        The only thing "Windows Millennium DOS" tells is:
      using HIMEM.SYS freemem: 2094336 Kb, BIOS totalmem: 262144 Kb
        and vmlinuz_E820 disabled (0 entries) here:
      end with ret 2989, cont_val 0, bufsize 18
      0 E820_entries:
      end E820_entries.
        So Gujin cannot deduce that there are 8 Gbytes, it could find at most 2 Gbytes.

        So there are two ways for you to boot: either give a mem=xxx kernel parameter, or do without
      this DOS - or without DOS altogether. Is there any reason not to use a Gujin floppy image?
      Do you need a special network / USB driver not present in the BIOS? I do not know sles9sp3,
      so I do not know what is special about this distribution.


      • davenportc

        davenportc - 2007-02-15

        I only mention sles9sp3 so you can look at the kernel source if you need to. I'm told DOS is internally limited to seeing 2GB, so I'm willing to accept 2GB instead of 8.

        I've tried the "mem=2G" parameter, but it still showed the same e801 memory map and still panicked, out of memory, while trying to load a ~150MB initrd.

        I'm sort of stuck with the DOS environment I've got because it's part of a packaged solution.

        I managed to get a win95 DOS to not load himem.sys by using DOS=NOAUTO in config.sys, and now the gujin/dbgload.exe output shows me:
        [HIMEM: abscent, using direct memory access, USE_INT1587, no need to manage A20 now] menu_load_system: using BIOS freemem/totalmem: 262144 Kb

        Shouldn't gujin and the kernel be able to use the e820 map now that himem is gone?

        The system I'm working with has the following e820 map, taken from a ROM debugger:
        EAX=E820, EBX=0, ECX=14;EDX=534D4150
        RAM      00000000:00000000   00000000:0009F400 (637 KB)
        Reserved 00000000:0009F400   00000000:00000C00 (3 KB)
        Reserved 00000000:000F0000   00000000:00010000 (64 KB)
        RAM      00000000:00100000   00000000:0FF00000 (261120 KB)
        RAM      00000000:10000000   00000000:10000000 (262144 KB)
        RAM      00000000:20000000   00000000:5FE50000 (1571136 KB)
        ACPI     00000000:7FE50000   00000000:00008000 (32 KB)
        Reserved 00000000:7FE58000   00000000:001A8000 (1696 KB)
        Reserved 00000000:FEC00000   00000000:00100000 (1024 KB)
        Reserved 00000000:FEE00000   00000000:00010000 (64 KB)
        Reserved 00000000:FFC00000   00000000:00400000 (4096 KB)

        If I root through the gujin code, it looks like detect_bios_memory only returns the size of the usable region that starts at 0x100000, when in this case there is more memory available in the adjacent regions.

        So I've got a small code patch that should join contiguous regions:
        --- gujin-new/util.c    2007-02-04 12:55:04.000000000 -0600
        +++ gujin/util.c    2007-02-14 18:25:03.000000000 -0600
        @@ -718,25 +718,28 @@

           *extendmem = 0;
           struct e820map_info info;
           unsigned cont_val = 0;
        +  unsigned long long start_addr = 0x100000;

           UDBG (("_BIOS_QueryMemoryMap:"));

           while (_BIOS_QueryMemoryMap(&cont_val, sizeof (struct e820map_info), &info) != 0 && cont_val != 0) {
               UDBG (("\r\n    0x%llX+0x%llX,0x%X", info.base, info.length, (unsigned)info.type));
               if (info.type == MemAvailable) {
               unsigned infoKBsize = (unsigned)(info.length / 1024);
               if (info.base == 0 && infoKBsize < *basemem) {
                   UDBG ((": CORRECTING basemem to %u Kb", infoKBsize));
                   *basemem = infoKBsize;
        -      if (info.base == 0x100000) {
        +          if (info.base == start_addr) {
                   UDBG ((": extended memory size 0x%llX, i.e. %U Kb", info.length, infoKBsize));
        -          *extendmem = infoKBsize;
        +              *extendmem += infoKBsize;
        +              start_addr = info.base + info.length;

           if (*extendmem != 0) {
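
The idea behind the patch can be tried outside Gujin. The following is only a sketch, not Gujin's code: `struct region`, `posted_map` and `contiguous_ram_kb` are hypothetical names, the table is the E820 map posted above, and the function assumes the map is sorted by base address (as this one is).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* E820 region types used by the map above. */
enum { E820_RAM = 1, E820_RESERVED = 2, E820_ACPI = 3 };

/* Hypothetical stand-in for Gujin's struct e820map_info. */
struct region {
    uint64_t base;
    uint64_t length;
    uint32_t type;
};

/* The map posted above, as read from the ROM debugger (base, length, type). */
const struct region posted_map[] = {
    { 0x00000000, 0x0009F400, E820_RAM },
    { 0x0009F400, 0x00000C00, E820_RESERVED },
    { 0x000F0000, 0x00010000, E820_RESERVED },
    { 0x00100000, 0x0FF00000, E820_RAM },
    { 0x10000000, 0x10000000, E820_RAM },
    { 0x20000000, 0x5FE50000, E820_RAM },
    { 0x7FE50000, 0x00008000, E820_ACPI },
    { 0x7FE58000, 0x001A8000, E820_RESERVED },
    { 0xFEC00000, 0x00100000, E820_RESERVED },
    { 0xFEE00000, 0x00010000, E820_RESERVED },
    { 0xFFC00000, 0x00400000, E820_RESERVED },
};

/* Sum, in KB, the usable regions that form one unbroken run beginning at
 * `start`. Assumes the entries are sorted by base address. */
uint64_t contiguous_ram_kb(const struct region *map, size_t n, uint64_t start)
{
    uint64_t next = start;   /* address where the next adjacent block must begin */
    uint64_t total_kb = 0;

    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (map[i].type == E820_RAM && map[i].base == next) {
            total_kb += map[i].length / 1024;
            next = map[i].base + map[i].length;
        }
    }
    return total_kb;
}
```

Called on the whole table with `start = 0x100000`, this adds up the three adjacent RAM blocks (261120 + 262144 + 1571136 KB) and returns 2094400 KB, where looking only at the block that begins at 0x100000 gives 261120 KB.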

        Additionally, I've used the same ROM debugger to see what _BIOS_QueryMemoryMap returns, and verified that gujin is able to fetch the e820 map, but I keep seeing the following in the DBG output, regardless of himem:

           vmlinuz_E820: try _BIOS_QueryMemoryMap (bufsize 18 at 0xE838):
           end with ret 2989, cont_val 0, bufsize 18
           0 E820_entries:
           end E820_entries.

        I took a look at the code, and I think that bufsize is incorrect, it comes from here:
        vmlinuz.c:2291:   LnxParam->nb_E820_entries = vmlinuz_E820 (LnxParam->e820map, nbof (LnxParam->e820map));

        If I understand, then nbof(LnxParam->e820map) is the number of e820 entries, not the size of the buffer that contains them, and that line should be:
        vmlinuz.c:2291:   LnxParam->nb_E820_entries = vmlinuz_E820 (LnxParam->e820map, sizeof(LnxParam->e820map));
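
To make the count-versus-bytes difference concrete, here is a small sketch. Everything in it is an assumption made for illustration: the `nbof` definition is the usual element-count idiom and may not match Gujin's macro, `struct e820_entry` is a hypothetical 20-byte entry (64-bit base, 64-bit length, 32-bit type), and the array length of 18 is chosen only because it matches the "bufsize 18" in the DBG output if that value is printed in decimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definition of the element-count macro; Gujin's own may differ. */
#define nbof(array) (sizeof(array) / sizeof((array)[0]))

/* Hypothetical 20-byte E820 entry, packed so no padding is added
 * (GCC/Clang attribute syntax). */
struct e820_entry {
    uint64_t base;
    uint64_t length;
    uint32_t type;
} __attribute__((packed));

/* Hypothetical parameter block with room for 18 entries. */
struct boot_params_sketch {
    struct e820_entry e820map[18];
};

struct boot_params_sketch params;

/* How many whole entries fit in a buffer of `bufsize` bytes. */
size_t entries_that_fit(size_t bufsize)
{
    return bufsize / sizeof(struct e820_entry);
}
```

Under these assumptions `nbof(params.e820map)` is 18 while `sizeof(params.e820map)` is 360, so passing the element count as a byte size leaves room for zero 20-byte entries, and a loop guarded by `bufsize >= sizeof(struct e820map_info)` would never run.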

        Another question, in that vmlinuz_E820 function, vmlinuz.c:1765, the _BIOS_QueryMemoryMap second argument is the size of the buffer remaining, where I expect it to be sizeof(e820map_info). I can't tell if this is a bug or not.

        What do you think?

    • Etienne LORRAIN

      Etienne LORRAIN - 2007-02-15

      Well, shame on me / thanks a lot for the bug report!

      The first patch looks correct, I will not be able to test because I do not have that strange an E820 table;
      by the way, do you know why it describes different contiguous blocks like this?
      The second (nbof (LnxParam->e820map) -> sizeof(LnxParam->e820map)) looks like a big-bad-bug and its fix,
      I'll test as soon as this new compiler bootstrap is finished - can't reboot before.
      For the third thing, the E820 reference says that you can give a big buffer - you do not need to ask block
      per block, so as long as the BIOS is compliant it is OK. On the first call of _BIOS_QueryMemoryMap(), the one
      in your patch, Gujin does not save the table so it just asks block per block.

      I will do some more tests and hopefully release a new version at the end of the weekend.

      Thanks again,

    • davenportc

      davenportc - 2007-02-15

      According to the ROM developers, it was much easier to produce the memory map in this form. I suspect it has something to do with the way the AMD Rev. F processors address memory in a multiprocessor configuration.

      I'm seeing one other anomaly: when the e820 map comes back, it seems to be one entry short.

      vmlinuz.c:1764  and  util.c:728
        while (   bufsize >= sizeof(struct e820map_info)
              && (actual_len = _BIOS_QueryMemoryMap(&cont_val, sizeof(e820map_info), e820map))!=0
              && cont_val != 0 )

      It looks like the while loops are terminating early because the cont_val on the last entry is 0, even though the call succeeded. The while loop exits when it evaluates cont_val immediately after calling _BIOS_QueryMemoryMap, so the last entry is never processed.

      Removing that condition of the while loop, and adding an if(cont_val == 0) break; at the end of the loop seems to resolve it.

        while (   bufsize >= sizeof(struct e820map_info)
              && (actual_len = _BIOS_QueryMemoryMap(&cont_val, sizeof(e820map_info), e820map))!=0
        ) {
            // loop stuff
            if (cont_val == 0)
                break;   // last entry: process it first, then stop
        }
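
The off-by-one can be reproduced without a BIOS. The sketch below uses a mock in place of _BIOS_QueryMemoryMap; `mock_query`, `fake_map`, `read_map_original` and `read_map_fixed` are hypothetical names, and the mock simply uses the entry index as its continuation value (real BIOS continuation values are opaque). Like the real call, it returns non-zero on success and sets the continuation value to 0 on the last entry.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
    unsigned long long base, length;
    unsigned type;
};

/* Three made-up entries standing in for a BIOS memory map. */
const struct entry fake_map[3] = {
    { 0x00000000ULL, 0x0009F400ULL, 1 },
    { 0x00100000ULL, 0x0FF00000ULL, 1 },
    { 0x10000000ULL, 0x10000000ULL, 1 },
};

/* Mock query: copies one entry, returns its size (0 on failure), and
 * writes 0 to *cont when the entry just returned was the last one. */
unsigned mock_query(unsigned *cont, struct entry *out)
{
    unsigned idx = *cont;

    if (idx >= 3)
        return 0;
    *out = fake_map[idx];
    *cont = (idx + 1 < 3) ? idx + 1 : 0;
    return (unsigned)sizeof *out;
}

/* Original loop shape: cont is tested right after the call, so the
 * last entry is fetched but never counted. */
size_t read_map_original(struct entry *buf, size_t cap)
{
    unsigned cont = 0;
    size_t n = 0;

    while (n < cap && mock_query(&cont, &buf[n]) != 0 && cont != 0)
        n++;
    return n;
}

/* Fixed loop shape: count the entry first, then stop if it was the last. */
size_t read_map_fixed(struct entry *buf, size_t cap)
{
    unsigned cont = 0;
    size_t n = 0;

    while (n < cap && mock_query(&cont, &buf[n]) != 0) {
        n++;
        if (cont == 0)
            break;
    }
    return n;
}
```

With the three-entry mock, `read_map_original` reports 2 entries and `read_map_fixed` reports 3, which is the "one entry short" behaviour and its fix.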

    • Etienne LORRAIN

      Etienne LORRAIN - 2007-02-16

      That will be included too.
      Anything more you noticed, in memory management or elsewhere?


    • davenportc

      davenportc - 2007-02-16

      So far everything is working, gujin is very capable, more than loadlin or linld.

      I'm debugging an issue right now where there's an interrupt storm in the linux kernel which causes drivers to hang. Something about loading network drivers in the DOS environment before using gujin to load linux.

      I don't think gujin has anything to do with that. But it might be nice to be able to load the kernel/initrd then drop back to DOS or offer some other facility to unload things before jumping into the kernel.

      Can't really unload much from DOS anyway, and I don't really know the cause yet, but that's the only other issue I have to resolve.

    • Etienne LORRAIN

      Etienne LORRAIN - 2007-02-17

      Well, maybe the next release will be delayed a bit...
      I already have a version of tiny where you can run a DOS program after loading and before the switch to protected mode (tiny /X=%CMDSPEC% works) - but I did not manage to get parameters of this command to work - badly documented DOS.
      I suspect you just have to disable the network before starting Linux to stop the interrupt storm, something like "C:\> net stop", and different people will have the same problem.
      I do not want to add command line parameters to the full "boot.exe", so I would like to simply detect that a network is running and call the interrupt to stop the network. There are quite a few interfaces
      described in "Ralf Brown's Interrupt List", do you know which ones you are using - i.e. which
      drivers do you use for the network? What is completely obsolete and what might other people use these days?
      If I implement the BIOS call to stop the interrupt, would you be willing to test it (before release), because I can't...


    • davenportc

      davenportc - 2007-02-20

      I don't know exactly which interrupt is being used on the DOS side, probably int 61 for the TCP/IP TSR, 21/ax=4402h for NDIS and Lan Manager.

      I followed the instructions on broadcom's website to create my DOS bootdisk. http://www.broadcom.com/support/ethernet_nic/faq_drivers.php#18

      I think it's a pretty standard approach to creating a network boot disk for use with windows shares. The same approach shows up all over a google search for same.

      I would be willing to test.

    • Etienne LORRAIN

      Etienne LORRAIN - 2007-02-22

      > I would be willing to test.

        I have some executables to test, but the address davenportc@users.sourceforge.net refuses
      *.exe and *.tgz files, do you have any other address?


    • davenportc

      davenportc - 2007-02-24

      I sent a message to you through sourceforge with an email address you can use. Haven't seen any confirmation of that.

      I did some testing, moved everything I was doing from the network share to a USB key so I could play with unloading the network before launching gujin/sles9sp3.

      "net stop" unloaded a fair amount, disconnected network drives, etc.
      I found the "unloadt.exe" program, which unloads elements of the DOS TSR Network stack. It was able to unload netbeui, tcp/ip, and lanman, all the components it reported as existing.

      The error still occurs. I hope that the NDIS 2.0 interrupt can be used to stop the card or unload the NIC driver.

    • Etienne LORRAIN

      Etienne LORRAIN - 2007-02-24

      Sorry, I did not receive anything. It seems that the redirector at sourceforge is not working as intended.
      I got e-mails saying you were unreachable.
      Anyway I decided to release a new version because of the other things included - so you can test on v1.9.
      I am still interested in getting the DBG file with your network - you can now get it with the file lotiny.exe.


      -- etienne@gujin.org

    • davenportc

      davenportc - 2007-03-01

      Nothing new, just wanted to post the workaround that I found so that others can do the same possibly.

      The problem is that when using a NIC driver in a DOS environment before loading Gujin, the NIC stays enabled during the kernel boot and causes spurious interrupts. Kernel symptoms are a kernel message saying "Disabled IRQ#??", and things like "IRQ ??, nobody cared"

      By default, when the kernel loads, it disables all interrupts, then re-enables each one as drivers load and initialize devices. In this case, as soon as the linux kernel re-enables the IRQ line being used by the NIC, the kernel will get spurious interrupts on that line, and often disable it permanently. After that, any device that shares the interrupt line and requires the IRQ will not work properly. Note that the IRQ number in DOS (PIC mode) is not the same as in linux (APIC mode).

      I tried several methods to stop or unload the DOS network drivers, "net stop", "unloadt", "unbind", but haven't been able to get the card to stop.

      The workaround is to force the linux kernel to load the NIC driver before any other driver whose devices share that interrupt line. The NIC driver will usually reset the NIC and stop the spurious interrupts before enabling the IRQ. After that, other devices using that line can load normally.

