|
From: Daniel R. <cos...@gm...> - 2006-06-24 15:31:56
|
Hello, Today's update in a way completes the development of the hal-module: All devices required by the nucleus can now be accessed, and the hal-classes have reached a certain level of maturity. This of course doesn't mean that we won't have to add some more details to the hal in the future. There's still the "executive" missing, that is the part of the nucleus that responds to events (irqs, systemcalls, etc) and implements the most basic kernel mechanisms (ipc, scheduling, etc). It will be based on the hal as well as the "object"-classes, which we'll thus have to develop now. The object-part basically defines the kernel's run-time memory-management by providing list and tree classes, which will be used to store the kernel's data-structures (access rights, task-blocks, etc). As I'll from now on focus on the memory-classes, and thus won't touch the hal-code for quite some time, I decided to clean it up a bit. Hopefully the new, more object-oriented style will be a bit more readable: Not only for you, but also for me when I return to the hal-code in a few months ;) regards, Daniel |
|
From: Daniel R. <cos...@gm...> - 2006-06-02 21:57:45
|
Hello,
I'll be on holiday in Italy during the next week and thought that I might as well commit the code I've been working on to the CVS. Unfortunately I ran out of time a bit and thus couldn't fully debug the code: It does work on bochs though; on real hardware there seem to be some minor problems.
- support for sysenter/sysexit mechanism (ring3)
- merged apic_error with the exception class
- some general improvements to the old code
In case somebody wants to code a bit in the meantime, there's (..for example..) still the GRUB memory map parsing class that we'll need sooner or later. In my opinion it shouldn't be too hard a task, especially if you take mp_detect.hpp - which also scans through some tables - as a base..
cheers,
cosmo86
|
|
From: Daniel R. <cos...@gm...> - 2006-05-22 16:18:31
|
Hello Trionists,
I finally found some time to fix a number of minor bugs and to extend the kernel with some new functionality:
- Added (very basic) handling for local APIC errors and spurious interrupts
- APIC classes now use logical IDs only
- Fixed a small bug that might have led to faulty APIC reactivations
- Cleaned up the exception class; registers on the stack can now be accessed directly
- IRQ class can detect an I/O APIC by probing; support for systems that use the ISA PIC only
- Included support for hyperthreading-enabled processors
Unfortunately there are still some problems with the logical ID APIC code. As cluster-mode no longer seems to be supported by modern CPUs, I was forced to use the more restrictive flat-mode, which only allows a maximum of 8 processors. Each CPU is assigned a unique 8 bit logical ID that is calculated as (1 << physical_id). The ID is thus basically a bitmask with each bit representing a processor. If the APIC receives a message it compares (AND operation) its logical ID with the destination ID defined in the message. If the result is non-zero the message is accepted and gets handled.
All messages sent by the I/O APIC have a destination of 0xff, which basically means that any processor may handle them. As the IRQs are sent with delivery-mode 'lowest priority', the processor with the lowest priority-level should take the message. The priority-level is determined either by the current task-priority, which can be set by software (I however didn't touch it yet..), or by the priority of an interrupt that is currently being handled.
I then ran a test in which the first CPU receives an interrupt and blocks, thus remaining on a high priority-level. One would now expect that the next interrupt is handled by one of the remaining processors, which still run on the lowest priority-level. Unfortunately this is not what's happening. Any subsequent interrupt is again dispatched to the first processor, which of course can't handle it as it hasn't yet acknowledged its earlier messages.
The logical destination code itself however seems to work flawlessly: If I change the IRQ's destination ID to 0xFE the second processor handles it, and if I set it to 0xFB it's the third CPU that gets the message.
As I've already spent quite some time locating my mistake (without any notable success so far), I was wondering whether it might be that Bochs just doesn't simulate the details of the APIC message delivery mechanism properly. Bochs' APIC support is still somewhat limited, and I've seen similar things with error-handling, which doesn't seem to get emulated either (the error-handling code however does work on real hardware).
Maybe someone could try the code on some multiprocessor or dual-core hardware ? All you would have to do is to hang the code at the _beginning_ of InterruptDestructor() in irq.cpp - if my code works each processor should print its own number, otherwise there's only one number printed after the "You may now type something" statement.
Has anybody here already implemented logical APIC IDs before ?
regards,
cosmo86
I should probably warn you that the CVS hostname has changed from "cvs.sourceforge.net" to "trion.cvs.sourceforge.net"; it took me two days to find that out myself ;)
|
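The flat-mode matching rule described in this message can be written down in a few lines. The following is only a sketch of that rule, not the Trion kernel's code: the function names `LogicalId` and `AcceptsMessage` are made up for illustration, and the lowest-priority arbitration between several matching processors is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Flat model: one bit per processor, so only physical IDs 0..7 can be mapped.
// (Hypothetical helper name, chosen for this sketch.)
inline std::uint8_t LogicalId(unsigned physical_id)
{
    assert(physical_id < 8);  // an 8 bit mask has room for 8 CPUs only
    return static_cast<std::uint8_t>(1u << physical_id);
}

// A local APIC accepts a logically addressed message when its own logical ID
// and the destination field of the message share at least one set bit.
inline bool AcceptsMessage(std::uint8_t logical_id, std::uint8_t destination)
{
    return (logical_id & destination) != 0;
}
```

With these definitions a destination of 0xFF matches every processor, while 0xFE matches every processor except the one with physical ID 0.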
|
From: Daniel R. <cos...@gm...> - 2006-05-09 16:13:38
|
Hello Manuel
thanks for the test-run.. Good to see that it also works on the more dated
hardware.
> There it is again (the version 03). I guess it's legitimate.
Either that or you two just happen to use mainboards with the same chipset
;). From what I can tell, version numbers 0x01 through 0x10 are reserved
for I/O APICs of the 486 era that didn't yet use the new APIC architecture
(XAPIC). Here's a snippet of the Linux code that checks the version number
at boot-time:
    if ((reg_01.bits.version != 0x01) && /* 82489DX IO-APICs */
        (reg_01.bits.version != 0x10) && /* oldest IO-APICs */
        (reg_01.bits.version != 0x11) && /* Pentium/Pro IO-APICs */
        (reg_01.bits.version != 0x13) && /* Xeon IO-APICs */
        (reg_01.bits.version != 0x20))   /* Intel P64H (82806 AA) */
    {
        UNEXPECTED_IO_APIC();
    }
I probably should add that UNEXPECTED_IO_APIC() is just an empty stub
procedure, so the kernel still boots if the version is other than the
expected values.
regards,
cosmo86
http://lxr.linux.no/source/arch/i386/kernel/io_apic.c#L1302
|
|
From: Stephen M. W. <ste...@br...> - 2006-05-09 12:22:16
|
On 07/05/06 08:14, Manuel Hohmann wrote:
> AMD Athlon XP 3000+ M 1 000F6570 2 03 17
There it is again (the version 03). I guess it's legitimate.
--
Stephen M. Webb
ste...@br...
|
|
From: Manuel H. <mho...@ph...> - 2006-05-09 01:13:28
|
Hi Trionists,
I have also tested the kernel on several computers at home and at
university, here's the result:
Description               M/U  #  Addr      ID  Ver  Red
Dual Pentium 2, 333 MHz   M    2  000F1400  2   11   17
Pentium 2, 350 MHz        U    1  -         F   FF   FF
AMD Athlon, 500 MHz       U    1  -         F   FF   FF
Pentium 4, 2400 MHz       M    1  000F0000  4   20   17
AMD Athlon XP 3000+       M    1  000F6570  2   03   17
AMD Athlon 64 3200+       M    1  000F1400  2   11   17
As you can see, it also works on SMP hardware that is slightly older.
Regards,
Manuel
|
|
From: Stephen M. W. <ste...@br...> - 2006-05-04 18:41:11
|
On 04/05/06 13:15, Daniel Raffler wrote:
> > It's a real IOAPIC all right. The mobo is a GA-K8VM800M (VIA chipset for
> > socket 754). I suspect Gigabyte misinterpreted the IOAPIC specs, since
> > they say the correct value is 11h which they may have thought was in
> > binary.
> As the I/O APIC is part of the chipset's south-bridge it's actually VIA
> rather than Gigabyte that is to be blamed for the bug. In any case your
> assumption does make some sense in my opinion, although I frankly wouldn't
> have expected a global player like VIA to make such blunders.
True. I have a VIA chipset in another computer (not at hand at the moment) that does not report anything unusual. Since no one seems to do much with the IO-APIC version number I don't suppose it really makes any difference.
> > Note that the LAPIC address here is different than the one you
> > detected.
> Actually the trion kernel doesn't even print the local APIC address, what
> you posted is the location of the multiprocessor configuration table..
My mistake. I was so excited about creating a bootable Trion CD I didn't bother checking the code to see what I was actually writing down. Some of my test systems do not have floppies.
--
Stephen M. Webb
ste...@br...
|
|
From: Daniel R. <cos...@gm...> - 2006-05-04 17:15:21
|
> It's a real IOAPIC all right. The mobo is a GA-K8VM800M (VIA chipset for
> socket 754). I suspect Gigabyte misinterpreted the IOAPIC specs, since
> they say the correct value is 11h which they may have thought was in
> binary.
As the I/O APIC is part of the chipset's south-bridge it's actually VIA rather than Gigabyte that is to be blamed for the bug. In any case your assumption does make some sense in my opinion, although I frankly wouldn't have expected a global player like VIA to make such blunders.
> Note that the LAPIC address here is different than the one you
> detected.
Actually the trion kernel doesn't even print the local APIC address; what you posted is the location of the multiprocessor configuration table..
regards,
Daniel
|
|
From: Stephen M. W. <ste...@xa...> - 2006-05-03 18:41:15
|
On 03/05/06 13:00, Daniel Raffler wrote:
> > Description            M/U  #  LAPIC     ID  Ver  Red
> >
> > AMD Sempron 2800+      M    1  000f0d00  2   03   17
> The only thing that causes me some headache is the version number returned
> by your Sempron 2800+, which should actually be something in between 0x10
> and 0x20 for an internal APIC. The two other values (ID and Red) seem to
> be perfectly sensible though..
It's a real IOAPIC all right. The mobo is a GA-K8VM800M (VIA chipset for
socket 754). I suspect Gigabyte misinterpreted the IOAPIC specs, since they
say the correct value is 11h which they may have thought was in binary.
The first few lines of the Linux bootup (just for interest's sake) are as
follows. Note that the LAPIC address here is different than the one you
detected.
Linux version 2.6.15-dcc-smp (root@amd64-coreos) (gcc version 3.3.5 (Debian
1:3.3.5-13)) #1 SMP PREEMPT Fri Mar 10 15:50:30 EST 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003bff0000 (usable)
BIOS-e820: 000000003bff0000 - 000000003bff3000 (ACPI NVS)
BIOS-e820: 000000003bff3000 - 000000003c000000 (ACPI data)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
ACPI: RSDP (v000 VIAK8 ) @ 0x00000000000f6b20
ACPI: RSDT (v001 VIAK8 AWRDACPI 0x42302e31 AWRD 0x01010101) @
0x000000003bff3000
ACPI: FADT (v001 VIAK8 AWRDACPI 0x42302e31 AWRD 0x01010101) @
0x000000003bff3040
ACPI: MADT (v001 VIAK8 AWRDACPI 0x42302e31 AWRD 0x01010101) @
0x000000003bff7740
ACPI: DSDT (v001 VIAK8 AWRDACPI 0x00001000 MSFT 0x0100000c) @
0x0000000000000000
On node 0 totalpages: 241262
DMA zone: 2917 pages, LIFO batch:0
DMA32 zone: 238345 pages, LIFO batch:31
Normal zone: 0 pages, LIFO batch:0
HighMem zone: 0 pages, LIFO batch:0
ACPI: PM-Timer IO Port: 0x4008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:12 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to physical flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 40000000 (gap: 3c000000:c2c00000)
Checking aperture...
CPU 0: aperture @ e0000000 size 64 MB
SMP: Allowing 3 CPUs, 2 hotplug CPUs
Built 1 zonelists
Kernel command line: auto BOOT_IMAGE=Xandros_Server_1 ro root=306 quiet rw
acpi=on resume2=swap:/dev/hda5
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 3.579545 MHz PM timer.
time.c: Detected 1607.471 MHz processor.
Console: colour dummy device 80x25
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Memory: 962932k/982976k available (2274k kernel code, 19344k reserved, 877k
data, 176k init)
Calibrating delay using timer specific routine.. 3222.83 BogoMIPS
(lpj=6445671)
Security Framework v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
mtrr: v2.0 (20020519)
Using local APIC timer interrupts.
Detected 12.558 MHz APIC timer.
Brought up 1 CPUs
time.c: Using PIT/TSC based timekeeping.
testing NMI watchdog ... OK.
checking if image is initramfs...it isn't (no cpio magic); looks like an
initrd
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Subsystem revision 20050902
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI-0412: *** Error: Handler for [SystemMemory] returned AE_AML_ALIGNMENT
ACPI-0508: *** Error: Method execution failed [\_SB_.PCI0._CRS] (Node
ffff810001ee4900), AE_AML_ALIGNMENT
ACPI-0156: *** Error: Method execution failed [\_SB_.PCI0._CRS] (Node
ffff810001ee4900), AE_AML_ALIGNMENT
Boot video device is 0000:01:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 6 7 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 6 7 10 11 12) *5
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 6 7 *10 11 12)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 6 7 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 6 7 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 6 7 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNK0] (IRQs 3 4 6 7 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 6 7 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [ALKA] (IRQs 20) *0, disabled.
ACPI: PCI Interrupt Link [ALKB] (IRQs 21) *0, disabled.
ACPI: PCI Interrupt Link [ALKC] (IRQs 22) *0, disabled.
ACPI: PCI Interrupt Link [ALKD] (IRQs 23) *0, disabled.
|
|
From: Daniel R. <cos...@gm...> - 2006-05-03 17:00:41
|
> Description                M/U  #  LAPIC     ID  Ver  Red
>
> AMD Sempron 2800+          M    1  000f0d00  2   03   17
> AMD Athlon 2200+           U    1  -         1   11   17
> AMD dual-core Opteron 165  M    2  000f0d00  0   11   17
> Intel Celeron              M    1  000f0000  0   11   0f
> Intel P4                   M    1  000f12a0  1   20   17
Hi Stephen,
thanks for taking the time for a test-run ! It's good to see that the new local APIC reactivation code doesn't only work on my computer but also on your uniprocessor Athlon. From the results it also looks as if the I/O APIC is much more common on modern computers than I had expected. The only thing that causes me some headache is the version number returned by your Sempron 2800+, which should actually be something in between 0x10 and 0x20 for an internal APIC. The two other values (ID and Red) seem to be perfectly sensible though..
Luckily the machine's BIOS does support the multiprocessor table and will thus report if there's an I/O APIC. For uniprocessor systems that don't provide any multiprocessor tables we could fall back to a probing mechanism. I would propose that we simply try to read out the I/O APIC's version number and make some sanity checks on it. If the number is between 0x10 and 0x20 we assume that it really is an I/O APIC, otherwise we'll have to retreat to the ISA PIC. I guess it should be unlikely enough that the bus-noise returned if there's no I/O APIC happens to be a valid version number ?
Description          M/U  #  ID  Ver  Red
AMD Athlon 2600+     M    1  2   11   17
Intel Pentium 3 700  U    1  f   ff   ff
Intel P4 2800        M    1  2   20   17
regards,
cosmo86
|
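The probing idea proposed above boils down to decoding the I/O APIC version register and checking the version field against the expected range. The sketch below assumes the register layout of the Intel 82093AA I/O APIC (version in bits 0-7, highest redirection entry in bits 16-23 of register 0x01); the type and function names are hypothetical and the raw value is passed in rather than read from hardware.

```cpp
#include <cstdint>

// Decoded contents of the I/O APIC version register (index 0x01).
struct IoApicVersion {
    std::uint8_t version;          // bits 0-7
    std::uint8_t max_redirection;  // bits 16-23, index of the last entry
};

inline IoApicVersion DecodeVersionRegister(std::uint32_t raw)
{
    IoApicVersion v;
    v.version = static_cast<std::uint8_t>(raw & 0xFF);
    v.max_redirection = static_cast<std::uint8_t>((raw >> 16) & 0xFF);
    return v;
}

// Sanity check as proposed in the mail: accept versions 0x10..0x20 only.
inline bool LooksLikeIoApic(const IoApicVersion& v)
{
    return v.version >= 0x10 && v.version <= 0x20;
}
```

Note that this range check would reject the version 0x03 reported for the Sempron board in the table above, so a real probe would probably need a more tolerant rule, for example also checking that the fields are not all 0x00 or all 0xff.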
|
From: Stephen M. W. <ste...@br...> - 2006-05-01 14:48:32
|
On 30/04/06 16:02, Daniel Raffler wrote:
>
> What I need to know is how many processors your computer has and if they
> were all detected, if the APIC test was passed and whether an I/O APIC
> was detected.
Here's the result of tests on 5 different machines.
Description                M/U  #  LAPIC     ID  Ver  Red
AMD Sempron 2800+          M    1  000f0d00  2   03   17
AMD Athlon 2200+           U    1  -         1   11   17
AMD dual-core Opteron 165  M    2  000f0d00  0   11   17
Intel Celeron              M    1  000f0000  0   11   0f
Intel P4                   M    1  000f12a0  1   20   17
Notes:
M/U indicates if the kernel detected a multiprocessor configuration or not (uniprocessor).
# indicates the number of processors successfully booted.
LAPIC is the detected address of the local APIC.
--
Stephen M. Webb
ste...@br...
|
|
From: Daniel R. <cos...@gm...> - 2006-04-30 20:06:51
|
Sorry guys, the url is actually: mitglied.lycos.de/cozmo86/example.png |
|
From: Daniel R. <cos...@gm...> - 2006-04-30 20:02:38
|
Hello,
I've now added the APIC reactivation hack that I mentioned a while ago to the code. It would be great if everybody could run the kernel on a few real computers, so that we get a better idea of how broadly the APIC is really supported. Based on this knowledge we can then decide if it's really necessary to write some backwards compatibility code for PCs that don't support the APIC timer.
What I need to know is how many processors your computer has and if they were all detected, if the APIC test was passed and whether an I/O APIC was detected.
The local APIC test sets up a timer and then waits until it runs out. The timer length is 0x1000000 cycles, so the test should be finished after a few seconds. If there is a local APIC the I/O APIC test will follow, otherwise the computer just hangs.
For the I/O APIC test the kernel simply assumes that the default location is used, and prints some of the registers that should be located there. If the APIC really exists the values should all be different: version should be between 0x10 and 0x20 and the number of redirection entries should be 0x17. In case there's no I/O APIC the values printed will all be the same (in most cases either 0x00 or 0xff).
I've uploaded an example screenshot to my webspace (mitglied.lycos.de/coszmo86/example.png) in order to give you an idea of how it should look if everything works as planned..
(Note that I just had some problems compiling the code with cygwin. In case you too encounter some errors in cpu_info.hpp you should be able to fix them by simply declaring the whole class as public.)
regards,
cosmo86
|
|
From: Daniel R. <cos...@gm...> - 2006-04-23 19:39:22
|
> Using the INIT-STARTUP-STARTUP sequence finally works and boots up AMD
> SimNow! The AP is booted up and interrupts are enabled. If I press some
> keys, the scancodes are displayed.
That's some excellent news, although I still don't quite understand where the problem really was. Bochs, as well as my HT machine, start booting right after the first SIPI was sent, so the second one is probably only meant to ensure that the BSP really waits until the AP has received the message. That's just a wild guess though, the Apic documentation just doesn't explain at all what the sequence really does.
> I discovered that there is still a slight problem on uniprocessor bochs.
> The current IRQ code relies on the presence of an I/O APIC - which does
> in general not exist on a UP system. The code tries to read one of the
> registers, which is not mapped into virtual memory, causing a page fault.
Hmm, that shouldn't be too hard to fix.
During the weekend I played around with hyperthreading a bit and managed to start my second logical processor with merely some 10 lines of code - actually I did expect it to be way less trivial. Before the new feature can be added to the kernel, I'll however have to write some cpuid & msr instruction wrappers first. As I'll be quite busy during the next few days (school starts again - yuck), I don't expect to be able to write the code before Thursday.
Apart from that I also collected some more information about local Apics on uniprocessor systems. It seems as if virtually all processors from the p54c (1994) onwards actually have a local Apic, which is however very often disabled by the BIOS. Luckily the Linux kernel includes a small hack that tries to activate the Apic by simply re-setting the very flag the BIOS cleared to disable it. According to the Intel Manuals this hack is actually not guaranteed to work, but I guess that there can't be that many problems with it, if it's in the official Linux source..
I've already tried to implement it for our kernel, but it doesn't yet work. After executing the hack the cpuid instruction does report a local Apic, but I can't access its registers. Most probably I'll just have to install an MTRR to mark the region as strong uncacheable to make it work.
regards,
cosmo86
|
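The flag mentioned in this message is, on Intel processors, the global enable bit (bit 11) of the IA32_APIC_BASE model-specific register (MSR 0x1B). The sketch below only computes the new MSR value in the way the Linux hack does (clear the base field, then set the default base and the enable bit); the privileged rdmsr/wrmsr access is left out, and the constant and function names are assumptions made for this example.

```cpp
#include <cstdint>

constexpr std::uint32_t kApicBaseMsr      = 0x1B;          // IA32_APIC_BASE
constexpr std::uint64_t kApicGlobalEnable = 1ull << 11;    // enable flag
constexpr std::uint64_t kApicBaseField    = 0xFFFFF000ull; // base, low 32 bits
constexpr std::uint64_t kApicDefaultBase  = 0xFEE00000ull; // default location

inline bool ApicEnabled(std::uint64_t msr_value)
{
    return (msr_value & kApicGlobalEnable) != 0;
}

// Returns the value that would be written back to the MSR: all other bits
// (for example the BSP flag, bit 8) are preserved.
inline std::uint64_t WithApicEnabled(std::uint64_t msr_value)
{
    msr_value &= ~kApicBaseField;
    return msr_value | kApicDefaultBase | kApicGlobalEnable;
}
```

A kernel would read the register with rdmsr, pass the value through `WithApicEnabled()` only if `ApicEnabled()` returns false, and write the result back with wrmsr.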
|
From: Manuel H. <mho...@ph...> - 2006-04-23 14:04:59
|
Hi Daniel, using the INIT-STARTUP-STARTUP sequence finally works and boots up AMD SimNow! The AP is booted up and interrupts are enabled. If I press some keys, the scancodes are displayed. I discovered that there is still a slight problem on uniprocessor bochs. The current IRQ code relies on the presence of an I/O APIC - which does in general not exist on a UP system. The code tries to read one of the registers, which is not mapped into virtual memory, causing a page fault. Regards, Manuel |
|
From: Daniel R. <cos...@gm...> - 2006-04-20 19:19:33
|
> I had a look at the MP specification and it says that there should
> always be an INIT IPI at the beginning. I've tried that, too - without
> any effect...
Hi Manuel,
I just looked it up and, according to the multiprocessor specifications, you are right here: An INIT IPI really seems to be needed before the AP may even accept a STARTUP IPI. Unless we want to support 486 processors, it's however not necessary to set the CMOS' warm reboot vector, as modern processors don't really reset on an INIT IPI. All they do is return to real-mode, before they enter a wait-for-SIPI state. Apart from that the STARTUP IPI obviously has to be sent twice in a row (Intel Reference Manual, 7.5) ?
The code I just added to the CVS now sends an INIT-SIPI-SIPI sequence exactly as specified by the documentation. Apart from that I also fixed a possible bug in SendStartupIPI() by explicitly setting the assert-flag. Actually all IPIs, except for INIT de-assert, must have this flag set, although the Intel reference manual also states that modern CPUs should ignore it. If we're lucky AMD is just a bit more picky than the rest..
regards,
cosmo86
|
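The INIT-SIPI-SIPI sequence can be illustrated by the three values written to the low half of the interrupt command register. This is a sketch under assumptions, not the code from the CVS: it uses the ICR bit layout from the Intel manuals (vector in bits 0-7, delivery mode in bits 8-10, level/assert flag in bit 14), the names are invented for the example, and the register writes and delays are only indicated in comments.

```cpp
#include <cstdint>
#include <vector>

enum class DeliveryMode : std::uint32_t { Init = 5, Startup = 6 };

constexpr std::uint32_t kLevelAssert = 1u << 14;  // the assert-flag

inline std::uint32_t MakeIcrLow(DeliveryMode mode, std::uint8_t vector)
{
    return kLevelAssert | (static_cast<std::uint32_t>(mode) << 8) | vector;
}

// entry_page is the 4 kB page number of the real-mode entry code: the AP
// starts executing at physical address entry_page * 0x1000.
inline std::vector<std::uint32_t> StartupSequence(std::uint8_t entry_page)
{
    return {
        MakeIcrLow(DeliveryMode::Init, 0),             // then wait (MP spec: 10 ms)
        MakeIcrLow(DeliveryMode::Startup, entry_page), // then wait (MP spec: 200 us)
        MakeIcrLow(DeliveryMode::Startup, entry_page), // second STARTUP IPI
    };
}
```

For each value the target's APIC ID is first written to the high half of the ICR; writing the low half then sends the IPI.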
|
From: Daniel R. <cos...@gm...> - 2006-04-19 17:45:15
|
Hello,
I've just uploaded the first part of the UML documentation I was working
on. I'm afraid that it's not really, but I nevertheless hope that it will
help you to get familiar with the code. I'll add some more as soon as
possible, but as I'll be quite busy for the next few weeks, I can't
promise anything.
I also had a look at the current code and made this small list of tasks:
/hal/apic_local:
  - add support for logical destinations (priorities, etc)
  - catch error interrupts (no handling - just hang)
  - handle spurious interrupts
  - check if there's any way to enable the Apic on uniprocessor systems
    if it was permanently disabled by the BIOS (warm reset ?, IRM 3: 8.4.3)
/hal/cpu_node: (todo)
  - implement basic task-switching
  - based on tss (stack), ports (I/O bitmap), mmu (cr3) & systemcall (ring0-3)
  - implement scheduling mechanism
/hal/exception:
  - look up whether an EOI is needed
  - provide facilities to route exceptions to the user tasks
/hal/ipi:
  - develop system for inter processor communication
  - provide functions for the TLB shootdown algorithm
/hal/irq:
  - find some solution for the nested IRQs problem (line 200)
  - design systemcalls that allow the dev manager to change IRQ specs
    (polarity, level, mask)
  - enable irq delivery to user-tasks
/hal/mmu:
  - implement the missing functions (dirty & access map, context switch)
  - find some nicer solution for the paging flags
  - develop a tlb shootdown algorithm
  - make the whole class more object oriented (built around the
    page-directory)
/hal/ports:
  - same as for the mmu: make the class more object oriented
  - implement the missing functionality (I/O bitmap) based on the TSS class
/hal/systemcall: (todo)
  - design the systemcall interface (Nucleus API)
  - implement a mechanism: interrupt or sysenter/sysexit
  - define the final systemcall technique (message register format, etc)
/hal/timer:
  - design system of timer delivery (scheduler only ? user timers ?)
  - enable interrupts for timer notifications
/hal/tss: (todo)
  - base class of ports.hpp (I/O bitmap) and cpu_node.hpp (stack pointers)
  - provides access to the TS segment's data
  - creates a new TSS for each processor and initializes it
  - no further abstraction for the gdt should be needed as only TSS have
    to be added at run-time
/loader/main:
  - clean the kernel entrypoint up, separate between system-wide and
    per cpu object creation
/loader/grub: (todo)
  - class can be built based on mp_detect, which also parses a table
  - detection of available memory, VESA support (hack | GRUB 2.0 |
    selfmade solution)
/object/*:
  - memory classes can be developed in parallel with the hal objects
  - both, hal objects and memory class, will be the foundation of the
    multiplexion layer
In my opinion fixing the local_apic code and writing the GRUB class are
the most urgent tasks right now. The systemcall class should also be easy..
regards,
cosmo86
|
|
From: Daniel R. <cos...@gm...> - 2006-04-10 16:05:59
|
Hello, I just added a new tarball to the file-releases. It doesn't yet contain the I/O Apic code, as I ran into some conceptual problems, but it does print all entries of the multiprocessor table, so that the user can at least see something. After a small discussion with Brendan from the mega-tokyo board (www.mega-tokyo.com/forum/index.php?board=1;action=display;threadid=9311) it now seems to me as if the I/O APIC couldn't totally replace the traditional PIC, at least not if we still want to receive ISA IRQs, which are for example used by floppy, keyboard, mouse and RTC. Also remapping the I/O Apic seems to be much more complicated than I originally thought, so I'll have to reconsider the whole thing first. I'll also see if I can add some smaller coding tasks to the tracker list anytime soon. The idea is that such small and local assignments could be done quite easily without having to know all the details of the steadily growing kernel. This should help to make development more parallel as everybody can work on a private module. Apart from that I've started documenting the kernel with umbrello, but it will probably take some time until the diagrams are finished.. regards, Daniel |
|
From: Daniel R. <cos...@gm...> - 2006-04-09 14:26:50
|
> Unfortunately the kernel hangs on MP SimNow!... For some strange reason
> the AP is not booted. Even worse: The timeout counter does not stop at 0,
> but instead overflows and continues below 0! Unless the kernel checks the
> code in the very short moment when it passes 0, it will wait forever...
I hope you made sure that the local apic's base-address is set up correctly. Right now the kernel always uses the default constructor, which assumes that the register base is 0xFEE00000. You can print the actual base address from mp_detect.cpp in line 93. If it's something different than the default, we just found the bug and all I'll have to do is to change the initialization routines a bit. Apart from that try doing with the LVTT what I described for the ICR yesterday..
regards,
cosmo86
|
|
From: Manuel H. <mho...@ph...> - 2006-04-09 09:19:21
|
You are right, I updated the code on thursday and so I missed your updates from friday. Now the code works perfectly fine on both UP and MP bochs, on the latter it also boots up the APs and they print their messages. It also works on UP SimNow!. Unfortunately the kernel hangs on MP SimNow!... For some strange reason the AP is not booted. Even worse: The timeout counter does not stop at 0, but instead overflows and continues below 0! Unless the kernel checks the code in the very short moment when it passes 0, it will wait forever... This seems to be a bug in SimNow! - I'll try to figure that out. However, it should be possible to boot the AP - at least my kernel does. I had a look at the MP specification and it says that there should always be an INIT IPI at the beginning. I've tried that, too - without any effect... Regards, Manuel |
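The hang described here comes from waiting for a counter to be exactly 0. One common way to make such a wait robust, shown below as a generic sketch and not as the kernel's actual timeout code, is to compare the elapsed ticks against a budget using unsigned arithmetic, which stays correct when a down-counting timer wraps past 0. The function name and parameters are hypothetical.

```cpp
#include <cstdint>

// start:  counter value sampled when the wait began
// now:    current counter value (the counter counts down and may wrap)
// budget: number of ticks to wait before giving up
inline bool TimedOut(std::uint32_t start, std::uint32_t now,
                     std::uint32_t budget)
{
    // Unsigned subtraction is modulo 2^32, so this is the number of ticks
    // elapsed even after the counter has passed 0 and wrapped around.
    const std::uint32_t elapsed = start - now;
    return elapsed >= budget;
}
```

A polling loop that calls `TimedOut()` no longer has to catch the single moment at which the counter reads 0.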
|
From: Daniel R. <cos...@gm...> - 2006-04-08 19:45:46
|
> I tested it on MP bochs and the first AP does not run into a triple fault
> just after booting anymore - I just found out that this happens because
> it is not even booted anymore... Instead, a page fault occurs on the BSP
> at 0xe0001a34, which is somewhere in mp_detect::DetectFloatingPointer.
> This is also reported correctly by the kernel. The same thing happens on
> UP bochs.
Since my last update on Friday DetectFloatingPointer() no longer allocates
a big block of memory to search for the structure, but rather uses a
single page, that is assigned a new physical address on each loop
iteration. Although I had a really thorough look at the function I
couldn't find the problem, and thus now assume that it must be some bug in
either the heap_manager or the mmu class.
What you might want to try is to add a cout statement at the beginning of
the main loop (line 137) that prints the current virtual
(virt_range.GetVirtualBase()) as well as the physical (pbase + i*4096)
address. While the physical address should increase in 4kB steps, the
virtual address must always be 0xE003F000 - if it isn't there's some
problem in the heap_manager.
Also check on which run of the loop the page-fault occurs. If it does
work the first time but crashes on the second run, something must have
gone wrong when the page was assigned its new physical address.
I just had a look at the old kernel's page class, and the only real
difference I could find is, that it invalidates the TLB entry before the
new physical address gets written to it, while I do it the other way
round. On real hardware this doesn't make any difference, but if your
bochs loads the TLB right away (and not only once the page gets accessed),
it might simply miss the new mapping. You should therefore try to reverse
the order of these two instructions (/hal/mmu.cpp line 103 <-> 106).
> Even more interesting is the result on AMD SimNow!: The UP version boots
> correctly and reports that there is only one cpu present. On the MP
> version, our kernel correctly finds 2 cpus and reports that all AP's
> have been booted - but the AP does not print its "Hello world" message.
> (There was a small bug in mp_detect::BootAllProcessors - it always
> returned true.) In fact, the AP is still not booted.
Could it be that you're using a slightly outdated version of the code ?
The latest version uses an integer return value that either holds the
number of processors booted, or zero if there were some problems during
startup.
> IMHO the usage of bit fields with "empty bits" is quite dangerous because
> the values of these empty bits are undefined and may cause conflicts.
> However, I just checked the value of the ICR and it is absolutely
> correct - but the AP still does not respond. I'll keep on trying...
I would argue that it's not any worse than with traditional flags. If you
want to make sure that the unused bits are all set to zero, all you have
to do is initialize the bitfield (lapic_icr icr = {0}). Regarding upwards
compatibility this is however hardly any better, as it might as well be
that some of the reserved bits actually have to be set in the future. To
be on the safe side one would therefore have to read the register out
first and only mask those bits that really have to be altered. This
however should also be possible using a bitfield:
lapic_icr icr = GetICR();
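To make the read-modify-write idea more concrete, here is a minimal sketch. The field layout follows the documented low dword of the ICR, but the struct definition, the fake_icr0 variable that stands in for the real register, and SendStartupIpi are assumptions for illustration and not the actual hal code. Bit-field layout is compiler-specific, hence the static_assert.

```cpp
#include <cstdint>
#include <cstring>

// Assumed layout of the low ICR dword; reserved fields are named so that
// their contents survive a read-modify-write cycle.
struct lapic_icr {
    std::uint32_t vector          : 8;
    std::uint32_t delivery_mode   : 3;
    std::uint32_t dest_mode       : 1;
    std::uint32_t delivery_status : 1;
    std::uint32_t reserved0       : 1;
    std::uint32_t level           : 1;
    std::uint32_t trigger_mode    : 1;
    std::uint32_t reserved1       : 2;
    std::uint32_t dest_shorthand  : 2;
    std::uint32_t reserved2       : 12;
};
static_assert(sizeof(lapic_icr) == sizeof(std::uint32_t),
              "bitfield must cover exactly one 32-bit register");

std::uint32_t fake_icr0 = 0;  // stand-in for the memory-mapped register

lapic_icr GetICR() {
    lapic_icr icr;
    std::memcpy(&icr, &fake_icr0, sizeof icr);  // read the whole register
    return icr;
}

void SetICR(const lapic_icr& icr) {
    std::memcpy(&fake_icr0, &icr, sizeof icr);  // write it back in one go
}

// Hypothetical helper: touches only the three fields named in the mail.
void SendStartupIpi(std::uint8_t vector) {
    lapic_icr icr = GetICR();
    icr.vector = vector;
    icr.delivery_mode = 6;   // 110b = Start-Up
    icr.dest_shorthand = 0;  // no shorthand, use the destination field
    SetICR(icr);
}
```

Compared with lapic_icr icr = {0}, this keeps whatever the hardware reported in the reserved bits instead of forcing them to zero.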
You might nevertheless be right that the unused bits are part of the
problem, as AMD's 64-bit processors may use an updated APIC version
that requires new flags. Try what happens if you only mask those bits of
the ICR that are really necessary (use lapic_icr icr = GetICR(), then only
set vector, delivery_mode & dest_shorthand). Also check if you can get
the code to work if you update the ICR's value manually:
46: WriteRegister(lapic_reg(ICR1), target << 24);
47: // SetICR(flags);
48: uint value = (ReadRegister(lapic_reg(ICR0)) & 0xFFF3F000) | 0x0608;
49: WriteRegister(lapic_reg(ICR0), value);
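For reference, the mask and the value 0x0608 in the snippet above can be derived from the ICR field positions. The sketch below does this with compile-time constants; all names in it are made up for illustration.

```cpp
#include <cstdint>

// Fields of the low ICR dword that the manual update overwrites.
constexpr std::uint32_t kVectorMask       = 0xFFu;       // bits 0-7
constexpr std::uint32_t kDeliveryModeMask = 0x7u << 8;   // bits 8-10
constexpr std::uint32_t kDestModeMask     = 0x1u << 11;  // bit 11
constexpr std::uint32_t kShorthandMask    = 0x3u << 18;  // bits 18-19

// Everything else is kept exactly as read from the register.
constexpr std::uint32_t kKeepMask =
    ~(kVectorMask | kDeliveryModeMask | kDestModeMask | kShorthandMask);
static_assert(kKeepMask == 0xFFF3F000u, "matches the mask in the mail");

constexpr std::uint32_t kStartupMode = 6;  // delivery mode 110b = Start-Up

// Hypothetical helper: new ICR0 value for a Start-Up IPI with this vector.
constexpr std::uint32_t ComposeIcrLow(std::uint32_t old_value,
                                      std::uint32_t vector) {
    return (old_value & kKeepMask) | (kStartupMode << 8) |
           (vector & kVectorMask);
}
static_assert(ComposeIcrLow(0, 0x08) == 0x0608u, "matches the value in the mail");
```

So 0x0608 is delivery mode 6 combined with vector 0x08, and the mask leaves all bits outside vector, delivery mode, destination mode and shorthand untouched.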
To exclude the faintest possibility that gcc for some reason changes
memory ordering when the ICRs are accessed, you might try to declare the
local APIC's base address (/hal/apic_local line 152) as volatile.
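A volatile-qualified base pointer could look like the sketch below. It uses an ordinary array as a stand-in for the memory-mapped register window so that it can be compiled and run anywhere; fake_apic_window, apic_base and the simplified ReadRegister/WriteRegister signatures are assumptions, not the declarations from the hal.

```cpp
#include <cstdint>

// Stand-in for the local APIC's register window (the real one is
// memory-mapped hardware, not an array).
std::uint32_t fake_apic_window[0x400 / sizeof(std::uint32_t)] = {};

// Accesses through a pointer-to-volatile may neither be dropped nor
// reordered against each other by the compiler.
volatile std::uint32_t* const apic_base = fake_apic_window;

constexpr std::uint32_t kIcr0 = 0x300;  // ICR low dword offset
constexpr std::uint32_t kIcr1 = 0x310;  // ICR high dword offset

inline std::uint32_t ReadRegister(std::uint32_t offset) {
    return apic_base[offset / sizeof(std::uint32_t)];
}

inline void WriteRegister(std::uint32_t offset, std::uint32_t value) {
    apic_base[offset / sizeof(std::uint32_t)] = value;
}
```

With this in place the write to ICR1 (destination) is guaranteed to be emitted before the write to ICR0, which is the one that actually sends the IPI. Note that volatile only constrains the compiler, not the processor.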
cheers,
Daniel
|
|
From: Manuel H. <mho...@ph...> - 2006-04-08 11:28:55
|
> > Manuel wrote:
> > I just downloaded the sources from CVS, compiled them and tested the
> > kernel with bochs, it works fine.
>
> Ehm.. does this mean that the multiprocessor code works now, or did you
> just run bochs as a uniprocessor.
I tested it on MP bochs and the first AP does not run into a triple fault
just after booting anymore - I just found out that this happens because it
is not even booted anymore... Instead, a page fault occurs on the BSP at
0xe0001a34, which is somewhere in mp_detect::DetectFloatingPointer. This is
also reported correctly by the kernel. The same thing happens on UP bochs.
Even more interesting is the result on AMD SimNow!: The UP version boots
correctly and reports that there is only one cpu present. On the MP
version, our kernel correctly finds 2 cpus and reports that all AP's have
been booted - but the AP does not print its "Hello world" message. (There
was a small bug in mp_detect::BootAllProcessors - it always returned true.)
In fact, the AP is still not booted.
IMHO the usage of bit fields with "empty bits" is quite dangerous because
the values of these empty bits are undefined and may cause conflicts.
However, I just checked the value of the ICR and it is absolutely correct -
but the AP still does not respond. I'll keep on trying...
Regards,
Manuel
|
|
From: Daniel R. <cos...@gm...> - 2006-04-07 19:47:54
|
I'm right now working on some I/O APIC code that is meant to enable IRQ
handling. With this new code the tarball should actually be much more
impressive, as we might easily add some basic keyboard support or a small
clock for demonstration purposes. Since it doesn't look as if there were
any bigger problems ahead, I'd expect to be able to finish coding this
weekend. Just think of it as a deadline: either I manage to get the new
code working by Monday or we'll release the tarball without it..
> Manuel wrote:
> I just downloaded the sources from CVS, compiled them and tested the
> kernel with bochs, it works fine.
Ehm.. does this mean that the multiprocessor code works now, or did you
just run bochs as a uniprocessor?
regards,
cosmo86
|
|
From: Manuel H. <man...@on...> - 2006-04-06 17:47:46
|
I just downloaded the sources from CVS, compiled them and tested the kernel with bochs, it works fine. I think this is a good check and indicates that the sources are ready for making a tarball. |
|
From: Victor A. <vic...@ly...> - 2006-04-06 17:39:54
|
Whether you consider it stable or not, it's nice to put it up for download, so the file can be downloaded without cvs. I think that the new kernel is good enough to be the next "target". |