From: Eric K. <ek...@rz...> - 2002-08-30 22:16:28
|
Hi! I've got some good news for you. Today I started working on SetupLdr (a modified FreeLoader) and Isoboot (the CD boot sector) again. After a lot of useless attempts I found out that the BIOS of one of my machines has a bug that is not handled correctly by the current Isoboot code. Even the latest SYSLINUX release freezes this machine. The problem is that the read buffer in a call to read CD sectors must not cross segment boundaries. I reduced the sector count limit and now SetupLdr boots from CD. ;-) I plan to run a few more tests and commit the changes tomorrow. So if anyone wants an ISO image, let me know. Eric |
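One way to stay clear of this kind of BIOS bug is to clamp each read so the buffer never runs past the end of its segment. The sketch below is written in C purely for illustration. It assumes that a "segment boundary" means a 64 KiB boundary of the buffer's linear address and that a CD sector is 2048 bytes; the function and macro names are hypothetical. It is not the change described above, which only reduced the sector count limit.

```c
#include <stdint.h>

#define CD_SECTOR_SIZE 2048u    /* assumed CD sector size in bytes */
#define SEGMENT_SIZE   0x10000u /* assumed boundary: 64 KiB */

/*
 * Return how many whole sectors can be read into the buffer at linear
 * address buf_linear without crossing the next 64 KiB boundary, capped
 * at the number of sectors the caller wanted.
 */
uint32_t sectors_before_boundary(uint32_t buf_linear, uint32_t wanted)
{
    /* Bytes left between the buffer start and the next boundary. */
    uint32_t room = SEGMENT_SIZE - (buf_linear & (SEGMENT_SIZE - 1u));
    uint32_t fit = room / CD_SECTOR_SIZE;

    return wanted < fit ? wanted : fit;
}
```

A result of 0 means that not even one sector fits before the boundary, so a caller would have to move the buffer before reading.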
From: Royce M. I. <ro...@ev...> - 2002-08-31 21:04:13
|
Okay, I finally had a little bit of time to play around with ReactOS again, so I downloaded the latest CVS, compiled with DBG=1, and it's still crashing, but now, at least, I have been able to produce a valid stack trace. I took the screen dump and ran addr2line on each entry. Here are my results:

--------------------------------------------------------------------------------

Here is the line that's crashing:

portio.c:68 : __asm__ __volatile__ ("cld ; rep ; insw" : "=D" (Buffer), "=c" (Count) : "d" (Port),"0" (Buffer),"1" (Count));

Here are my results on the stack trace:

<hal.dll: 4228> portio.c:68 READ_PORT_BUFFER_USHORT -> see the code pasted above
<atapi: 1a3c> atapi.c:782 AtapiInterrupt() calling IDEReadBlock() /* defined as READ_PORT_BUFFER_USHORT in ide.h */
<scsiport: 25b1> scsiport.c:1303 ScsiPortIsr() calling DeviceExtension->HwInterrupt()
<ntoskrnl.exe: 2388> irq.c:423 KiInterruptDispatch2() calling isr->ServiceRoutine()
<hal.dll: 5694> irql.c:94 HalpExecuteIrqs() calling KiInterruptDispatch2()
<hal.dll: 56e8> irql.c:108 HalpLowerIrql() calling HalpExecuteIrqs()
<hal.dll: 57f3> irql.c:164 KfLowerIrql() calling HalpLowerIrql()
<hal.dll: 580f> irql.c:188 KeLowerIrql() calling KfLowerIrql()
<ntoskrnl.exe: ec7f> spinlock.c:50 KeSynchronizeExecution() calling KeLowerIrql()
<scsiport: 1de6> scsiport.c:956 ScsiPortStartIo() calling KeSynchronizeExecution()
<ntoskrnl.exe: 39d73> queue.c:133 IoStartPacket() calling DeviceObject->DriverObject->DriverStartIo()
<scsiport: 19f5> scsiport.c:736 ScsiPortDispatchScsi() calling IoStartPacket()
<ntoskrnl.exe: 36d04> irp.c:130 IofCallDriver() calling DriverObject->MajorFunction[Param->MajorFunction]()
<ntoskrnl.exe: 36d2b> irp.c:141 IoCallDriver() calling IofCallDriver
<ntoskrnl.exe: 31d6e> device.c:531 IopInitializeDriver() calling DriverEntry()
<ntoskrnl.exe: 42191> loader.c:538 LdrInitializeBootStartDriver() calling IopInitializeDriver()
<ntoskrnl.exe: d507> main.c:513 ExpInitializeExecutive() calling LdrInitializeBootStartDriver()
<ntoskrnl.exe: d641> main.c:593 KiSystemStartup() calling ExpInitializeExecutive()
<ntoskrnl.exe: dbbc> main.c:752 _main() calling KiSystemStartup()
<ntoskrnl.exe: 126c> invalid :)

--------------------------------------------------------------------------------

Here's the screen dump itself:

000
(ke/main.c:436) Module: 'system32\ntoskrnl\ntoskrnl.sym' at c0115000, length 0x00002000
(ke/main.c:436) Module: 'hal\halx86\hal.sym' at c0117000, length 0x0003d000
(ke/main.c:436) Module: 'drivers\storage\scsiport\scsiport.sym' at c0154000, length 0x00030000
(ke/main.c:436) Module: 'drivers\storage\atapi\atapi.sym' at c0154000, length 0x00031000
(ke/main.c:436) Module: 'drivers\storage\class2\class2.sym' at c01b5000, length 0x00031000
(ke/main.c:436) Module: 'drivers\fs\vfat\vfstfs.sym' at c0216000, length 0x0003f000
(ke/main.c:484) Process registry chunk at c0113000
(ke/main.c:512) Initializing driver 'system32\drivers\scsiport.sys' at c00d4000, length 0x0000a000
Initializing system32\drivers\scsiport.sys...
DriverBase for system32\drivers\scsiport.sys: dccb3000
(ke/main.c:512) Initializing driver 'system32\drivers\atapi.sys' at c00de000, length 0x00009000
Initializing system32\drivers\atapi.sys...
DriverBase for system32\drivers\atapi.sys: dccbb000
(ke/main.c:512) Initializing driver 'system32\drivers\class2.sys' at c00e7000, length 0x0000b000
Initializing system32\drivers\class2.sys...
DriverBase for system32\drivers\class2.sys: dccc3000
(ke/main.c:512) Initializing driver 'system32\drivers\disk.sys' at c00f2000, length 0x00009000
Initializing system32\drivers\disk.sys...
DriverBase for system32\drivers\disk.sys: dcccc000
Page fault at high IRQL was 12
Bug detected code: 0x1D
Page Fault Exception: 14(2)
Processor: 0 CS:EIP 8:c0259228 <hal.dll: 4228>
cr2 c04e2000 cr3 28e000
Proc: c0474b3a Pid: 1 <SYSTEM> Thrd: c04857e4 Tid: 1
DS 10 ES 10 FS 30 GS 10
EAX: 00006868 EBX: c04ea85a ECX: 0000442d
EDX: 00000170 EBP: c00b40a0 ESI: 00200000
EDI: c04e2000 EFLAGS: 00010212
kESP: c00b4028 kernel stack base c00b2000
ESP c00b4028
Frames: <atapi: 1a3c> <scsiport: 25b1> <ntoskrnl.exe: 2388> <hal.dll: 5694> <hal.dll: 56e8> <hal.dll: 57f3> <hal.dll: 580f> <ntoskrnl.exe: ec7f> <scsiport: 1de6> <ntoskrnl.exe: 39d73> <scsiport: 19f5> <ntoskrnl.exe: 36d04> <ntoskrnl.exe: 36d2b> <ntoskrnl.exe: 31d6e> <ntoskrnl.exe: 42191> <ntoskrnl.exe: d507> <ntoskrnl.exe: d641> <ntoskrnl.exe: dbbc> <ntoskrnl.exe: 126c> |
From: Eric K. <ek...@rz...> - 2002-08-31 22:43:54
|
"Royce Mitchell III" <ro...@ev...> wrote: Hello Royce! > <atapi: 1a3c> atapi.c:782 AtapiInterrupt() calling IDEReadBlock() /* defined as READ_PORT_BUFFER_USHORT in ide.h */ Please add a DPRINT1() before line 782 (in drivers/storage/atapi/atapi.c) that prints TargetAddress and TransferSize of the call to IDEReadBlock(). Something like: DPRINT1("TargetAddress %lx TransferSize %lu\n", (ULONG)TargetAddress, TransferSize); Do the last printed TargetAddress and TransferSize differ from the previous ones? Eric |
From: Royce M. I. <ro...@ev...> - 2002-08-31 23:49:06
|
Hello Eric,

Here are the results:

(atapi.c:782) TargetAddress ccc64482 TransferSize 512
(atapi.c:782) TargetAddress ccc64482 TransferSize 512
(atapi.c:782) TargetAddress c04dd78a TransferSize 53456

Kind of an odd transfer size on that last one...

Royce3 |
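The last line of that output can be checked against the register dump from the earlier crash report. The sketch below only does the arithmetic. It assumes the earlier crash used the same TargetAddress, that EAX held the initial word count and that ECX held the words still to be read when the fault hit; the helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Number of 16-bit words in a transfer of size_bytes bytes. */
uint32_t words_in_transfer(uint32_t size_bytes)
{
    return size_bytes / 2u;
}

/* Destination address after words_done 16-bit words have been stored. */
uint32_t address_after_words(uint32_t target, uint32_t words_done)
{
    return target + 2u * words_done;
}

/* Nonzero when addr is the first byte of a page. */
int is_page_start(uint32_t addr)
{
    return (addr & (PAGE_SIZE - 1u)) == 0;
}
```

With TargetAddress c04dd78a and TransferSize 53456, the word count is 0x6868 (the EAX value in the dump) and the end of the buffer would be c04ea85a (the EBX value). Subtracting the remaining count in ECX (0x442d) from 0x6868 and advancing the target by that many words gives c04e2000, which is both EDI and cr2 in the dump. Under the assumptions above, the copy ran forward from the target until it reached a page that was not mapped.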
From: Eric K. <ek...@rz...> - 2002-09-01 12:42:39
|
"Royce Mitchell III" <ro...@ev...> wrote: > Here are the results: > > (atapi.c:782) TargetAddress ccc64482 TransferSize 512 > (atapi.c:782) TargetAddress ccc64482 TransferSize 512 > (atapi.c:782) TargetAddress c04dd78a TransferSize 53456 > > Kind of an odd transfer size on that last one... Well, it clearly shows that something is going wrong. In AtapiInterrupt() you will find some more DPRINT() macros. Enable one of them, by changing it to DPRINT1(), to get some more information about the failed data transfer. For example, enable the DPRINT() at line 718 to check the validity of the SRB (Scsi Request Block). The SRB should be the same for each valid interrupt of a data-transfer. Or enable the DPRINT()s at lines 757, 758 and 771 to check the progress of the current transfer. IsLastBlock should be TRUE at the last interrupt of a transfer. TransferLength is the number of bytes that will have to be transferred to complete the transfer. It should diminish by 512 (0x200) upon each interrupt. Your harddisk might generate an excessive interrupt after the data transfer has completed. In this case atapi.sys might not identify this interrupt as invalid. Eric |
From: Royce M. I. <ro...@ev...> - 2002-09-01 20:48:01
|
Hello Eric,

Turning on all those lines apparently changes the timing, because ReactOS now hangs at a different point. I'm going to try to combine the information into a single DPRINT1 to see if that fixes the timing problem.

Sunday, September 1, 2002, 7:47:37 AM, you wrote:

EK> In AtapiInterrupt() you will find some more DPRINT() macros. Enable one of
EK> them, by changing it to DPRINT1(), to get some more information about the
EK> failed data transfer.
EK> For example, enable the DPRINT() at line 718 to check the validity of the
EK> SRB (Scsi Request Block). The SRB should be the same for each valid
EK> interrupt of a data-transfer.
EK> Or enable the DPRINT()s at lines 757, 758 and 771 to check the progress of
EK> the current transfer. IsLastBlock should be TRUE at the last interrupt of a
EK> transfer. TransferLength is the number of bytes that will have to be
EK> transferred to complete the transfer. It should diminish by 512 (0x200) upon
EK> each interrupt.
EK> Your harddisk might generate an excessive interrupt after the data transfer
EK> has completed. In this case atapi.sys might not identify this interrupt as
EK> invalid.
EK> Eric |
From: Royce M. I. <ro...@ev...> - 2002-09-01 21:10:04
|
Hello Eric,

Here's the relevant snippet:

Initializing system32\drivers\disk.sys...
DriverBase for system32\drivers\disk.sys: dcccc000
(atapi.c:783) Srb c04de636 XferLen 0 XferSz 512 LastBlk TRUE TgtAddr ccc64482
(atapi.c:783) Srb c04de636 XferLen 0 XferSz 512 LastBlk TRUE TgtAddr ccc64482
(atapi.c:783) Srb c00b4440 XferLen 0 XferSz 53456 LastBlk TRUE TgtAddr c04dd78a
Page fault at high IRQL was 12
Bug detected code: 0x1D
Page Fault Exception: 14(2)

Just so there's no confusion, here's my atapi.c:783 :

DPRINT1("Srb %p XferLen %lu XferSz %lu LastBlk %s TgtAddr %lx\n",
        Srb,
        Srb->DataTransferLength,
        TransferSize,
        (IsLastBlock) ? "TRUE" : "FALSE",
        (ULONG)TargetAddress );

Why would Srb->DataTransferLength be 0???

Royce3

Sunday, September 1, 2002, 7:47:37 AM, you wrote:

EK> Well, it clearly shows that something is going wrong.
EK> In AtapiInterrupt() you will find some more DPRINT() macros. Enable one of
EK> them, by changing it to DPRINT1(), to get some more information about the
EK> failed data transfer.
EK> For example, enable the DPRINT() at line 718 to check the validity of the
EK> SRB (Scsi Request Block). The SRB should be the same for each valid
EK> interrupt of a data-transfer.
EK> Or enable the DPRINT()s at lines 757, 758 and 771 to check the progress of
EK> the current transfer. IsLastBlock should be TRUE at the last interrupt of a
EK> transfer. TransferLength is the number of bytes that will have to be
EK> transferred to complete the transfer. It should diminish by 512 (0x200) upon
EK> each interrupt.
EK> Your harddisk might generate an excessive interrupt after the data transfer
EK> has completed. In this case atapi.sys might not identify this interrupt as
EK> invalid.
EK> Eric |
From: Royce M. I. <ro...@ev...> - 2002-09-01 21:19:35
|
>> (atapi.c:782) TargetAddress ccc64482 TransferSize 512
>> (atapi.c:782) TargetAddress ccc64482 TransferSize 512
>> (atapi.c:782) TargetAddress c04dd78a TransferSize 53456

I don't know if this means anything, but that last transfer size is 0xD0D0 in hex. That variable is being built from calls to IDEReadCylinderLow and IDEReadCylinderHigh. Could 0xD0 be an error code return value from those functions? I don't have a clue, seeing as how they are just defines for functions that read in from ports.

Royce3 |
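The observation can be made concrete by splitting the size back into the two bytes it was built from. The sketch below assumes the high byte comes from the cylinder-high register and the low byte from the cylinder-low register, and it uses the standard ATA status bit values (BSY 0x80, DRDY 0x40, DSC 0x10). The names are hypothetical, and the idea that the two reads returned a status byte instead of a byte count is only one possible reading, not something the thread confirms.

```c
#include <stdint.h>

/* Standard ATA status register bits (macro names chosen for this sketch). */
#define SR_BUSY 0x80u /* BSY: device is busy */
#define SR_DRDY 0x40u /* DRDY: device ready */
#define SR_DSC  0x10u /* DSC: seek complete */

/* Build the byte count from the two 8-bit register reads. */
uint32_t make_transfer_size(uint8_t cyl_low, uint8_t cyl_high)
{
    return ((uint32_t)cyl_high << 8) | cyl_low;
}

/*
 * Heuristic: both reads returned the same byte and that byte has the
 * BUSY bit set, so the pair looks more like a repeated status byte
 * than a real byte count.
 */
int looks_like_status_echo(uint8_t cyl_low, uint8_t cyl_high)
{
    return cyl_low == cyl_high && (cyl_low & SR_BUSY) != 0;
}
```

Two reads of 0xD0 give 0xD0D0, which is 53456. Read as a status byte, 0xD0 would be BSY, DRDY and DSC set together, whereas a normal 512-byte block would show up as low byte 0x00 and high byte 0x02.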
From: Royce M. I. <ro...@ev...> - 2002-09-04 20:48:12
|
Eric,

I put in the following code where you suggested. The OS is locking up before it gets to the page fault. I don't have the screen capture, but will get it to you as soon as I get a chance.

> if (DeviceStatus & IDE_SR_BUSY)
> {
>   /* Wait for BUSY to drop */
>   for (Retries = 0; Retries < IDE_MAX_BUSY_RETRIES; Retries)
>   {
>     DeviceStatus = IDEReadStatus(CommandPortBase);
>     if (!(DeviceStatus & IDE_SR_BUSY))
>     {
>       break;
>     }
>     ScsiPortStallExecution(10);
>   }
>   if (Retries >= IDE_MAX_BUSY_RETRIES)
>   {
>     DPRINT1("Drive is BUSY for too long\n");
>     /* FIXME: handle timeout */
>   }
> } |
From: Eric K. <ek...@rz...> - 2002-09-04 21:28:20
|
"Royce Mitchell III" <ro...@ev...> wrote: > I put in the following code where you suggested. The OS is locking up > before it gets to the page fault. I don't have the screen capture, but > will get it to you as soon as I get a chance. It is locking up because the retry counter is not incremented. Silly me! :-( I just updated the atapi driver in the CVS tree. Please check out the latest version and try again. Eric |