From: <fh...@at...> - 2000-10-22 22:41:27
In <Pine.LNX.4.10.10010221201370.474-100000@cassiopeia.home>, on 10/22/00
at 12:04 PM, Geert Uytterhoeven <ge...@li...> said:

>On Sun, 22 Oct 2000 fh...@at... wrote:
>> In <Pine.LNX.4.10.10010181211150.841-100000@cassiopeia.home>, on 10/18/00
>> at 12:46 PM, Geert Uytterhoeven <ge...@li...> said:
>>
>> >On Tue, 17 Oct 2000 fh...@at... wrote:
>> >> In <Pine.LNX.4.10.10010171602340.394-100000@cassiopeia.home>, on 10/17/00
>> >> at 04:09 PM, Geert Uytterhoeven <ge...@li...> said:
>>
>> Thanks for the tip Geert.
>> I got the driver to pass the cache test finally.

>Good! What exactly did you have to change?

Well, my excitement has been dashed to pieces. It seems there is a memory
corruption problem in the driver.

>> I also found out something interesting: There is a limit as
>> to how many printks you can have in a driver. :)

>Really?

Probably not. I think the stack or the heap is getting corrupted, and that
is what is causing the problem.

I sent a message to Richard early this afternoon. I have attached the
message I sent him in the hope that someone can explain/help.

In <200...@li...>, on 10/17/00
at 10:09 AM, Richard Hirst <rh...@li...> said:

Richard,

I sent you a message previously about how the driver successfully
completes the cache test. After doing some more examination of the code
regarding the change I made, I have found something that puzzles me.

This is in the ncr_attach function:

    /*
    **  Align np and first ccb to 32 boundary for cache line
    **  bursting when copying the global header.
    */
    /* Fred */ np = (ncb_p) (((u_long) &host_data->_ncb_data) & NCB_ALIGN_MASK);

The above line is the original code that "came with" the driver I am
working on. The following line is in the newer sym53c8xx driver. When I
use the m_calloc function to allocate the np structure, the resulting
structure is either not aligned properly or ends up in the wrong place in
memory for the cache test to work. As you can see in the old dmesg, the
physical address of the np data structure is 82f2000.
Using the above line puts the np structure at bf34080 and the cache test
works fine. Here's my puzzlement: I don't see how the above line allocates
any storage for the np/ncb data structure. What makes me think that the
structure is not really allocated properly is the last line of the latest
dmesg. You will notice that the inst_name (before the colon) is not
printed, possibly indicating that the memory has become corrupted. (The
result of trying to boot is that the computer locks up, with no panic.)
What do you think?

    /* 4643 *//* np = m_calloc(sizeof(struct ncb), "NCB", MEMO_WARN); */

The line above is borrowed from the newer sym53c8xx driver code.

    /* 4644 */ if (!np)
    /* 4645 */     goto attach_error;
    /* NCR_INIT_LOCK_NCB(np); *//* Fred */ /* No SMP on APUS */
    host_data->ncb = np;
    /* bzero (np, sizeof (*np)); */ /* Fred */ /* Not in newer Symbios */

Does this bzero do some sort of memory allocation? I think all it does is
clear the memory?

    /* Fred */ /* Next two lines Not in newer Symbios */
    np->ccb = (ccb_p) (((u_long) &host_data->_ccb_data) & CCB_ALIGN_MASK);
    bzero (np->ccb, sizeof (*np->ccb));

Part of old dmesg:

    virt_to_phys(np): 82f2000 np: c02f2000
    ncr_cache: 00000000 pc: 0bf32e50
    ncr_cache: c02f2078 0 78
    start=0bf32e50, pc=0bf32e50, end=0bf32e7c
    c0000004 082f2078 00f40034 c0000004 00f4001c
    082f2078 c0000004 082f2078 00f4001c 98080000
    00000063 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000 00000000

This is the latest dmesg from my testing:

    ncr53c8xx: 53c770 detected
    ncr53c770-0: rev=0x00, base=0xf40000, io_port=0x0, irq=12
    new SCRIPT[3772] @c02f2000.
    new SCRIPTH[3708] @c3f33000.
    Peparing...
    myaddr: 0
    myaddr: 7
    myaddr: 7
    ncr53c770- 0: ID 7, Fast-20, Parity Checking
    ncr53c770- 0: initial SCNTL3/DMODE/DCNTL/CTEST3/4/5 = (hex) 05/c0/20/00/00/04
    ncr53c770- 0: final SCNTL3/DMODE/DCNTL/CTEST3/4/5 = (hex) 05/82/20/00/08/24
    SCSI reset cleared
    test: aabbff11 aabbff11
    virt_to_phys(np): bf34080 np: c3f34080
    ncr_cache: 00000000 pc: 0bf33e50
    ncr_cache:ptr: c3f340d8 val: 0 offset: 58
    np->ncr_cache: faedbeef ncr_wr: deadfead cache addr: bf340d8
    t_istat: 1 dstat: 84 (1 128) dsp: bf33e7c dsps: 63 (200490620 99)
                           <-- Snooptest runs successfully
    sstat2: a (10)
    term pc: bf33e7c
    requesting irq...      <-- now we request a shared IRQ
    shared irq
    : resetting, command processing suspended for 2 seconds
                           <-- This should be printing ncr53c770-x before
                               the colon. That inst_name is in the np/ncb
                               data structure. See following code.

Some more code:

    static void ncr_start_reset(ncb_p np, int settle_delay)
    {
        u_long flags;

        save_flags(flags); cli();

        if (!np->settle_time) {
            if /* (bootverbose > 1) */ (1)
                printf("%s: resetting, command processing suspended for %d seconds\n",
                       ncr_name(np), settle_delay);
            np->settle_time = jiffies + settle_delay * HZ;
            OUTB (nc_istat, SRST);

Fred
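
On the question of where the storage comes from: a minimal sketch, not the driver's code, of how a masking expression like the one in ncr_attach can yield a usable pointer without allocating anything. It assumes the containing structure reserves a filler of (alignment - 1) bytes in front of the embedded member, so that rounding the member's address down to a 32-byte boundary still lands inside the container. Whether host_data is really laid out that way is an assumption here; the post does not show its definition. All demo_* and DEMO_* names are hypothetical. (bzero, for its part, only writes zero bytes into memory that already exists; it allocates nothing.)

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the driver's real definitions. */
#define DEMO_ALIGN_SIZE 32u
#define DEMO_ALIGN_MASK (~(uintptr_t)(DEMO_ALIGN_SIZE - 1))

struct demo_ncb {
    char     inst_name[16];
    uint32_t cache;
};

/* Assumed layout: a filler in front of the embedded structure leaves room
 * for the address to be rounded down to an aligned boundary. */
struct demo_host_data {
    struct demo_ncb *ncb;                            /* aligned pointer */
    char             ncb_align[DEMO_ALIGN_SIZE - 1]; /* room for rounding */
    struct demo_ncb  _ncb_data;                      /* embedded storage */
};

/* Round the embedded member's address down to a 32-byte boundary.
 * Nothing is allocated; only an address inside hd is computed. */
static uintptr_t demo_align_down(const struct demo_host_data *hd)
{
    return (uintptr_t)&hd->_ncb_data & DEMO_ALIGN_MASK;
}

/* Return 1 if the aligned address is on a 32-byte boundary and a whole
 * struct demo_ncb starting there fits inside the container. */
static int demo_aligned_inside(const struct demo_host_data *hd)
{
    uintptr_t end = (uintptr_t)hd + sizeof *hd;
    uintptr_t np  = demo_align_down(hd);

    return np % DEMO_ALIGN_SIZE == 0
        && np >= (uintptr_t)hd->ncb_align
        && np + sizeof(struct demo_ncb) <= end;
}

/* The only allocation is the container itself; the aligned structure
 * simply lives somewhere inside it. */
static int demo_check(void)
{
    struct demo_host_data *hd = calloc(1, sizeof *hd);
    int ok;

    if (hd == NULL)
        return 0;
    ok = demo_aligned_inside(hd);
    free(hd);
    return ok;
}
```

If the real structure follows this pattern, the memory behind np belongs to whatever allocated host_data, and the mask only picks an aligned spot within it. Without such a filler the same expression could round down into whatever sits in front of the member, which would be one way to get corruption of the kind described above.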