Kernel panic when drive is connected via USB

Help
Ed Finegan
2004-03-07
2004-04-02
  • Ed Finegan

    Ed Finegan - 2004-03-07

    I have a 200GB USB 2.0/FireWire drive formatted ext3. When connected to my iBook G4 via FireWire, all is well. However, when the same drive is connected via USB 2 on the same iBook, I get a kernel panic within a few seconds of trying to write data. I have also tested it on a PowerBook G3 Lombard via USB 1 and received the same results. This has happened with versions 1.1 and 1.1.1. Feel free to contact me if any more information is needed.

     
    • Brian Bergstrand

      Ed,

      I will need to see the panic log file (/Library/Logs/panic.log). You can send the file to me through e-mail, or post it here.

      Also, if the fs is corrupted, all bets are off. 'sudo fsck_ext2 -fy' will make sure the fs is clean.

      Finally, bad hardware is a possibility. See the "can't copy from drive" thread for an example of bad hardware problems.

      PS. Don't use 1.1; there are a couple of known panics in that version.

       
    • Ed Finegan

      Ed Finegan - 2004-03-08

      OK, I had 1.1 on one Mac and 1.1.1 on the other. I uninstalled 1.1 and replaced it with 1.1.1. The drive works fine over FireWire on the Mac and, I think, over USB on my Linux box (not thoroughly tested there yet, though). When I run fsck_ext2 -fn I get the following results. This is from my Mac connected with FireWire:

      sudo /sbin/fsck_ext2 -fn /dev/disk1s2
      Password:
      e2fsck 1.34 (25-Jul-2003)
      Pass 1: Checking inodes, blocks, and sizes
      Pass 2: Checking directory structure
      Pass 3: Checking directory connectivity
      Pass 4: Checking reference counts
      Pass 5: Checking group summary information
      Backups: 12493/24428544 files (0.3% non-contiguous), 1826399/48840236 blocks

      Here is the panic log from my iBook G4. If it will help, I can also post the log from my PowerBook G3 Lombard tomorrow. These panics were caused by connecting the drive via USB and then trying to write to it.

      Sun Mar  7 15:18:33 2004

      panic(cpu 0): ext2_cmap: allocation requested inside a block (possible filesystem corruption): qbmask=4095, inode=5177357, offset=1179136, blkoff=3584
      Latest stack backtrace for cpu 0:
            Backtrace:
               0x000833B8 0x0008389C 0x0001ED8C 0x1D20A500 0x000B79DC 0x000BB820 0x000BB4E4 0x000BB310
               0x1D205294 0x000CD1C8 0x00204134 0x0020422C 0x0023DD24 0x00093D20 0x00720069
            Kernel loadable modules in backtrace (with dependencies):
               net.sourceforge.ext2fs.fs.ext2(1.1)@0x1d1f7000
               com.netlock.kext.NetlockKernel(3.0)@0x708000
      Proceeding back via exception chain:
         Exception state (sv=0x1D0A3C80)
            PC=0x9001050C; MSR=0x0000F030; DAR=0x196FB008; DSISR=0x40000000; LR=0x90297C98; R1=0xF02846C0; XCP=0x00000030 (0xC00 - System call)

      Kernel version:
      Darwin Kernel Version 7.2.0:
      Thu Dec 11 16:20:23 PST 2003; root:xnu/xnu-517.3.7.obj~1/RELEASE_PPC

      *********

      Mon Mar  8 00:08:16 2004

      panic(cpu 0): ext2_cmap: allocation requested inside a block (possible filesystem corruption): qbmask=4095, inode=12, offset=1176064, blkoff=512
      Latest stack backtrace for cpu 0:
            Backtrace:
               0x000833B8 0x0008389C 0x0001ED8C 0x20494518 0x000B79DC 0x000BB820 0x000BB4E4 0x000BB310
               0x2048F2AC 0x000CD1C8 0x00204134 0x0020422C 0x0023DD24 0x00093D20 0x0C000003
            Kernel loadable modules in backtrace (with dependencies):
               net.sourceforge.ext2fs.fs.ext2(1.1.1)@0x20481000
      Proceeding back via exception chain:
         Exception state (sv=0x1F69FA00)
            PC=0x9001050C; MSR=0x0000F030; DAR=0xE13B2000; DSISR=0x42000000; LR=0x90297C98; R1=0xF02036C0; XCP=0x00000030 (0xC00 - System call)

      Kernel version:
      Darwin Kernel Version 7.2.0:
      Thu Dec 11 16:20:23 PST 2003; root:xnu/xnu-517.3.7.obj~1/RELEASE_PPC

      
      *********

      Tomorrow I will try to use it more on my Linux box via USB to see if there are any problems.

      Please let me know if any more info is needed.

      Thanks
      Ed

       
    • Brian Bergstrand

      Ed,

      "ext2_cmap: allocation requested inside a block (possible filesystem corruption)"

      I can tell you right away that these types of panics are due to fs corruption or bad hardware. There really is no other possibility.

      If the drive works fine with FireWire, then there may be a USB problem with your computer or the drive enclosure. In fact, the "can't copy from drive" thread that I mentioned before was caused by the latter. The user's USB2 drive enclosure worked fine on Linux (PC) and OS 9, but refused to work properly under OS X. Transferring the drive to a different enclosure (FW 400) allowed it to work properly.

      Actually, there is one other possibility. I noticed in the first panic that another kext was in the backtrace (Netlock). There may be some kind of conflict between ext2 and Netlock that is causing the problem, but I don't think this is likely, as you say the drive works fine over FW and the second panic does not implicate Netlock.

      HTH.

       
    • Nobody/Anonymous

      Hi,

      I have the same problem on my iBook G4 with a Zip 100 USB drive. I am sharing an ext2 "CVS repository" disk between Linux and OS X.
      On OS X, your driver seems to work fine (great job!!), but when I try to write some files (not all), it crashes the kernel. I first thought it was the hardware, so I tested the disk on OS X and Linux with fsck, but the disk seems OK.
      I swapped in a new disk, but the same error happened: it always crashes the kernel.
      The disk was formatted on Linux; does that matter? Do you think it's a hardware problem with no solution?

      Thanks for your help (and excuse my poor English).

      Cyril.

       
      • Brian Bergstrand

        Cyril,

        Can you post your /Library/Logs/panic.log file here? It will give me more information to go on. USB seems to be the common problem here.

         
    • Nobody/Anonymous

      Hi,

      I was on holiday last week, sorry for the slow response.
      Unfortunately, I have no panic.log file on my iBook. Is there something I need to do to enable the creation of this file?
      Thanks.

      Cyril

       
      • Brian Bergstrand

        If there is no panic log, one of three things happened:

        1. The panic info was too large to fit in nvram.
        2. There was an error storing the info in nvram.
        3. The kernel crashed so hard, it was unable to create a log.

        In all these cases there is nothing you can do. This really sounds like a hardware problem (bad USB, or bad RAM). Bad hardware can cause option 3 to occur.

        Let me know if you get any more info.

         
    • Nobody/Anonymous

      Hi Brian,

      Yesterday I tested the new version of Panther, 10.3.3, and it crashes the same way as the previous one (10.3.2), except that this time I got a panic.log:

      Thu Mar 25 20:21:34 2004

      panic(cpu 0): ext2_cmap: allocation requested inside a block (possible filesystem corruption): qbmask=1023, inode=63, offset=196096, blkoff=512
      Latest stack backtrace for cpu 0:
            Backtrace:
               0x000834B8 0x0008399C 0x0001EDA4 0x144A5518 0x000B7DDC 0x000BBC20 0x000BB8E4 0x000BB710
               0x144A02AC 0x000CD654 0x002069C0 0x00206AB8 0x002405B4 0x00093E20 0x63617465
            Kernel loadable modules in backtrace (with dependencies):
               net.sourceforge.ext2fs.fs.ext2(1.1.1)@0x14492000
      Proceeding back via exception chain:
         Exception state (sv=0x140BDA00)
            PC=0x9001050C; MSR=0x0000F030; DAR=0x11884008; DSISR=0x40000000; LR=0x90297A48; R1=0xF14A86C0; XCP=0x00000030 (0xC00 - System call)

      Kernel version:
      Darwin Kernel Version 7.3.0:
      Fri Mar  5 14:22:55 PST 2004; root:xnu/xnu-517.3.15.obj~4/RELEASE_PPC

      I think you're right about the hardware problem, because my Zip drive sometimes makes weird sounds.
      I ran a new test: I formatted the crashing disk as FAT32 and wrote files to it until it was full. The OS X kernel didn't crash, but when I tried to read the disk on my Win2000 computer, some of the files were corrupted. There is a good chance that my Zip drive is defective.
      I'm going to replace it with a USB key to solve the problem (I hope)!!! :>

      By the way, I was just thinking of one thing:
      the FAT32 driver does not seem to crash the kernel but can write bad data to the disk, while the ext2 driver crashes the kernel when it runs into bad data.
      Do you think there is a way of telling the user that the disk/device is corrupted instead of crashing the kernel?
      Could that be a new feature request?

      Many thanks for your help

      Cyril

       
      • Brian Bergstrand

        Cyril,

        This is the same panic as others in this thread and it means that the filesystem is corrupted or there is bad hardware. Basically, the panic happens because someone requested a block inside of another block. Big no no, and that should never happen with a valid filesystem.
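        As a rough illustration of what "inside a block" means, the three panic logs in this thread can be checked by hand. This is only a sketch, not the kext's code: it assumes qbmask is the block size minus one (4095 for 4K blocks, 1023 for 1K blocks), which the logged values are consistent with, and the function names are made up for the example.

```python
def block_offset(offset: int, qbmask: int) -> int:
    """Byte position of `offset` within its block (assumes qbmask = block size - 1)."""
    return offset & qbmask


def is_block_aligned(offset: int, qbmask: int) -> bool:
    """True when `offset` falls exactly on a block boundary."""
    return block_offset(offset, qbmask) == 0


# (qbmask, offset, blkoff) copied from the three panic messages in this thread.
LOGGED_PANICS = [
    (4095, 1179136, 3584),  # iBook G4, ext2 1.1
    (4095, 1176064, 512),   # iBook G4, ext2 1.1.1
    (1023, 196096, 512),    # Zip 100 disk, ext2 1.1.1
]

for qbmask, offset, logged_blkoff in LOGGED_PANICS:
    computed = block_offset(offset, qbmask)
    # The value recomputed from offset and qbmask matches the logged blkoff.
    assert computed == logged_blkoff
    print(f"offset={offset} qbmask={qbmask} blkoff={computed} "
          f"aligned={is_block_aligned(offset, qbmask)}")
```

        In every logged case the offset is not a multiple of the block size, which is exactly the condition the panic message complains about.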

        The reason we panic is that there is no easy way to recover from this, either in memory or on disk. Memory will be cleared on reboot, and the disk will be dirty and hence require an fsck. So the panic solves recovery from both errors. FAT uses a completely different allocation method, and I also don't think it supports the cluster routines, which include the cmap call. That's why you are not panicking, but instead get silently corrupted data. UFS is very similar to ext2, so if you format the disk as UFS you should get a similar panic.

        HTH.

         
    • R.J.V. Bertin

      R.J.V. Bertin - 2004-03-31

      Re: corrupted FAT filesystem: for me, this only happens when I don't cleanly unmount the disk.
      Re: FAT32: FAT32 drives must be at least 512MB in size, if memory serves me well. Some utilities will create smaller ones, but these can cause problems.

       
    • R.J.V. Bertin

      R.J.V. Bertin - 2004-04-02

      Brian,

      I just tested your hypothesis that a panic should also occur with UFS. I formatted a 6MB CF card as a single UFS partition, and had NO problems using it, neither writing to it nor reading from it. I can't recall having had problems with ext2 partitions this small, but I certainly had them while handling files of around 1MB. Same hardware, of course.

      When I have time this weekend, I may try with a larger partition (I would need to free up a 256MB card). But is there any reason to suspect a correlation with the partition size?

      Also, what I'm wondering: why would one get the same allocation error when reading (also from a RO mount) and when writing?

      Also, is there really no other way to handle this (I understand "you" generate the panic)? Something like killing the process and raising an I/O error on the partition, possibly causing it to be forcibly unmounted? Possibly when you detect that the device is mounted via USB (supposing bad hardware rather than a corrupted fs)? Or is the memory corruption so bad that the whole system might be affected?

      Aside from that: does anyone have suggestions as to where to find a suitable/compatible USB (or FW) CF/multi-card reader?

       
      • Brian Bergstrand

        R.J.,

        I did say "should cause a similar panic". :)

        Seriously though, UFS is similar to Ext2, but its allocation methods are different enough that you may not get the problem. Did you verify that any data written to the drive was valid (using MD5)?
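        One way to run that MD5 check is to hash the original file and the copy read back from the card, then compare the digests. The sketch below is just an illustration using Python's standard hashlib; the helper names are hypothetical and any paths you pass in are your own.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: str, copy: str) -> bool:
    """True when both files have the same MD5 digest."""
    return md5_of_file(original) == md5_of_file(copy)
```

        For a meaningful result, unmount and remount the card (or read it on another machine) before hashing the copy, so the data really comes from the device and not from the cache.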

        6MB is kind of small, and you said that you don't remember any problems with an Ext2 partition of that size. I don't think there should be any difference between a 6MB and a 256MB partition. Both would have a block size of 1K, and that is the only thing that I can think of that might make a difference.

        The panic can happen when reading and writing, because both go through the block cluster support in the kernel. The cmap function converts an offset into a block number and then does some read ahead semantics to get the largest possible transfer size. In the case of reading, no new blocks are created, but it still calls the block map routine to "allocate" existing blocks for the read ahead.
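        To make that a little more concrete, here is a toy model of the idea. It is not the kext's cmap code: the function name, the list-based block map, and the max_run limit are all assumptions made for the example, and the real code panics where this sketch raises an exception.

```python
from typing import List, Tuple


def cmap_sketch(offset: int, block_size: int,
                block_map: List[int], max_run: int) -> Tuple[int, int]:
    """Map a file offset to (first physical block, contiguous bytes available).

    block_map[i] is the physical block backing logical block i of the file
    (a stand-in for the filesystem's real block map).
    """
    if offset % block_size != 0:
        # The condition behind the panic in this thread: offset inside a block.
        raise ValueError("allocation requested inside a block")

    lbn = offset // block_size          # logical block number
    start = block_map[lbn]

    # Read-ahead: extend the run while the following logical blocks are
    # physically contiguous, up to max_run blocks.
    run = 1
    while (run < max_run
           and lbn + run < len(block_map)
           and block_map[lbn + run] == start + run):
        run += 1
    return start, run * block_size
```

        For example, with block_map = [100, 101, 102, 200, 201] and 4K blocks, offset 0 maps to physical block 100 with three contiguous blocks available, while offset 3584 is rejected because it is not on a block boundary. The same lookup is used whether the caller is reading or writing, which is why both paths can hit the error.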

        As for error recovery, I was trying to think of a way to get rid of this panic, but it would not be easy. First off, it's impossible to force an unmount from within a system call in the kernel (at least without asking for trouble at a later point) so the unmount would have to be pushed out to userland and I don't really want to write a new daemon (and new code in the kext to handle communication with the daemon) just for a hardware error. Secondly, with a writable filesystem there would be a race condition between notification of the error and the actual unmount where a sync could happen in between and cause corrupted data or fs meta-data to be written to the disk and that could make things a lot worse.

        There is no easy solution/recovery for this problem. I'll do some more investigation, but I'm not promising anything.

        Thanks.

         
