From: Bruno W. I. <br...@wo...> - 2004-12-20 04:53:19
On Mon, Dec 20, 2004 at 13:11:38 +1000,
  Michael Mansour <mi...@np...> wrote:
> Hi Bruno,
>
> > On Mon, Dec 20, 2004 at 10:38:51 +1000,
> >   Michael Mansour <mi...@np...> wrote:
> > >
> > > My guess is that the drive is having a problem with one sector. If I can mark
> > > that sector as bad, the problem will be resolved. If this is the case, how can
> > > I mark it as bad?
> >
> > One option is to shutdown and boot with a live CD (such as knoppix) and
> > copy the block from the good drive to the bad drive.
>
> Hmmm.. but how would I know exactly which block it was?

It will probably show up on a self test. (If it doesn't show up on a short
self test, try a long one.) The LBA reported for the failed test gives you
the block number in 512 byte blocks. You will probably want to use 4096 byte
blocks and divide the number by 8, because under Linux trying to write a
512 byte block seems to result in a read of the 4096 byte block that
includes the bad sector, which won't work very well. Be sure to use the
whole disk device (e.g. /dev/hda), not a partition device (e.g. /dev/hda1),
unless you want to do some more math.

> When you mentioned that I thought maybe then I should just run an fsck on the
> drive and that would likely mark the block bad.

The drive will reallocate the block once it can safely discard the old one.
It can do this if it either gets a good read of the block or if the block
gets overwritten.

>
> > If you don't want to take the system down, but can afford a lot of
> > disk IO for a while, you can fail the bad drive and then add it back
> > to the mirror. This will rebuild the entire drive (or at least the
> > partitions you failed out of their mirror sets).
>
> Yeah, this is only a test cluster environment so it's no issue to bring the
> server down or hammer it with disk IO.
>
> I'm not too familiar with failing a drive and adding it back, do you happen
> have the steps involved?

I believe you want to do something like the following:

  mdadm /dev/md0 -f /dev/hda1
  mdadm /dev/md0 -r /dev/hda1
  mdadm /dev/md0 -a /dev/hda1

This assumes that the bad block is on the hda1 partition and that that
partition is part of the md0 raid set. While the partition is out of the
mirror (after the -r step and before the -a step), you might want to pound
on it with badblocks for a while to see if there are any other bad blocks
on the device (in that partition).

>
> I'm using RHEL 3.0.3. I've setup md0 to md5 as mirrored partitions on /dev/hda
> and /dev/hdb.
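
For the live CD option, the block arithmetic (a self test LBA in 512 byte
units, divided by 8 to get the 4096 byte block) might look roughly like the
sketch below. The LBA value is made up, and the device names are assumptions
based on the setup described in this thread. It also assumes both drives are
partitioned identically, so the same offset on each whole disk device holds
the same mirrored data, and that the block is not in an md superblock. The
script only prints the dd command so it can be checked before anything is
written.

```shell
#!/bin/sh
# Hypothetical values: replace LBA with the LBA_of_first_error shown by
# `smartctl -l selftest /dev/hda`. GOOD and BAD are assumed device names.
LBA=12345678        # 512 byte sector number (made-up example value)
GOOD=/dev/hdb       # assumed healthy half of the mirror
BAD=/dev/hda        # assumed drive with the unreadable sector

# Integer division gives the 4096 byte block that contains the sector.
BLOCK=$((LBA / 8))

# Print the command rather than running it, so it can be reviewed first.
echo "dd if=$GOOD of=$BAD bs=4096 skip=$BLOCK seek=$BLOCK count=1"
```

If the printed command looks right, it would be run from the live CD with
the raid sets not active. Using skip and seek with the same value reads the
block from the good drive and writes it at the same offset on the bad one.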