
Does it ever make sense to Force rewrite?

2004-09-15
2012-11-28
  • Vitaly Oratovsky

    Greetings,

    I was investigating the function e2fsck_handle_read_error() in ehandler.c, which is normally invoked when a read returns less data than expected (or fails outright).  If the offending "short read" returned at least some data, the function tries to read in the rest.  This is quite sensible.
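To make the "read in the rest" idea concrete, here is a minimal sketch in C. It is not the e2fsck code: `read_fn`, `finish_short_read`, `struct mem_dev` and `mem_read` are hypothetical names invented for illustration, and the in-memory "device" only simulates a source that hands back a few bytes per call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical reader callback: returns bytes read, 0 for no progress,
 * or -1 on error. Not part of e2fsck or libext2fs. */
typedef ssize_t (*read_fn)(void *dev, void *buf, size_t len, size_t offset);

/*
 * The first read delivered `got` of the `want` bytes requested. Keep asking
 * for the remainder until the buffer is full or the reader stops making
 * progress. Returns the total number of valid bytes now in buf.
 */
size_t finish_short_read(read_fn rd, void *dev, unsigned char *buf,
                         size_t want, size_t got, size_t offset)
{
    assert(rd != NULL && buf != NULL && got <= want);
    while (got < want) {
        ssize_t n = rd(dev, buf + got, want - got, offset + got);
        if (n <= 0)
            break;              /* hard error or no progress: give up */
        got += (size_t)n;
    }
    return got;
}

/* Simulated device (assumption): memory-backed, at most max_chunk bytes per call. */
struct mem_dev {
    const unsigned char *data;
    size_t size;
    size_t max_chunk;
};

ssize_t mem_read(void *dev, void *buf, size_t len, size_t offset)
{
    struct mem_dev *d = dev;
    size_t n;

    if (offset >= d->size)
        return 0;               /* nothing left to read */
    n = d->size - offset;
    if (n > len)
        n = len;
    if (n > d->max_chunk)
        n = d->max_chunk;
    memcpy(buf, d->data + offset, n);
    return (ssize_t)n;
}
```

If the returned total is still smaller than `want`, the caller is back in the "no more data" situation discussed next.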

    However, if the offending "short read" returned no data at all (e.g. on a hard error), then this is what the function does:

        if (ask(ctx, _("Ignore error"), 1)) {
            if (ask(ctx, _("Force rewrite"), 1))
                io_channel_write_blk(channel, block, 1, data);
            return 0;
        }

    So basically it gives the operator the option to ignore the read error and then, if the operator consents, follows up by WRITING over the offending block.

    Could someone please explain to me why it makes sense to WRITE a block which you failed to read?

    BTW, it has been my observation that most people invoke fsck with "-y", thereby unwittingly giving their consent to overwrite those blocks.

    Furthermore, if I understand the code correctly, once e2fsck_handle_read_error() has rewritten the offending block and returned 0, the caller doesn't do anything sensible either, such as marking the block as bad.

     
    • Theodore Ts'o

      Theodore Ts'o - 2005-01-19

      It makes sense to do the force rewrite because it may cause the hard drive to remap the block to one of its spare blocks that are reserved specifically for this purpose.  This is a much easier way to deal with a bad block in the inode table than trying to relocate the entire inode table (which must be contiguous).

       
      • Vitaly Oratovsky

        Forcing the drive to remap the block by writing is something that didn't occur to me, and I suppose it makes a certain amount of sense.  However, I'm uncomfortable with this:  what if the disk is not directly attached to the host?  For example, what if it is on a SAN or iSCSI?  The read error could have been caused by a transient Fibre Channel or Ethernet switch problem, so a subsequent force-rewrite could easily end up destroying good blocks.
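One way to picture the transient-versus-hard distinction raised here is to retry the read a few times before a rewrite is even considered. The sketch below is a hypothetical policy, not what e2fsck does: `blk_read_fn`, `classify_read_error`, `struct flaky_dev` and `flaky_read` are made-up names, and the "device" is a stub that fails a fixed number of times.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block-read callback: 0 on success, nonzero on error. */
typedef int (*blk_read_fn)(void *dev, unsigned long block, void *buf);

enum read_verdict {
    READ_OK_FIRST_TRY,  /* no error at all */
    READ_TRANSIENT,     /* failed, then succeeded on a retry: do not rewrite */
    READ_PERSISTENT     /* failed on every attempt: candidate for a rewrite */
};

/* Read once, then retry up to max_retries times before calling the error persistent. */
enum read_verdict classify_read_error(blk_read_fn rd, void *dev,
                                      unsigned long block, void *buf,
                                      int max_retries)
{
    int attempt;

    assert(rd != NULL && max_retries >= 0);
    if (rd(dev, block, buf) == 0)
        return READ_OK_FIRST_TRY;
    for (attempt = 0; attempt < max_retries; attempt++) {
        if (rd(dev, block, buf) == 0)
            return READ_TRANSIENT;
    }
    return READ_PERSISTENT;
}

/* Simulated device (assumption): the first fail_count reads fail, later ones succeed. */
struct flaky_dev {
    int fail_count;
    int calls;
};

int flaky_read(void *dev, unsigned long block, void *buf)
{
    struct flaky_dev *d = dev;

    (void)block;
    (void)buf;
    d->calls++;
    return d->calls <= d->fail_count ? -1 : 0;
}
```

Immediate retries only filter out very short glitches. A switch outage that outlasts all retries would still be classified as persistent, so a check like this narrows the concern rather than removing it.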

         
