Early boot (initramfs) read-only mount of separate /usr filesystem

Help
Bobby Kent
2013-01-20
2013-01-30
  • Bobby Kent
    2013-01-20

    Hi, I have a relatively old system that, when built, was set up with /usr and /var each on their own (ext3) filesystems. My root filesystem is very small and, being a Gentoo user, my /usr is fairly large (collapsing them into the root filesystem is not an option for me in the short term).

    All was well until some fairly recent changes required me to set up an initramfs that mounts both of these filesystems early in the boot process. The initramfs mounts them read-only, and they are later remounted read-write. By the time fsck gets to look at them, they are already mounted, though still read-only.

    During boot, when fsck (1.42) does run, it complains with:

    /dev/mapper/vg01-var is mounted. e2fsck: Cannot continue, aborting.

    After an upgrade to 1.42.6 similar messages were displayed.

    I looked through the code, found where this happens in e2fsck/unix.c (e2fsprogs 1.42.6), and patched it with:

    --- e2fsck/unix.c       2012-09-22 20:57:14.000000000 -0400
    +++ e2fsck/unix.c       2013-01-20 11:43:26.000000000 -0500
    @@ -237,20 +237,19 @@
            }
    
            /*
    -        * If the filesystem isn't mounted, or it's the root
    -        * filesystem and it's mounted read-only, and we're not doing
    -        * a read/write check, then everything's fine.
    +        * If the filesystem isn't mounted, or it's mounted read-only, and we're not 
    +        * doing a read/write check, then everything's fine.
             */
            if ((!(ctx->mount_flags & (EXT2_MF_MOUNTED | EXT2_MF_BUSY))) ||
    -           ((ctx->mount_flags & EXT2_MF_ISROOT) &&
    -            (ctx->mount_flags & EXT2_MF_READONLY) &&
    -            !(ctx->options & E2F_OPT_WRITECHECK)))
    -               return;
    -
    -       if (((ctx->options & E2F_OPT_READONLY) ||
    -            ((ctx->options & E2F_OPT_FORCE) &&
    -             (ctx->mount_flags & EXT2_MF_READONLY))) &&
    -           !(ctx->options & E2F_OPT_WRITECHECK)) {
    +           (((ctx->options & E2F_OPT_READONLY) ||
    +             (ctx->mount_flags & EXT2_MF_READONLY)) &&
    +            !(ctx->options & E2F_OPT_WRITECHECK))) {
    +               /*
    +                * If the filesystem isn't root and it is mounted, let's display a
    +                * warning
    +                */
    +               if (!(ctx->mount_flags & EXT2_MF_ISROOT) &&
    +                   (ctx->mount_flags & (EXT2_MF_MOUNTED | EXT2_MF_BUSY)))
                    log_out(ctx, _("Warning!  %s is %s.\n"),
                            ctx->filesystem_name,
                            ctx->mount_flags & EXT2_MF_MOUNTED ?
    @@ -1256,8 +1255,7 @@
                    flags |= EXT2_FLAG_64BITS;
            if ((ctx->options & E2F_OPT_READONLY) == 0) {
                    flags |= EXT2_FLAG_RW;
    -               if (!(ctx->mount_flags & EXT2_MF_ISROOT &&
    -                     ctx->mount_flags & EXT2_MF_READONLY))
    +               if (!(ctx->mount_flags & EXT2_MF_READONLY))
                            flags |= EXT2_FLAG_EXCLUSIVE;
                    if ((ctx->mount_flags & EXT2_MF_READONLY) &&
                        (ctx->options & E2F_OPT_FORCE))
    

    It seemed to me that the test on E2F_OPT_FORCE for non-root filesystems was unnecessary: if the filesystem is mounted read-only, why require a force (which will fsck a filesystem regardless of read-only)? Similarly, the test in main that sets EXCLUSIVE did not appear to need the check for ROOT, and without this change a different error was reported ("Filesystem mounted or opened exclusively by another program?").

    Anyway, thought I'd share as the above worked for me, with apparently sane behavior when filesystems are mounted read-write (can't without a force) and read-only (works with a warning, unless it's the root filesystem).
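
To make the behavioural difference concrete, here is a minimal C model of the two mount checks as they appear in the diff above. It is a sketch, not e2fsprogs code: the flag names are shortened, their numeric values are arbitrary placeholders rather than the ones in the e2fsprogs headers, and the `verdict` enum and both function names are invented for illustration. `REFUSE` stands for falling through to the "is mounted" path that produced the abort quoted earlier; whatever happens on that path outside the diff hunk is not modelled.

```c
/* Placeholder bit values for illustration only; not copied from e2fsprogs. */
enum { MF_MOUNTED = 1, MF_ISROOT = 2, MF_READONLY = 4, MF_BUSY = 8 };
enum { OPT_READONLY = 1, OPT_FORCE = 2, OPT_WRITECHECK = 4 };

/* Hypothetical outcome type: what the mount check decides. */
enum verdict { PROCEED, PROCEED_WITH_WARNING, REFUSE };

/* Model of the unpatched logic (the '-' lines of the first hunk). */
enum verdict upstream_check(unsigned mf, unsigned opt)
{
    int in_use = (mf & (MF_MOUNTED | MF_BUSY)) != 0;
    int write_check = (opt & OPT_WRITECHECK) != 0;

    /* Not mounted, or root mounted read-only: fine, no message. */
    if (!in_use ||
        ((mf & MF_ISROOT) && (mf & MF_READONLY) && !write_check))
        return PROCEED;

    /* Read-only check, or forced check of a read-only mount: warn only. */
    if (((opt & OPT_READONLY) ||
         ((opt & OPT_FORCE) && (mf & MF_READONLY))) && !write_check)
        return PROCEED_WITH_WARNING;

    return REFUSE;
}

/* Model of the patched logic (the '+' lines of the first hunk). */
enum verdict patched_check(unsigned mf, unsigned opt)
{
    int in_use = (mf & (MF_MOUNTED | MF_BUSY)) != 0;
    int write_check = (opt & OPT_WRITECHECK) != 0;

    if (!in_use ||
        (((opt & OPT_READONLY) || (mf & MF_READONLY)) && !write_check)) {
        /* Warn only for a non-root filesystem that is in use. */
        if (!(mf & MF_ISROOT) && in_use)
            return PROCEED_WITH_WARNING;
        return PROCEED;
    }
    return REFUSE;
}
```

With this model, a /var mounted read-only and no extra options (`MF_MOUNTED | MF_READONLY`, `0`) is refused by `upstream_check` but passes with a warning in `patched_check`, while a read-write mount is refused by both.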

    Not being familiar with the code, I may have accidentally introduced some unintended behavior and wanted to be sure that nothing in the above could lead to situations where damage might result.

     
    Last edit: Bobby Kent 2013-01-20
  • Theodore Ts'o
    2013-01-29

    It is very dangerous to check mounted file systems. In particular, if the file system is corrupted and e2fsck repairs it, you can't just remount the file system read/write; there may be incorrect information cached by the kernel. This is why, if any problems are fixed while the root file system is mounted read-only, you must reboot afterwards. The init scripts for most distributions handle this correctly by now, but for a while there were screwups caused by buggy init scripts.

    As a result, I'm not particularly enthusiastic about trying to encourage more use of fsck'ing mounted file systems. If you have some reason why you need to do this because you can't afford to reinstall some running system, and would prefer to have a rickety system set up using mucilage and baling wire, that's of course your prerogative. But it's not really something I want to encourage others to follow as an example.

    In fact, what I would prefer is that people check all file systems before they are mounted, including the root file system, by doing the file system check in the initramfs script. Is that something that you could perhaps do in your configuration?

     
  • Bobby Kent
    2013-01-29

    Thanks for the warnings about the potential side effects, and for suggesting a better approach to the initramfs setup.

     
  • Bobby Kent
    2013-01-30

    Gentoo's response to the proposed solution:

    fsck presently hardcodes that it is possible to check / while it is mounted read-only, with the caveat that if repairs are performed, you MUST reboot.

    He cites prior bad implementations that didn't reboot and got really corrupted afterwards, and recommends that fsck be in the initramfs instead.

    He doesn't mention the possibility that fsck itself may be corrupted - if that's the case, there is a probability that the initramfs copy is corrupted (it does have some integrity by being compressed, but that's irrelevant).

    My major problem is that the initramfs gets extremely large fast. Even with building a dynamically linked initramfs, adding e2fsck is a minimum of 500K added to the initramfs:
    /lib64/libext2fs.so.2.4 267K
    /lib64/libblkid.so.1.1.0 201K
    /lib64/libe2p.so.2.3 32K

    That's JUST e2fs libraries, not the binaries as well.

    e2fsck.static is often ~1MB.

    reiserfs and JFS will cost you at least ~300KB each. btrfs is ~150KB.
    XFS I won't include, as it explicitly puts xfs_check OUTSIDE of the realm of fsck.

    If we want to build a good generic initramfs now (per recent discussions in gentoo-dev ML about shipping a generic kernel + initramfs for releng and quick installs), we'd have to include all of those, so we're going to have at least 1.2MB in there.

    The logic I'd like to implement in genkernel's initramfs is as follows:
    1. Mount needed filesystem(s) on /newroot, readonly
    1.1. This may be /newroot and /newroot/usr depending on libraries.
    2. Call the fsck binaries FROM /newroot, on /newroot and /newroot/usr
    2.1. Probably from chroot
    2.2. Alternatively, PATH=/newroot/sbin:/newroot/bin /newroot/sbin/fsck
    2.2.1. Might need to muck with LD_LIBRARY_PATH as well.
    3. If there are ANY problems that get fixed:
    3.1. Reboot
    3.2. umount -f /newroot/usr /newroot && mount them again!

    The other case where this is needed is where there IS no initramfs.
    You'll recall that there is an extremely vocal Gentoo minority that is refusing the push to initramfs.

    In which case, as above, with the following changes:
    0. As above, just s!/newroot!/!g.
    2.1. chroot / is idempotent.
    2.2. Invalid in this case.
    3.1. REQUIRED.
    3.2. Impossible in this case.
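
Step 3 hinges on telling "problems were fixed" apart from "clean" and "still broken". fsck reports this through its exit status, documented in fsck(8) as a bit mask: 1 means errors were corrected, 2 means the system should be rebooted, 4 means errors were left uncorrected, and 8 and above indicate operational or usage failures. The following C sketch maps such a status onto the steps of the plan above. The constant names, the `action` enum, the `next_step` function, and the choice of remounting (3.2) rather than rebooting (3.1) when an initramfs is present are all assumptions made for illustration; none of this is genkernel code.

```c
/* Exit-status bits as documented in fsck(8); the names here are invented. */
enum {
    FSCK_FIXED       = 1,  /* filesystem errors corrected */
    FSCK_REBOOT      = 2,  /* system should be rebooted */
    FSCK_UNCORRECTED = 4   /* filesystem errors left uncorrected */
};

/* Hypothetical outcomes corresponding to the numbered steps in the plan. */
enum action {
    CONTINUE_BOOT,  /* nothing was changed, carry on */
    REMOUNT,        /* step 3.2: umount and mount again (initramfs only) */
    REBOOT,         /* step 3.1: reboot */
    RESCUE          /* errors remain or fsck itself failed: stop booting */
};

/* Decide the next step from an fsck exit status. */
enum action next_step(int status, int have_initramfs)
{
    /* Uncorrected errors, or any higher bit (operational error, usage
     * error, cancelled, library error): do not continue booting. */
    if (status & ~(FSCK_FIXED | FSCK_REBOOT))
        return RESCUE;

    /* Something was repaired: without an initramfs only a reboot is
     * possible; with one, the plan also allows remounting (3.2). */
    if (status & (FSCK_FIXED | FSCK_REBOOT))
        return have_initramfs ? REMOUNT : REBOOT;

    return CONTINUE_BOOT;
}
```

Given the warning earlier in the thread about stale data cached by the kernel, a more conservative variant would return `REBOOT` whenever the reboot bit (2) is set, regardless of whether an initramfs is present.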

     
  • Theodore Ts'o
    2013-01-30

    Why do you need /usr at the time when you run e2fsck, anyway? E2fsck shouldn't need /usr for anything, and you don't need udev in order to check the root file system.