Recovering Ext2 filesystem after 'mke2fs'

Help
2007-01-16
2012-11-28
  • Jörg W Mittag
    2007-01-16

    Hi,

    I stupidly managed to run 'mke2fs' on the wrong partition and
    wanted to know if and how much of my data I can possibly restore
    without having to reach for my backups.  (The filesystem in
    question is almost 460 GiByte in size with less than 200 MiByte
    of free space and the backups are on literally hundreds of CDs,
    so I'd want to avoid playing diskjockey as much as possible.)

    Here's the short form of the question: how do I recover an Ext2/3
    filesystem after having overwritten it with a fresh, empty Ext3
    filesystem?  Basically, how do I "un-mke2fs" given that no data
    was actually overwritten (the filesystem was never mounted r/w),
    only metadata (superblock, root directory and whatever else
    'mke2fs' initializes) was destroyed?

    Now the long story ...

    I originally created a filesystem on that partition with Linux
    'mke2fs'.  I cannot remember whether that was an Ext2 filesystem
    or an Ext3 filesystem and, if the latter, whether I removed the
    journal later on or kept it.  I am pretty sure it was a plain old
    simple Ext2 filesystem; however, I do know it had HTree directory
    indexing, so I am not sure if that's correct.

    Initially, I copied quite a lot of data onto the filesystem in
    Linux but afterwards the filesystem was mostly used in Windows
    with the Ext2IFS driver from <http://FS-Driver.org/>, which,
    unfortunately, doesn't support HTree very well and sometimes
    corrupts the indices.

    Yesterday I accidentally ran 'mke2fs -j' followed by 'tune2fs' on
    that partition, thus destroying the filesystem or at least the
    top-level metadata (superblocks, root directory).  I realized my
    mistake immediately after running 'tune2fs', so, after the
    incident, the filesystem was never written to and it was never
    mounted r/w; it might, however, have been mounted r/o once.

    In other words: all of the data (well, *almost* all of it: up to
    128 MiByte of 460 GiByte might have been overwritten by the
    journal) and almost all of the metadata (all directories except
    the root directory and the superblocks) are *still there*, but
    "overlaid" with and "hidden" under another Ext3 filesystem.

    In my naive understanding of the Ext3 filesystem structures, the
    hierarchical structure looks something like this: the root of the
    filesystem is/are the superblock(s).  Those contain a link to the
    root directory.  The root directory then contains two links to
    itself ('.' and '..') and links to all the directories directly
    underneath the root directory (including 'lost+found').  Each
    directory then in turn contains links to itself ('.'), its
    parent ('..') and to all its immediate subdirectories and so
    forth.
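
    As a minimal sketch of what those "links" look like on disk: each
    directory block is a chain of variable-length entries.  The layout
    below assumes the 'filetype' feature (inode as a 32-bit
    little-endian integer, a 16-bit record length, one byte of name
    length, one byte of file type, then the name).  The helper
    'make_entry' and the demo block are made up for illustration only.

```python
import struct

def parse_dir_block(block: bytes):
    """Yield (inode, file_type, name) for each live entry in one directory block.

    Assumes the 'filetype' entry layout: inode (le32), rec_len (le16),
    name_len (u8), file_type (u8), followed by name_len bytes of name.
    """
    offset = 0
    while offset + 8 <= len(block):
        inode, rec_len, name_len, file_type = struct.unpack_from("<IHBB", block, offset)
        if rec_len < 8 or offset + rec_len > len(block):
            break  # corrupt, or not a directory block at all
        if inode != 0:  # inode 0 marks an unused entry
            name = block[offset + 8 : offset + 8 + name_len]
            yield inode, file_type, name.decode("utf-8", errors="replace")
        offset += rec_len


def make_entry(inode, name, file_type, rec_len=None):
    """Build one raw entry for the demo below (hypothetical helper)."""
    raw = name.encode()
    size = (8 + len(raw) + 3) & ~3  # entries are padded to a multiple of 4 bytes
    rec_len = rec_len or size
    header = struct.pack("<IHBB", inode, rec_len, len(raw), file_type)
    return header + raw.ljust(rec_len - 8, b"\0")


# Synthetic 1024-byte root directory: '.', '..' and 'lost+found'.
# File type 2 means "directory"; the root directory is inode 2.
demo = make_entry(2, ".", 2) + make_entry(2, "..", 2)
demo += make_entry(11, "lost+found", 2, rec_len=1024 - len(demo))
entries = list(parse_dir_block(demo))
for entry in entries:
    print(entry)
```

    Note that an entry only stores an inode *number* and a name; where
    the directory's or file's blocks live is recorded in the inode
    itself, not here.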

    In my case, the second link(s) in that chain are broken: there is
    a new superblock which contains a perfectly fine link to the new
    root directory, but this new root directory doesn't contain the
    necessary links to the old subdirectories.  However, the whole
    rest of the filesystem tree is still there, it's just no longer
    connected to the root.  In other words, I "just" need to
    "reconnect" those immediate subdirectories back to the root
    directory and I'm done.  (Well, obviously all the "housekeeping"
    stuff, free/used block/inode lists and so on, will be dead wrong,
    but as far as I understand it, 'e2fsck' can take care of those.)

    Well, at least, that's what my naive view of filesystems looks
    like (-;

    These are the necessary steps, at least how I see them with my
    limited understanding of the Ext2/3 filesystems:

      0. Create a backup image of the partition to work on.

      1. Find out exactly which blocks have been overwritten by
           'mke2fs' and 'tune2fs', especially find out where the
           journal lives.

      2. Using the information from step 1., find out which files the
           overwritten blocks belonged to.
           [Maybe defer this step until after 4.?]

      3. "Reconnect" the subdirectories to the root directory.

      4. Fix up the filesystem housekeeping metadata.

      5. Restore the files identified in step 2. from secondary media.
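
    For step 1, part of the answer can be computed rather than
    searched for.  The following is a rough sketch, assuming a 4 KiB
    block size, the default of 8 * block_size blocks per group, the
    'sparse_super' layout (superblock copies in groups 0, 1 and powers
    of 3, 5 and 7) and a size of exactly 460 GiByte; all of these are
    assumptions and the real values have to be read from the new
    superblock.  It only lists where superblock copies were written;
    bitmaps and inode tables are located via the group descriptors and
    are not covered here.

```python
def is_power_of(n: int, base: int) -> bool:
    """True if n == base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def groups_with_superblock(n_groups: int):
    """Block groups that hold a superblock copy under 'sparse_super'."""
    return [g for g in range(n_groups)
            if g <= 1 or any(is_power_of(g, b) for b in (3, 5, 7))]


BLOCK_SIZE = 4096                    # assumed; check the actual superblock
BLOCKS_PER_GROUP = 8 * BLOCK_SIZE    # one bitmap block covers 8 * block_size blocks
total_blocks = 460 * 2**30 // BLOCK_SIZE          # assumed partition size
n_groups = -(-total_blocks // BLOCKS_PER_GROUP)   # ceiling division

# With 4 KiB blocks, group g starts at block g * BLOCKS_PER_GROUP.
backup_blocks = [g * BLOCKS_PER_GROUP for g in groups_with_superblock(n_groups)]
print(n_groups, backup_blocks)
```

    If the old filesystem was created with the same parameters, its
    backup superblocks sat at the same positions and were most likely
    overwritten as well.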

    Problem is: I don't have the slightest idea how to perform
    steps 1-4!  I think I might need 'debugfs' for steps 1 and 2,
    'e2fsck' for step 4 and one or the other for step 3, but I don't
    know how to use 'debugfs'.

    Or can I just run 'e2fsck' on the image and it will know not to
    trust the seemingly fine filesystem on top but actually dig down
    into the old overwritten filesystem below the new one?  Could it
    really be *that* simple?

    I am a bit reluctant to simply use a Data Carving Tool like
    'foremost', because those tools totally ignore the filesystem
    structure and try to scrape data directly from the disk blocks. 
    Given that Data Carving is a forensic technology where you have
    to assume that the filesystem metadata has been tampered with and
    cannot be trusted, that's of course the right thing to do, but in
    my case, there *is* a lot of trustworthy filesystem metadata left
    and I *want* whatever tool I end up using to use that metadata. 
    Data Carving Tools can only restore file contents but not paths
    or filenames and, quite frankly, it's not exactly fun renaming
    and reorganizing 20000 files by hand (-;

    I still have a lot of knowledge about the files and directories
    on the overwritten filesystem; I had indexed many of them in a
    search engine and I was able to extract parts of the directory
    structure from the index: I have a list of more than 1500
    directories and more than 15000 files of which I know the full
    pathname.  I also still know the names of most of the directories
    in the root directory and I remember some of the names of the
    second-level directories.  That should help me find the
    directories on disk; actually, I already found some of those and
    poked a bit around them with 'od', but I couldn't make much sense
    of it.
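
    Instead of reading 'od' output by eye, candidate directory blocks
    can be picked out mechanically.  This is only a heuristic sketch:
    it assumes that the first block of every directory starts with a
    12-byte '.' entry followed by a '..' entry, and that the block size
    is known.  The function names and the tiny in-memory "image" are
    made up for the example; on the real image one would pass an open
    file and the real block size.

```python
import io
import struct

def looks_like_dir_block(block: bytes) -> bool:
    """Heuristic: the block starts with a '.' entry followed by a '..' entry."""
    if len(block) < 24:
        return False
    ino, rec_len, name_len, _ = struct.unpack_from("<IHBB", block, 0)
    if ino == 0 or rec_len != 12 or name_len != 1 or block[8:9] != b".":
        return False
    ino2, _, name_len2, _ = struct.unpack_from("<IHBB", block, 12)
    return ino2 != 0 and name_len2 == 2 and block[20:22] == b".."


def find_dir_blocks(image, block_size=4096):
    """Yield (block_number, own_inode, parent_inode) for candidate blocks."""
    block_no = 0
    while True:
        block = image.read(block_size)
        if len(block) < block_size:
            break
        if looks_like_dir_block(block):
            own, = struct.unpack_from("<I", block, 0)      # inode of '.'
            parent, = struct.unpack_from("<I", block, 12)  # inode of '..'
            yield block_no, own, parent
        block_no += 1


# Tiny synthetic image with 1 KiB blocks: empty, one directory block, empty.
dir_block = (struct.pack("<IHBB", 12, 12, 1, 2) + b".\0\0\0"
             + struct.pack("<IHBB", 2, 1012, 2, 2) + b"..\0\0").ljust(1024, b"\0")
image = io.BytesIO(b"\0" * 1024 + dir_block + b"\0" * 1024)
hits = list(find_dir_blocks(image, block_size=1024))
print(hits)
```

    Hits whose parent inode is 2 belonged to directories directly
    underneath the old root directory, since the root directory is
    always inode 2.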

    I would appreciate any help, even if it is just a pointer to some
    manpage or tutorial.

    Thank you very much in advance,
        jwm.

     
    • Jörg W Mittag
      2007-01-16

      Hi again.

      I just realized I had completely forgotten about one of the key
      concepts of Ext2/3: inodes.  So, while I might be able to
      reconstruct the filesystem tree from the directory blocks, I need
      the inode tables (which have been overwritten by 'mke2fs') to
      actually get the file *contents*.  In other words: I'm screwed (-;
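
      To make the dependency concrete: an inode number found in a
      directory entry is nothing more than an index into the inode
      tables.  A small sketch of the arithmetic, assuming the
      traditional 128-byte inode size; the value 16384 for inodes per
      group is just an example, the real one is stored in the
      superblock, and the start of each group's inode table comes from
      its group descriptor.

```python
def locate_inode(ino: int, inodes_per_group: int, inode_size: int = 128):
    """Return (block_group, byte_offset_within_that_group's_inode_table).

    Inode numbers start at 1, hence the 'ino - 1'.
    """
    group = (ino - 1) // inodes_per_group
    index = (ino - 1) % inodes_per_group
    return group, index * inode_size


# Example values only: 16384 inodes per group is an assumption.
print(locate_inode(2, 16384))      # the root directory inode
print(locate_inode(16385, 16384))  # first inode of the second group
```

      If that slot in the inode table has been overwritten, the block
      pointers of the file are gone, even though its name and its
      data blocks may still be on disk.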

      Cheers,
          jwm.