Re: [Jfs-discussion] Ideas on how to recover a trashed JFS volume.
From: <lu...@ik...> - 2006-10-11 08:57:50
> On Sun, 2006-10-08 at 19:23 +0200, lu...@ik... wrote:
>> Hi!
>>
>> A while ago I accidentally mke2fs:ed my root volume. Long story short - I
>> blame the fact that the device numbering differed between my Gentoo
>> installation and the Ubuntu live CD. I had backups of the most important
>> data, but not of the newest digital photos and some other stuff.
>>
>> I have not found any good recovery tools aimed at JFS. I did run
>> magicrescue and PhotoRec. Both look for potential file starts and ends
>> and extract the data in between. This works OK on some filesystems and on
>> volumes with low fragmentation. I did extract a few hundred thousand
>> possible images, but the ones I was interested in had random corruption,
>> or parts of the images were shuffled around. I guess this depends on
>> fragmentation, so that the files were not allocated in one large
>> continuous extent.
>>
>> After browsing the archives of this mailing list I got the idea that it
>> might be possible to scan for inodes with a simple sanity check. I found
>> this thread quite interesting:
>> http://sourceforge.net/mailarchive/forum.php?thread_id=8137509&forum_id=43911
>>
>> I am currently reading up on JFS (trying to understand the
>> 'layout' paper) and studying the source code. It seems like there are a
>> lot of usable functions in jfs_debugfs.
>>
>> My idea is a program that does a read-only extraction of files from a
>> trashed JFS filesystem (on disk or image). As there may be random errors
>> on the volume, one has to take into account that the metadata read from
>> the device may be wrong. My approach is to scan the volume for sane
>> inodes, and then try to write their extents to a file on another
>> (mounted) device. If the filename is extractable then use it, else use a
>> file serial number (like dinode.di_number).
>>
>> It sounds quite simple, but I guess it's not. Is it a feasible approach?
>> I will gladly spend some time to (try to) implement this. Any
>> advice/ideas/help is greatly appreciated!
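[Editor's note: the scan-for-sane-inodes approach described above could be sketched roughly as below. This is an illustration, not the poster's code; `looks_plausible()` is a hypothetical placeholder for whatever dinode sanity check is chosen, and the routine scans an in-memory image rather than a device for simplicity. The 512-byte stride matches the on-disk JFS inode size.]

```c
#include <stddef.h>
#include <stdint.h>

#define INODE_SIZE 512  /* on-disk JFS inode size */

/* Hypothetical placeholder for a real dinode sanity check.  Here it
   merely rejects all-zero blocks; a real check would inspect dinode
   fields as discussed in the thread. */
static int looks_plausible(const uint8_t *blk)
{
    for (size_t i = 0; i < INODE_SIZE; i++)
        if (blk[i] != 0)
            return 1;
    return 0;
}

/* Step through an image in 512-byte strides and count candidate
   inodes.  A real tool would read from the device and hand each
   candidate to an extent extractor instead of just counting. */
size_t scan_for_inodes(const uint8_t *image, size_t len)
{
    size_t hits = 0;
    for (size_t off = 0; off + INODE_SIZE <= len; off += INODE_SIZE)
        if (looks_plausible(image + off))
            hits++;
    return hits;
}
```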
> Finding the file data may be about that easy. Finding the file name would
> require parsing the directory inodes. This is doable, but would probably
> double the amount of work you'd need to do.
>
> I think the thread you found should contain the information you need to
> identify a group of inodes. If you have any questions, direct them my way.
>
> Thanks,
> Shaggy
> --
> David Kleikamp
> IBM Linux Technology Center

Hi!

I have now implemented a simple "extent extractor". It uses modified versions of the functions that display the xads in debugfs_jfs. An inode scanner checks every 512-byte block for whether it is a sane inode or not. The check is currently overly simplified; it only tests:

((ino_ptr->di_ixpxd.len==4) && (ino_ptr->di_fileset==16))

It does seem to work quite well. I have extracted a few very fragmented files, and their checksums match the originals.

As I used this box as a PVR backend, I have a few hundred GB in really large files, and the tool is really slow: it scans and extracts a few GB an hour. According to the profiler it spends around 80% of its time in fwrite, so I might blame the slow USB storage I use for the recovered files. The inode scanner might be optimized by recording the ranges taken up by file extents already encountered: when the scanner reaches a range that has been saved as a file extent, it could skip to the end of that range. If one used that technique, the extents would have to be validated first, so that an illegal extent does not make the scanner skip interesting blocks.

The original routines from debugfs_jfs reported that my superblocks were OK, but of course they were not. For instance, the aggregate block size extracted from the superblock was illegal (178323 or something like that); I forced the block size to 4096 bytes. It might be worthwhile to add a more robust check to debugfs_jfs that verifies the block size is valid and that 1<<l2bs == blocksize.
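[Editor's note: the two checks mentioned here - the `di_ixpxd.len`/`di_fileset` candidate test and the proposed superblock block-size test - might look like the sketch below. The structs are simplified stand-ins for the real `struct dinode` and `pxd_t` in the JFS headers, so the field layout and plain integer types are assumptions; the real `pxd_t` packs a 24-bit length and 40-bit address into bit fields.]

```c
#include <stdint.h>

/* Simplified stand-in for pxd_t (real layout is bit-packed). */
struct pxd {
    uint32_t len;   /* extent length in aggregate blocks */
    uint64_t addr;  /* extent start address */
};

/* Only the dinode fields the scanner looks at (layout assumed). */
struct dinode_head {
    uint32_t di_fileset;  /* fileset number; 16 = the regular fileset */
    uint32_t di_number;   /* inode number within the fileset */
    struct pxd di_ixpxd;  /* descriptor of the 4-block inode extent */
};

/* The candidate test quoted in the message above. */
int looks_like_inode(const struct dinode_head *d)
{
    return d->di_ixpxd.len == 4 && d->di_fileset == 16;
}

/* The proposed superblock check: block size must be in the range JFS
   supports and must agree with its log2 field. */
int bsize_is_sane(uint32_t bsize, uint32_t l2bsize)
{
    return bsize >= 512 && bsize <= 4096 &&
           l2bsize < 32 && ((uint32_t)1 << l2bsize) == bsize;
}
```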
The program occasionally encounters invalid XADs, or detects invalid inodes as valid ones. This makes it jump around on invalid XADs, which takes a while and can create very large files. I need a way to detect whether a xad is valid. The only sanity check I can think of is that the offset and length must be less than the aggregate size. Any ideas?

Once I have this inode-scan-and-extent-walk kind of program working, I will also try to parse directory inodes for filenames. I think I might need to do it in two steps. Probably the simplest way would be to scan for directory inodes and create a directory structure on the 'to' filesystem, then scan for file inodes and place them in that structure. I see two main problems: the directory inode scan will need a fairly advanced data structure that can rearrange a directory tree as new data is found, and it is unclear what to do with items whose path is not fully discovered.

Best regards,
Simon
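[Editor's note: the XAD range check asked about above might be sketched as follows. The flat struct is an illustrative assumption - the on-disk `xad_t` packs its offset, address, and length into bit fields - and `agg_blocks` is assumed to be a separately validated aggregate size in blocks.]

```c
#include <stdint.h>

/* Loose, unpacked model of an xad (real xad_t is bit-packed). */
struct xad {
    uint64_t off;   /* logical offset within the file, in blocks */
    uint64_t addr;  /* starting aggregate block of the extent */
    uint32_t len;   /* extent length in aggregate blocks */
};

/* Accept an extent only if it lies entirely inside the aggregate. */
int xad_is_sane(const struct xad *x, uint64_t agg_blocks)
{
    if (x->len == 0)
        return 0;              /* zero-length extents are bogus */
    if (x->off >= agg_blocks)
        return 0;              /* offset past end of aggregate */
    if (x->addr >= agg_blocks || agg_blocks - x->addr < x->len)
        return 0;              /* data would fall outside the volume */
    return 1;
}
```

This rejects the pathological extents described above (ones that would make the extractor seek far past the device and write huge sparse files) at the cost of also rejecting any legitimate extent whose descriptor was corrupted in place.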