Hi
While validating a dump made of a 16TB filesystem, it was observed that the validation failed badly. Long story short: any backups made of 2TB+ filesystems are likely incomplete and will contain corrupted data/files.
Enclosed are a set of patches that at least partially address this issue (they are necessary, but possibly not sufficient) -- I've validated the dump/restore on a test set of data that was failing before, and have a larger dump/restore in progress for further testing/validation. These diffs are against the most recent version of dump (0.4b47), and testing was done on an Ubuntu 20.04.4 (x86_64) system. The filesystem being dumped looked like:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup-Foo 19T 16T 2.2T 88% /u1
Basically the corruption occurs on files whose logical block addresses do not fit in 32 bits. The requested address overflows (wraps at 2^32) and pulls in a block that doesn't belong to the original file. Also note that we need to use ext2fs_block_iterate3() instead of ext2fs_block_iterate2(), as the latter cannot cope with 64-bit block numbers.
I'm trying to build a sparse filesystem to show easy replication, but neither debugfs nor dd seem to want to deal with 64-bit offsets/sizes either.
In the meantime, at an absolute minimum, the existing code should be modified to detect that the filesystem being dumped is larger than 2TB and refuse to run.
Thanks.
Later...
Greg Oster
My math here might be wrong... if logical block addresses are limited to 32 bits, then the corruption wouldn't be seen until an LBA exceeds 4294967295. I.e. for a block size of 4K, that would mean on filesystems larger than 16TB... which would help explain why this hasn't been reported before.
Later...
Greg Oster
8.5TB of data successfully dumped/restored with the submitted patches in use. Without the patches, the dump/restore of this data set produces thousands of validation errors.
Later...
Greg Oster
These changes are necessary, but not sufficient. A multi-tape dump appears to corrupt a file that spans two tapes. The errors seen are:
Incorrect block for <filename> at 11432470600 blocks
Incorrect block for <filename> at 11432470601 blocks
...
Incorrect block for <filename> at 11432470790 blocks
Incorrect block for <filename> at 11432470791 blocks
When the 16TB restore finishes I'll know whether this is the only file that is corrupt. [UPDATE: 16TB restore finished. 'diff' showed that only the one file above (which spanned tapes) was corrupt.]
I suspect that to fix this we'll need to modify compat/include/protocols/dumprestore.h to bump the int32_t c_firstrec field to an int64_t. But such a change will need to be made in a backwards-compatible way so old backups aren't rendered unreadable.
Fixing the above should also allow an easy fix for the outstanding dump progress "% done" issue too.
Later...
Greg Oster
Last edit: Greg Oster 2022-07-01
Thanks for this! I've managed to generate a test case that doesn't require terabytes of data -- only about 3GB of disk space -- to reproduce:
(This assumes you have no loop devices in use -- it will trash any that are!)
And this is the result without this patch:
There's another serious bug related to EXT2_EXTENT_FLAGS_UNINIT which I've got a fix for (and which might be the cause of bug 175).
There's also an issue with the verify of long symlinks (it doesn't affect the restore, only a verify similar to the one being done in the test case above).
(There's also a longstanding bug related to verify and the counting of extended attributes, for which there's a fix in the Debian package that doesn't appear to be in here.)
You're most welcome! Thanks for coming up with a small test case -- I switched from 'dump' to 'restic' for backups at about the same time that I reported the issue, and so haven't had the need to chase this problem further.
Later...
Greg Oster
There was a minor bug in the original patches which I've fixed in the attached patch.
I had to add a bit of extra logging to actually show that the test case was dumping an EA block beyond 2^32:
dumping EA (block) in inode #13 block=4312435892
A somewhat modified test case that runs a bit quicker.