Overview - Ext4 file system lvm2 snapshot is dumped (not mounted) and then restored to new logical volume. Restored file system contains files with binary differences to snapshot and original file system.
Linux distribution: Debian GNU/Linux jessie/sid # uname -a Linux gwar 3.10.22gwar_2013.12.07 #1 SMP Sat Dec 7 12:10:45 EST 2013 x86_64 GNU/Linux dump 0.4b44 (using libext2fs 1.42.9 of 4-Feb-2014) restore 0.4b44 (using libext2fs 1.42.9 of 4-Feb-2014) commands used: + lvcreate --size 40g --snapshot --name home-snap /dev/vg0/home Logical volume "home-snap" created + dump 0f /mnt/home.2014-03-11.dump /dev/vg0/home-snap DUMP: Date of this level 0 dump: Tue Mar 11 13:44:54 2014 DUMP: Dumping /dev/vg0/home-snap (an unlisted file system) to /mnt/home.2014-03-11.dump DUMP: Label: none DUMP: Writing 10 Kilobyte records DUMP: mapping (Pass I) [regular files] DUMP: mapping (Pass II) [directories] DUMP: estimated 715508510 blocks. DUMP: Volume 1 started with block 1 at: Tue Mar 11 13:45:00 2014 DUMP: dumping (Pass III) [directories] DUMP: dumping (Pass IV) [regular files] DUMP: 3.40% done at 81157 kB/s, finished in 2:21 DUMP: 6.98% done at 83188 kB/s, finished in 2:13 DUMP: 10.47% done at 83073 kB/s, finished in 2:08 DUMP: 13.93% done at 82900 kB/s, finished in 2:03 DUMP: 17.09% done at 81362 kB/s, finished in 2:01 DUMP: 20.49% done at 81303 kB/s, finished in 1:56 DUMP: 23.69% done at 80570 kB/s, finished in 1:52 DUMP: 27.02% done at 80432 kB/s, finished in 1:48 DUMP: 30.05% done at 79522 kB/s, finished in 1:44 DUMP: 33.28% done at 79260 kB/s, finished in 1:40 DUMP: 36.35% done at 78722 kB/s, finished in 1:36 DUMP: 39.41% done at 78243 kB/s, finished in 1:32 DUMP: 42.52% done at 77926 kB/s, finished in 1:27 DUMP: 45.66% done at 77710 kB/s, finished in 1:23 DUMP: 48.82% done at 77551 kB/s, finished in 1:18 DUMP: 51.83% done at 77177 kB/s, finished in 1:14 DUMP: 55.00% done at 77090 kB/s, finished in 1:09 DUMP: 57.86% done at 76596 kB/s, finished in 1:05 DUMP: 60.77% done at 76215 kB/s, finished in 1:01 DUMP: 63.74% done at 75942 kB/s, finished in 0:56 DUMP: 66.45% done at 75408 kB/s, finished in 0:53 DUMP: 69.58% done at 75374 kB/s, finished in 0:48 DUMP: 72.52% done at 75146 kB/s, finished in 0:43 DUMP: 75.48% done at 74959 kB/s, finished in 0:39 DUMP: 78.39% done at 74736 kB/s, finished in 0:34 DUMP: 81.29% done at 74521 kB/s, finished in 0:29 DUMP: 84.02% done at 74164 kB/s, finished in 0:25 DUMP: 86.86% done at 73937 kB/s, finished in 0:21 DUMP: 89.71% done at 73725 kB/s, finished in 0:16 DUMP: 92.51% done at 73494 kB/s, finished in 0:12 DUMP: 95.32% done at 73277 kB/s, finished in 0:07 DUMP: 97.54% done at 72645 kB/s, finished in 0:04 DUMP: 99.71% done at 71997 kB/s, finished in 0:00 DUMP: Closing /mnt/home.2014-03-11.dump DUMP: Volume 1 completed at: Tue Mar 11 16:30:43 2014 DUMP: Volume 1 715360690 blocks (698594.42MB) DUMP: Volume 1 took 2:45:43 DUMP: Volume 1 transfer rate: 71946 kB/s DUMP: 715360690 blocks (698594.42MB) on 1 volume(s) DUMP: finished in 9943 seconds, throughput 71946 kBytes/sec DUMP: Date of this level 0 dump: Tue Mar 11 13:44:54 2014 DUMP: Date this dump completed: Tue Mar 11 16:30:43 2014 DUMP: Average transfer rate: 71946 kB/s DUMP: DUMP IS DONE # vgdisplay --- Volume group --- VG Name vg0 System ID Format lvm2 Metadata Areas 1 Metadata Sequence No 27 VG Access read/write VG Status resizable MAX LV 0 Cur LV 5 Open LV 3 Max PV 0 Cur PV 1 Act PV 1 VG Size 5.46 TiB PE Size 128.00 MiB Total PE 44702 Alloc PE / Size 25480 / 3.11 TiB Free PE / Size 19222 / 2.35 TiB VG UUID g72JMA-wTIJ-wcM3-q0GX-MMe4-Zx4Z-uEzgYQ # lvcreate -L 1200g -n hometest vg0 Logical volume "hometest" created # mkfs -t ext4 /dev/vg0/hometest mke2fs 1.42.9 (4-Feb-2014) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) Stride=128 blocks, Stripe width=256 blocks 78643200 inodes, 314572800 blocks 15728640 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=4294967296 9600 block groups 32768 blocks per group, 32768 fragments per group 8192 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 102400000, 214990848 Allocating group tables: done Writing inode tables: done Creating journal (32768 blocks): done Writing superblocks and filesystem accounting information: done root@gwar:/# mount /dev/vg0/hometest /hometest root@gwar:/# cd /hometest root@gwar:/hometest# restore -rf /mnt/home.2014-03-11.dump restore: ./lost+found: File exists root@gwar:/hometest# rm restoresymtable root@gwar:/hometest# cd .. root@gwar:/# mount -o ro /dev/vg0/home-snap /home-snap root@gwar:/# diff -r home-snap hometest Binary files home-snap/dsorkin/libvirt_images/oderus.img and hometest/dsorkin/libvirt_images/oderus.img differ Binary files home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ and hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ differ Binary files home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_002_ and hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_002_ differ Binary files home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_003_ and hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_003_ differ (lines omitted for sym links and special files) root@gwar:/# cmp home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ differ: byte 495617, line 7008 root@gwar:/# sum home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ 35772 4096 home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ 55937 4096 hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ 35772 4096 home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ root@gwar:/# ls -l home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ -rw------- 1 1004 users 4194304 Oct 17 23:46 home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ -rw------- 1 1004 users 4194304 Oct 17 23:46 home-snap/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ -rw------- 1 1004 users 4194304 Oct 17 23:46 hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/_CACHE_001_ root@gwar:/# tune2fs -l /dev/vg0/home-snap tune2fs 1.42.9 (4-Feb-2014) Filesystem volume name: <none> Last mounted on: /home-snap Filesystem UUID: 9b55f6e6-f0f0-4e48-b05c-810cbbaf2aeb Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize Filesystem flags: signed_directory_hash Default mount options: user_xattr acl Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 201326592 Block count: 805306368 Reserved block count: 40265318 Free blocks: 614251931 Free inodes: 200777989 First block: 0 Block size: 4096 Fragment size: 4096 Reserved GDT blocks: 832 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 8192 Inode blocks per group: 512 RAID stride: 128 RAID stripe width: 256 Flex block group size: 16 Filesystem created: Fri Sep 13 19:47:09 2013 Last mount time: Tue Mar 11 17:23:07 2014 Last write time: Tue Mar 11 17:23:43 2014 Mount count: 37 Maximum mount count: -1 Last checked: Fri Sep 13 19:47:09 2013 Check interval: 0 (<none>) Lifetime writes: 1595 GB Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 256 Required extra isize: 28 Desired extra isize: 28 Journal inode: 8 Default directory hash: half_md4 Directory Hash Seed: ed8a746d-558f-426e-822c-42518900ea68 Journal backup: inode blocks root@gwar:/#
Thanks!
I have duplicated the behavior using direct dump/restore as follows:
The corruption seems to occur when the original file ends in a lot of zeros. Dump/restore replaces some of these zeros with different values. The checksum of the corrupted files shows the problem is deterministic. The files are not sparse files. I think this is a legitimate defect that should be fixed.
Thanks
root@gwar:/# ls -lsh /hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_001 /home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_001
4.0M -rw------- 1 1004 users 4.0M Oct 17 23:46 /home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_001
4.0M -rw------- 1 1004 users 4.0M Oct 17 23:46 /hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_001
root@gwar:/# sum /hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_001 /home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_001
55937 4096 /hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_001
35772 4096 /home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_001
root@gwar:/# cmp /hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_002 /home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_002
/hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_002 /home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_002 differ: byte 471041, line 2494
root@gwar:/# od -Ad -c -j 471041 /home/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_002
0471041 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
4194289 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
4194304
root@gwar:/hometest/dsorkin2# od -Ad -c -j 471041 /hometest/dsorkin2/.cache/mozilla/firefox/5lnxsfy1.default/Cache/CACHE_002 | head
0471041 006 301 026 235 033 221 L j \0 367 001 005 016 G + 245
0471057 \0 354 020 202 025 277 025 024 002 / 034 ! 4 r 004 [
0471073 002 241 D E 023 202 022 006 002 244 \f 025 021 305 006 254
0471089 024 Z 003 207 004 025 \0 030 \b G 002 026 E 307 023 o
0471105 . K ( 210 006 A 026 5 020 305 \a " 024 323 \a 032
0471121 001 271 004 274 \r 207 005 027 006 ; \0 233 025 } 030 1
0471137 034 \a 027 375 003 . 032 310 \0 _ : 357 032 w 006 G
0471153 022 371 004 247 020 302 2 333 \f 302 \v \a \0 202 001 b
0471169 020 351 1 232 @ \f 004 & 034 ~ 022 w \v , 002 a
0471185 021 B * 7 004 330 026 262 020 226 " , 001 \r 006 243
I'm getting this same issue with dump 0.4b44-7 on Ubuntu 16.04.1 LTS x86_64. I also reported the bug to Ubuntu at https://bugs.launchpad.net/ubuntu/+source/dump/+bug/1629617
Looking at _CACHE_001_ in debugfs and doing "dump_extents _CACHE_001_" I see:
Maybe that Uninit flag isn't handled properly?
Last edit: dreamlayers 2016-10-02
I think, I encountered the same bug on slackware64 15.0 using dump version 0.4b47 from the slackbuilds.org site (e2fsprogs-1,46.5).
When checking the restored dump of that tool, some files had a "garbage" block of data somewhere. e.g. at the end.
Comparing that with debugfs on the original ext4 filesystem, revealed that
the blocks on disk are very identical to the restored file, but dump seems not to honor some new(?) filesystem block marker, which can mark a filesystem block as "unwritten".
Example:
An example file had 1 Garbage block at the very end after restore.
the filefrag command gives some hint:
root@linux:/home/user/.mozilla/firefox/somedir/# filefrag -v 2918063365piupsah.sqlite
Filesystem type is: ef53
File size of 2918063365piupsah.sqlite is 49152 (12 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 1: 12689028.. 12689029: 2:
1: 2.. 10: 12689044.. 12689052: 9: 12689030:
2: 11.. 11: 12689053.. 12689053: 1: last,unwritten,eof
2918063365piupsah.sqlite: 2 extents found
root@linux:/home/user/.mozilla/firefox/somedir/#
The last 4k block is marked as "unwritten". An when you mount the filesystem and copy that file,
that exact block is returned as 0 all over. But dump reads(and restores) the actual -but invalid- on disk data of that block, which contains garbage.
Maybe this is also some failure in the filesystem, that this block is always marked unwritten, but e2fsck -f does not complain about this, and obviously the kernel 5.15.63 does not complain and everything seems to work with other tools.
And of course, the original filesystem was offline during dump, so no runtime problems as mentioned from Linus 20 years ago should be relevant.
Conclusion:
- For me, only a very small amount of files were affected: (less than 0.01%)
- dump in its current version(0.4b47) is unfortunately not usable for ext4 filesystem backups.
- it's a pity, nobody cares about that backup tool :(
Thank you for investigating this. It is easy to create a file with a single unwritten extent using the fallocate command, like: fallocate -l 512 foo. Such a file will trigger this bug.
I've just reported this as #176 and attached a patch there that fixes this.
attaching patch here too.