From: Mark W. K. <kr...@dr...> - 2002-08-06 01:55:06
The problem with ext2 dump in Linux is that even if you do everything right, dump will still get the wrong versions of some files. It's not dump's fault and there's nothing that dump can do about it; the 2.4 kernels just make it impossible for dump to see the correct version of some files. You can have a perfectly idle (but mounted) file system, with no writes and no open file descriptors, do as many syncs as you like, wait several hours, and dump will still get some files wrong. You have to wait until the partition is unmounted for dump to work, and that's not practical.

The problem is very real and easily reproducible. See my thread to dump-users on "incremental dump bug" about 3 weeks ago. Run as many syncs and sleeps as you like before the dumps; it won't help.

The problem is Linux-specific. Dump works as expected on Solaris, FreeBSD, etc. There are always risks with dump (or any backup program) on a live file system, but the Linux problems go far beyond the normal risks.

That's the theoretical answer: ext2 dump on Linux on a mounted file system is deprecated; don't use it.

The practical answer is that things aren't all that bad. Since learning of these problems some 3 weeks ago, I have added restore -C to my backup scripts, and I make a cpio archive of the files that dump got wrong. So far, the only file that differs is /etc/dumpdates, which of course is expected to differ.

But I would take Linus's message as a warning. Some day, not too far off, probably starting with the 2.5/2.6 kernels, the problems will get much worse and dump will become hopeless. Pity.

Funny, the new "Linux Administration Handbook" by Nemeth et al., published April 2002 (an excellent book), makes no mention of these problems and recommends dump (or something built on top of dump, like Amanda) as the backup tool of choice. Linus wrote the foreword, dated April 2001 (same month as his mail on lwn.net). Maybe he didn't tell them.

P.S. This problem really needs to be mentioned in the man pages. Admins from other Unixes will expect this to work.

P.P.S. Stelian, are we really, really sure it's a design issue in the 2.4 kernel and not just a bug in sync(2) or something? The problems I've seen are quite deterministic and happen precisely on files written without O_TRUNC. That sounds more like a bug than a design issue, but maybe I'm grasping at straws.

--Mark
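
For anyone who wants to poke at the P.P.S. observation, here is a minimal sketch of the two write patterns it distinguishes: rewriting an existing file in place (no O_TRUNC) versus truncating it first. This is an illustration only, not the test case from the dump-users thread. The helper names (overwrite_in_place, overwrite_truncating, read_back) are hypothetical, and the script does not run dump or restore itself; it only produces files written each way, which you could then place on an ext2 file system and check with restore -C after a dump.

```python
import os
import tempfile


def _write_all(fd: int, data: bytes) -> None:
    # os.write may write fewer bytes than requested, so loop until done.
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def overwrite_in_place(path: str, data: bytes) -> None:
    """Rewrite the start of a file WITHOUT O_TRUNC (existing contents are overwritten in place)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        _write_all(fd, data)
        os.fsync(fd)  # flush to disk; per the post, syncing alone did not help dump
    finally:
        os.close(fd)


def overwrite_truncating(path: str, data: bytes) -> None:
    """Rewrite a file WITH O_TRUNC (old contents are discarded first)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        _write_all(fd, data)
        os.fsync(fd)
    finally:
        os.close(fd)


def read_back(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for the file system under test.
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "sample.txt")

    overwrite_truncating(target, b"old contents\n")
    overwrite_in_place(target, b"NEW")
    print(read_back(target))  # b'NEW contents\n': the old tail survives

    overwrite_truncating(target, b"NEW")
    print(read_back(target))  # b'NEW': the old contents are gone
```

The in-place variant is the case of interest here, since the post reports that the mismatches showed up precisely on files written without O_TRUNC.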