Jan, you write:
> I got reproducible corruptions while doing the following:
>
> lvcreate -n test -L 100M vg1
> mke2fs -b 1024 /dev/vg1/test
> ext2prepare -v /dev/vg1/test 50G
> mount /dev/vg1/test /mnt/test/
>
> at the same time, do:
>
> cp -a /usr/src/linux /mnt/test/linux
> cp -a /mnt/test/linux /mnt/test/linux2
> cp -a /mnt/test/linux2 /mnt/test/linux3
> ...
>
> and
> e2fsadm -L+90M /dev/vg1/test
> e2fsadm -L+90M /dev/vg1/test
> ... whenever necessary
>
> The corrupted files contain blocks of other files or directory information
> on 1024k boundaries.

This sounds _very_ similar to the problems that people on linux-kernel
are having. They were getting duplicate blocks when copying lots of
files. Did you apply the one-line fix that moved the "head" initialization
inside the loop (this is an IDE problem)? Sorry, I don't recall the exact
patch, but a quick search of l-k should find it.

> Once I even got fs metadata corruptions, but data corruption seems to occur
> more often (but that may be caused by the amount of data vs. metadata)
>
> Kernel version is 2.4.0-test12-pre3 with current ext2-compat and online-ext2
> patches.
>
> Is this a known problem? I read on linux-kernel that there are some problems
> with 1k blocksized filesystems, but I did test it with the same configuration
> without resizing and got no corruptions.

I have never had problems while doing a resize. I sometimes run random
resizes while copying data into the filesystem, and I have never seen
this. Most of my testing is under 2.2, however.

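To tell whether a run of a copy-while-resizing test actually corrupted
anything, and whether the damage is block-aligned, one could compare each
copy against its source block by block. The following is only a rough
sketch in Python; the function names are made up for illustration and are
not part of e2fsprogs or the resize tools, and the 1024-byte default is
just the block size used in the test above.

```python
from pathlib import Path


def differing_blocks(a, b, block_size=1024):
    """Return the indices of block_size-sized blocks where a and b differ.

    a and b are bytes objects; a short or missing tail block counts as a
    difference.
    """
    n_blocks = (max(len(a), len(b)) + block_size - 1) // block_size
    return [
        i
        for i in range(n_blocks)
        if a[i * block_size:(i + 1) * block_size]
        != b[i * block_size:(i + 1) * block_size]
    ]


def compare_files(src, dst, block_size=1024):
    """Hypothetical helper: compare two files on disk block by block."""
    return differing_blocks(
        Path(src).read_bytes(), Path(dst).read_bytes(), block_size
    )
```

If the differences always cover whole blocks, that points at misplaced
blocks rather than at partial writes.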
In any case, if you follow through what ext2online and the kernel
patch actually do, there is really nowhere that they could introduce
data corruption. The only parts of the disk they write to are:

- block/inode bitmaps and inode tables _after the end of the current
  filesystem_.
- _unused_ group descriptor entries. Even if it somehow (I don't know how
  it would happen) wrote "old" data over the top of the current group
  descriptors, the worst that would happen is that "df" would be off a
  bit, and you might get an ENOSPC error when looking for a free block
  in a group that has none.
- reserved inode #7 (the resize inode). Again, if it made a mistake and
  overwrote the whole block of inodes (can't _really_ happen), it would
  only affect the first 8 inodes (the first 32 on a 4k filesystem). For
  a 1k filesystem, the only possible victim would be the root inode.

Because of this, and because nobody else has reported a bug like this
before, I _think_ it's the same kernel bug that others are hitting.
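
The inode counts in the last item above (8 inodes per block on a 1k
filesystem, 32 on a 4k one) can be checked with a few lines of Python.
This is only a sketch, and it assumes the classic 128-byte ext2 on-disk
inode; the function name is made up for illustration.

```python
# Assumption: 128-byte on-disk inodes (the classic ext2 inode size).
INODE_SIZE = 128
RESIZE_INO = 7  # the reserved resize inode


def inodes_sharing_block(ino, block_size, inode_size=INODE_SIZE):
    """Return the inode numbers stored in the same inode-table block as ino.

    Inode numbers start at 1, so inode n sits at index n - 1 in the table.
    """
    per_block = block_size // inode_size
    first = ((ino - 1) // per_block) * per_block + 1
    return list(range(first, first + per_block))


for block_size in (1024, 4096):
    hit = inodes_sharing_block(RESIZE_INO, block_size)
    print(block_size, "byte blocks: inodes", hit[0], "to", hit[-1])
```

With these assumptions, a stray write to the whole block holding inode 7
touches inodes 1 to 8 on a 1k filesystem and inodes 1 to 32 on a 4k one.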
Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert