Accidental corruption of reiser4 partition
Reiser4 file system for Linux OS
Brought to you by:
edward6
Status: Not reproducible
http://marc.info/?l=reiserfs-devel&m=141502132410905&w=2
Fsck results:
http://marc.info/?l=reiserfs-devel&m=141510698708213&w=2
The corruption was preceded by unsuccessful mount:
http://marc.info/?l=reiserfs-devel&m=141491783319853&w=2
With reiser4-for-3.17.2.patch and reiser4-for-3.16.2.patch, Reiser4 silently corrupts the filesystem.
The corruption only shows up when you access an affected file, as in:
reiser4[zsh(6623)]: key_warning (/mkdeb/build/linux/linux/fs/reiser4/plugin/file_plugin_common.c:512)[nikita-717]: WARNING: Error for inode 174112 (-2)
Of the 4 times this has happened to me, 3 times a twig node was damaged and once a file.
The last time it was the /usr/src directory.
Full /var/log/messages is attached as var.log.messsages.20141219
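To see which inodes are affected, the log can be scanned for the warning quoted above. This is a minimal sketch, assuming the message format stays exactly as in that line; the function name warned_inodes is made up for illustration.

```shell
# Hypothetical helper: read kernel log text on stdin and print the unique
# inode numbers reported by reiser4 "nikita-717" key warnings.
# Assumes the message format quoted in this ticket.
warned_inodes() {
    sed -n 's/.*\[nikita-717\].*Error for inode \([0-9][0-9]*\).*/\1/p' |
        sort -un
}
```

For example: warned_inodes < /var/log/messages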
fsck.reiser4 finds errors (attachment fsck.sda2.21041219), and when you try to correct them with fsck.reiser4 --build-fs (attachment fsck.buildfs.sda2.21041219), fsck doesn't do a good job of repairing them (attachment var.log.messages.20141219.2).
When you run fsck.reiser4 again, you get new errors (fsck.sda2.21041219.2), and fsck.reiser4 --build-fs now spreads the damage, so the whole of /usr is destroyed (fsck.buildfs.sda2.21041219.2).
When I tried to copy the data from the damaged partition, the kernels with the 3.16.2 and 3.17.2 patches froze in the middle, but kernel 3.14 from SystemRescueCD worked fine. It looks like they choked on the .metacity folder (files metacity.good (from backup) and metacity.bad (from the location I tried saving the data to)).
I have saved a filesystem image with dd, and the metadata, profile and tree with debugfs.reiser4.
PS. The corruptions started with the .config (attachment 3.17.config) that I used for 3.17.2, taken from 3.10.patch, which worked.
PPS. reiser4progs.1.0.9
Last edit: Dushan Tcholich 2014-12-20
These are 3 different issues, and as soon as it's possible to open tickets I'll open one for each:
1. reiser4-for-3.16 and 3.17: Silent corruption of the filesystem;
2. Reiser4progs-1.0.9: fsck.reiser4 --build-fs does additional damage when a twig node is damaged;
3. reiser4-for-3.16: It takes very long to copy Evolution Sent file.
3: When trying to copy the Evolution Sent file, 3.16 just keeps reading from disk (only bi, no bo in vmstat) for hours. With 3.14 and 3.10 everything works OK (cp && sync finishes in ~10sec).
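The "only bi no bo" observation can be checked by summing the two columns over a vmstat run. A rough sketch, assuming the usual procps vmstat layout with a header row that names the bi and bo columns; sum_bi_bo is a made-up name.

```shell
# Hypothetical helper: read vmstat output on stdin and print the summed
# blocks-in (bi) and blocks-out (bo) counters.
# Columns are located by the header names rather than by fixed position.
sum_bi_bo() {
    awk '
        # Header row: remember which columns are named bi and bo.
        $1 == "r" {
            for (i = 1; i <= NF; i++) {
                if ($i == "bi") bi_col = i
                if ($i == "bo") bo_col = i
            }
            next
        }
        # Data rows start with a number; add up both counters.
        bi_col && bo_col && $1 ~ /^[0-9]+$/ { bi += $bi_col; bo += $bo_col }
        END { printf "bi=%d bo=%d\n", bi, bo }
    '
}
```

For example, while the copy is running: vmstat 5 12 | sum_bi_bo
Note that the first data row printed by vmstat is an average since boot, so it is included in the sums. A large bi with bo at or near zero matches the behaviour described above.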
Last edit: Dushan Tcholich 2014-12-20
Why are you sure that fsck does additional damage? Did you mount the partition read-only after fsck --build-fs, or what?
Order of actions (detailed in var.log.messages.20141219.2):
~10:20 rsync -a /myroot /safe
after it went slow: ctrl+c && umount /myroot && fsck.reiser4 /dev/sda2 && fsck.reiser4 --build-fs /dev/sda2
~10:27 mount -o ro /dev/sda2 /myroot
rsync -a /myroot /safe
after a few hours ctrl+c because it stalled and there were scary messages in the logs
umount /myroot && fsck.reiser4 /dev/sda2 && fsck.reiser4 --build-fs /dev/sda2 <---- Here it showed fsck.logs.2 about missing names to folders in /usr ( fsck.sda2.21041219.2 ) like local, sbin, tmp...
~14:15 mount /dev/sda2 /myroot <---- This is where I didn't remount it ro
One more report from Mathieu Bélanger:
http://marc.info/?l=reiserfs-devel&m=145970547706544&w=2
It seems that the corruption happens in some path that is not triggered in all configurations.
Actually, fsck fixes everything. However, once the partition is mounted in rw mode, it immediately gets corrupted again, which can give the wrong impression that fsck doesn't work properly.
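One way to tell the two cases apart is to keep the partition read-only after the rebuild and check it again before any rw mount. The sketch below only prints such a sequence and runs nothing; it reuses the commands already quoted in this ticket, and the function name repair_plan and the order of steps are assumptions.

```shell
# Hypothetical helper: print (do not execute) a repair sequence that keeps
# the partition read-only after the rebuild, so damage seen afterwards
# cannot come from a rw mount. Device and mount point are passed in.
repair_plan() {
    dev="$1"
    mnt="$2"
    printf '%s\n' \
        "umount $mnt" \
        "fsck.reiser4 --build-fs $dev" \
        "fsck.reiser4 $dev" \
        "mount -o ro $dev $mnt"
}
```

For example: repair_plan /dev/sda2 /myroot
If the second fsck.reiser4 run is clean and the read-only mount shows intact data, later errors point at the rw mount rather than at fsck.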
The release for 4.5.3 contains a patch that removes the residual block barrier support:
http://marc.info/?l=reiserfs-devel&m=145987104231235&w=2
For older kernels, it is highly recommended to use the mount option "no_write_barrier".
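As an illustration, the option can be passed with -o on the mount command line. The helper below is a sketch that only builds the command string; the name reiser4_mount_cmd and the optional third argument for extra options are assumptions, and the device and mount point are the ones used earlier in this ticket.

```shell
# Hypothetical helper: print a mount command for a reiser4 partition with
# the no_write_barrier option. Nothing is mounted; the caller decides
# whether to run the printed command (as root).
reiser4_mount_cmd() {
    dev="$1"
    mnt="$2"
    extra="${3:-}"          # optional extra mount options, comma-separated
    opts="no_write_barrier"
    if [ -n "$extra" ]; then
        opts="$opts,$extra"
    fi
    printf 'mount -t reiser4 -o %s %s %s\n' "$opts" "$dev" "$mnt"
}
```

For example: reiser4_mount_cmd /dev/sda2 /myroot
The same option string can also go in the options field of the partition's /etc/fstab entry.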