#292 e2fsck alternates when fixing group inode count

open
e2fsck (61)
5
2014-08-24
2011-11-04
No

I have an ext4 filesystem, created by Android's make_ext4fs on an ARM device, whose incorrect group inode counts confuse e2fsck. When run once, it "fixes" the counts. When immediately run again, it complains that they're wrong, and that the original value was the correct one! Each time it runs, it alternates between the two values, never being satisfied. Here's an example run:

/ # /e2fsck -v -f /dev/mmcblk0p8
e2fsck 1.42-WIP (16-Oct-2011)
ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether /dev/mmcblk0p8 is mounted.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Inode bitmap differences: -(7833--7834) +(7912--7919) -(15661--15668) +(15872--15887) -(23487--23502) +(23725--23748) -(31313--31336) +(31455--31486)
Fix<y>? yes

Free inodes count wrong for group #0 (7599, counted=7601).
Fix<y>? yes

Directories count wrong for group #3 (24, counted=28).
Fix<y>? yes

Free inodes count wrong for group #4 (7716, counted=7684).
Fix<y>? yes

Free inodes count wrong (38267, counted=38237).
Fix<y>? yes

/dev/mmcblk0p8: ***** FILE SYSTEM WAS MODIFIED *****

933 inodes used (2.38%)
6 non-contiguous files (0.6%)
1 non-contiguous directory (0.1%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 860
94828 blocks used (60.53%)
0 bad blocks
0 large files

795 regular files
64 directories
0 character device files
0 block device files
0 fifos
0 links
65 symbolic links (65 fast symbolic links)
0 sockets
--------
924 files
/ # / # /e2fsck -v -f /dev/mmcblk0p8
e2fsck 1.42-WIP (16-Oct-2011)
ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether /dev/mmcblk0p8 is mounted.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Inode bitmap differences: -(7833--7834) +(7912--7919) -(15661--15668) +(15872--15887) -(23487--23502) +(23725--23748) -(31313--31336) +(31455--31486)
Fix<y>? no

Free inodes count wrong for group #0 (7601, counted=7599).
Fix<y>? yes

Directories count wrong for group #3 (28, counted=24).
Fix<y>? yes

Free inodes count wrong for group #4 (7684, counted=7716).
Fix<y>? yes

Free inodes count wrong (38237, counted=38267).
Fix<y>? yes

/dev/mmcblk0p8: ***** FILE SYSTEM WAS MODIFIED *****

/dev/mmcblk0p8: ********** WARNING: Filesystem still has errors **********

903 inodes used (2.31%)
6 non-contiguous files (0.7%)
1 non-contiguous directory (0.1%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 860
94828 blocks used (60.53%)
0 bad blocks
0 large files

795 regular files
64 directories
0 character device files
0 block device files
0 fifos
0 links
65 symbolic links (65 fast symbolic links)
0 sockets
--------
924 files
/ # /debugfs /dev/mmcblk0p8
debugfs 1.42-WIP (16-Oct-2011)
debugfs: stats
Filesystem volume name: <none>
Last mounted on: /system
Filesystem UUID: 57f8f4bc-abf4-0000-675f-946fc0f9f25b
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal resize_inode filetype extent sparse_super large_file
Filesystem flags: unsigned_directory_hash
Default mount options: (none)
Filesystem state: not clean
Errors behavior: Remount read-only
Filesystem OS type: Linux
Inode count: 39170
Block count: 156672
Reserved block count: 0
Free blocks: 61844
Free inodes: 38267
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 39
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 7834
Inode blocks per group: 490
Last mount time: Fri Nov 4 19:47:05 2011
Last write time: Fri Nov 4 22:24:43 2011
Mount count: 0
Maximum mount count: -1
Last checked: Fri Nov 4 22:24:43 2011
Check interval: 0 (<none>)
Lifetime writes: 353 MB
Reserved blocks uid: 0 (user unknown)
Reserved blocks gid: 0 (group unknown)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: tea
Journal backup: inode blocks
Directories: 60
Group 0: block bitmap at 41, inode bitmap at 42, inode table at 43
652 free blocks, 7599 free inodes, 10 used directories
Group 1: block bitmap at 32809, inode bitmap at 32810, inode table at 32811
1347 free blocks, 7749 free inodes, 1 used directory
Group 2: block bitmap at 65536, inode bitmap at 65537, inode table at 65538
21843 free blocks, 7615 free inodes, 1 used directory
Group 3: block bitmap at 98345, inode bitmap at 98346, inode table at 98347
16666 free blocks, 7588 free inodes, 24 used directories
Group 4: block bitmap at 131072, inode bitmap at 131073, inode table at 131074
21336 free blocks, 7716 free inodes, 24 used directories
debugfs:

Discussion

  • Theodore Ts'o

    Theodore Ts'o - 2011-11-07

    Yeah, this is due to a known bug in make_ext4fs. Valid ext3 and ext4 file systems have always had the requirement that the number of inodes per group must be a multiple of the number of inodes per block (i.e., with 256-byte inodes and 4096-byte blocks, the number of inodes per group must be a multiple of 16).
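The requirement above can be checked with a little arithmetic. The following is a minimal sketch, not code from e2fsprogs or make_ext4fs; the function names are hypothetical, and the constants are copied from the debugfs `stats` output in the report.

```python
# Values copied from the debugfs "stats" output in the report.
BLOCK_SIZE = 4096        # "Block size"
INODE_SIZE = 256         # "Inode size"
INODES_PER_GROUP = 7834  # "Inodes per group"


def inodes_per_block(block_size, inode_size):
    """Number of on-disk inodes that fit in one filesystem block."""
    return block_size // inode_size


def is_valid_inodes_per_group(inodes_per_group, block_size, inode_size):
    """True if inodes_per_group is a whole multiple of inodes per block."""
    return inodes_per_group % inodes_per_block(block_size, inode_size) == 0


per_block = inodes_per_block(BLOCK_SIZE, INODE_SIZE)
print(per_block)                      # 16
print(INODES_PER_GROUP % per_block)   # 10, so 7834 is not a multiple of 16
print(is_valid_inodes_per_group(INODES_PER_GROUP, BLOCK_SIZE, INODE_SIZE))  # False
```

Note that the reported "Inode blocks per group: 490" holds 490 * 16 = 7840 inodes, so the inode table has room for six more inodes per group than the superblock declares.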

    My personal opinion is that it's very unfortunate that some handset manufacturers are apparently allergic to GPL code in userspace (even though they are fine with GPL code in the kernel), since it leads to pointless, unnecessary (and in some cases, buggy) reimplementation of code just to use a BSD license, and since it apparently led to some phones being unable to fix (or detect) corrupted file systems because e2fsck was not included, and it led to mke2fs being badly and incorrectly reimplemented.

    In this case, the number of inodes per group was not only not a multiple of 16, as required, it's not even a multiple of 8. Which means that the inode bitmap code in e2fsprogs doesn't even have a hope of doing the right thing, since it assumes that each block group's portion of the inode allocation bitmap begins on a byte boundary when all of the block group's inode allocation bitmap fragments are concatenated together, and this won't be true if the number of inodes per group isn't a multiple of 8.
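To illustrate the byte-boundary point, here is a simplified model (an assumption for illustration, not the actual e2fsprogs bitmap code) that treats the per-group inode bitmaps as one concatenated bit string and reports where each group's portion starts. The helper name is hypothetical; the group count of 5 comes from the debugfs output in the report.

```python
def group_start_offsets(inodes_per_group, group_count):
    """Return (group, byte_index, bit_within_byte) for the first bit of each
    group's portion of a concatenated inode bitmap."""
    offsets = []
    for group in range(group_count):
        first_bit = group * inodes_per_group
        offsets.append((group, first_bit // 8, first_bit % 8))
    return offsets


# Filesystem from the report: 7834 inodes per group, 5 groups.
for group, byte_index, bit in group_start_offsets(7834, 5):
    print(f"group {group}: byte {byte_index}, bit {bit}")
# Groups 1, 2 and 3 start at bit 2, 4 and 6 of a byte, i.e. mid-byte.

# With a multiple of 8 (here 7840), every group starts on a byte boundary.
aligned = group_start_offsets(7840, 5)
print(all(bit == 0 for _, _, bit in aligned))  # True
```

Because 7834 leaves a remainder of 2 when divided by 8, only every fourth group lands back on a byte boundary, which is the situation the byte-aligned assumption cannot handle.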

    What version of make_ext4fs are you using? I was under the impression that make_ext4fs screwed up by creating file systems whose inodes per block group could fail the multiple-of-16 criterion (which caused problems with Linux kernels > 2.6.37), but this is the first I've heard that make_ext4fs could create file systems where inodes/group wasn't even a multiple of 8.

     
  • Dan Fandrich

    Dan Fandrich - 2011-11-08

    Thanks for the pointer. I patched make_ext4fs so it generates enough inodes to fill a block, and e2fsck works like a charm! Another peculiarity I had to work around is that make_ext4fs doesn't clear unused inodes, which I discovered is rather a problem when running on a pre-2.6.37 kernel.
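The patch itself is not shown in this thread. As a rough sketch of the idea of generating "enough inodes to fill a block", one could round the per-group inode count up to the next multiple of the inodes-per-block value; the function name below is hypothetical and this is not the actual make_ext4fs change.

```python
def round_up_inodes_per_group(inodes_per_group, block_size, inode_size):
    """Round inodes_per_group up to a whole number of inode-table blocks."""
    per_block = block_size // inode_size
    blocks_needed = -(-inodes_per_group // per_block)  # ceiling division
    return blocks_needed * per_block


# Values from the report: 7834 inodes per group, 4096-byte blocks, 256-byte inodes.
print(round_up_inodes_per_group(7834, 4096, 256))  # 7840
```

Rounding up rather than down keeps at least as many inodes as originally requested, and for this filesystem it matches the 490 inode-table blocks per group that were already allocated.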

    make_ext4fs does have one feature that mke2fs doesn't, which is the ability to dynamically populate the ext4 filesystem on the fly during filesystem creation. That can't always be easily done with mke2fs on a loopback filesystem given that the filesystem image being created could be 16 GB. And, I'm not aware of a replacement for its "sparse" writing feature, which creates a custom compressed file format that only contains those blocks actually written. That's very useful when initially provisioning a device.

     
  • Nobody/Anonymous

    When you say "doesn't clear unused inodes", do you mean it's not zeroing the inode table? That shouldn't be a problem, so long as you don't have to run e2fsck on the file system after it gets corrupted. How was it causing problems on pre-2.6.37 kernels?

    In fact, ext4 has a feature where the responsibility to initialize the inode table is transferred to the kernel, and is done lazily when the file system is first mounted. So just not zeroing the inode table is unsafe from an e2fsck recovery point of view, but it shouldn't have affected pre-2.6.37 kernels.

    Or am I misunderstanding what you meant?

     
  • Dan Fandrich

    Dan Fandrich - 2011-11-08

    You understood correctly: make_ext4fs is not zeroing the inode table. And you're right that it doesn't cause problems pre-2.6.37 except when running e2fsck. But it means that essentially you can never run e2fsck on such a filesystem, because chances are some inodes will never have been properly initialized.