Running genext2fs to build a filesystem that contains symlinks that are exactly 60 characters long result in those symlinks getting created (incorrectly) as "fast" symlinks where the data is stored within the inode.
Running e2fsck on the generated filesystem reports invalid symlinks, and from there things only get worse.
I noticed that all of the invalid symlinks were 60 bytes long and then ran across the following ext2 documentation:
From http://www.mjmwired.net/kernel/Documentation/filesystems/ext2.txt#189
Symbolic links are also filesystem objects with inodes. They deserve
special mention because the data for them is stored within the inode
itself if the symlink is less than 60 bytes long. It uses the fields
which would normally be used to store the pointers to data blocks.
This is a worthwhile optimisation as it we avoid allocating a full
block for the symlink, and most symlinks are less than 60 characters long.
Upon looking at the genext2fs v1.4.1 source, I found the following test in mklink_fs():
if(size <= 4 * (EXT2_TIND_BLOCK+1))
Elsewhere in the source, I found that EXT2_TIND_BLOCK is 14, hence the test is for <= 60.
Changing this to instead test for "size < 4 * ...", re-building, re-generating my filesystem image, and re-running e2fsck now reports no corruption.
Looks like a corresponding change (from > to >=) needs to be made on (or near) line 1544 to ensure that stats.nblocks counts the block used to store 60-char regular symlinks... though it appears that this only affects stats, and not the validity of the generated filesystem.
correcting/improving above note:
the value of EXT2_N_BLOCKS is the array size of i_block and equals 15.
sizeof(i_block) resolves to 60 as i_block array is based upon __u32 data tye.
i am now basically confirming the problem.
please also see http://sourceforge.net/tracker/?func=detail&aid=3371606&group_id=2406&atid=202406
Ted Tso claimed symlink format to be like this (refined in my own words):
when an inode is flagged as a symlink entry
then the i_block array will hold a string if the space is sufficient.
a string consists of the string terminating '\0' character (non-printable)
and 1-59 printable characters.
for larger strings the string data is stored elsewhere.
i_size is set to the same as str/n/len() function family would return, the length of the string in chars without trailing zero.
i had to add:
in practical cases the following member i_generation has a value of 0
resulting in the bug probably might ever hit in normal operation.
but for picky tools its a violation.
e2fsck typically offers deletion of the violating symlink which is a correct offer
but indeed that is dangerous for the data the user wants to stay on the target file system.
this should fix the problem:
http://lists.linuxtogo.org/pipermail/openembedded-devel/2011-July/034017.html
(duplicating that patch into attachment in a few moments)
Fixed in CVS.
Hi Finn,
when are you planning to make the next release? There were many fixes since 1.4.1 and many projects like for example Buildroot have to ship a huge patch to get all knwon issues fixed in their repositories.
Thanks.
Hi Yegor,
I have several patches in my tree but I didn't find a moment to make
them pass all tests (there are still bugs somewhere).I'll try to have a
look at what's shipped by buildroot.
Xav
Le lundi 16 janvier 2017 à 08:02 +0000, Yegor Yefremov a écrit :
Related
Bugs: #2
Hi Xavier,
BR's patch is only an "artificial" bump to CVS version 1.118.
https://git.busybox.net/buildroot/tree/package/genext2fs/0001-update-genext2fs.c-to-rev-1.118.patch
So it has no new changes. But new release would at least eliminate the need of this patch and also such distros as Debian would be able to update and fix the known issues. Espcecially this one, because it make problems when creating an ext4 image via genimage from a debootstrap created Debian root filesystem.
Last edit: Yegor Yefremov 2017-01-16