#50 max size of symlink target for embedding into an inode

open
nobody
None
5
2012-11-28
2011-07-19
Alexander Stohr
No

i have an ext3 filesystem image that was generated with the toolchain of the OpenEmbedded folks.

checking the resulting image produced uncomfortable warnings about illegal inodes.
the offered option was to delete all the affected symlinks. astonishingly, they all worked
for me in a normal testbed environment and even when the image was mounted from a PC-based Linux.

after closer inspection of the image and the e2fsck program i came to the conclusion
that all the reported symlinks have a target length of exactly 60 ASCII characters,
and that the function e2fsck_pass1_check_symlink in the file e2fsck/pass1.c (in
my version, in the latest release and in the topmost version in the git repository)
considers such a length invalid when the target is embedded in the blocks array of the inode.

my support question right here is:
what is the official definition of the character encoding and the maximum length for this sort of data embedding?

the text stored there is length-encoded, since i_size holds its length.
i know about the habit of some C coders to still add a trailing zero in such setups,
so it might be that a trailing zero is obligatory, or that it is not needed.

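the relevant check sequence from e2fsck looks like this, with a return value of zero meaning an error:
        /* excerpt: the lone closing brace below ends the branch for targets embedded in i_block */
        if (inode->i_size >= sizeof(inode->i_block))
            return 0;    /* 60 bytes or more: rejected */
        len = strnlen((char *)inode->i_block, sizeof(inode->i_block));
        if (len == sizeof(inode->i_block))
            return 0;    /* no terminating zero within the 60 bytes */
    }
    if (len != inode->i_size)
        return 0;        /* i_size and the string length disagree */
    return 1;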
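
to make the boundary case easy to try outside of e2fsck, here is a minimal, self-contained sketch of the same three tests. it is not e2fsprogs code: struct demo_inode, demo_strnlen, demo_check_fast_symlink and demo_make_symlink are made-up names, only i_size and a 15 x 4 byte i_block are modelled, and demo_strnlen is a portable stand-in for strnlen().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the two inode fields the check touches. */
struct demo_inode {
    uint32_t i_size;       /* symlink target length in bytes */
    uint32_t i_block[15];  /* 15 * 4 = 60 bytes, reused as target storage */
};

/* Portable stand-in for strnlen(): length up to the first zero, at most max. */
static size_t demo_strnlen(const char *s, size_t max)
{
    const char *nul = memchr(s, '\0', max);
    return nul ? (size_t)(nul - s) : max;
}

/* Same three tests as the quoted sequence: 1 = accepted, 0 = error. */
static int demo_check_fast_symlink(const struct demo_inode *inode)
{
    size_t len;

    if (inode->i_size >= sizeof(inode->i_block))
        return 0;  /* 60 or more: no room left for a terminating zero */
    len = demo_strnlen((const char *)inode->i_block, sizeof(inode->i_block));
    if (len == sizeof(inode->i_block))
        return 0;  /* no zero byte anywhere in the 60 bytes */
    if (len != inode->i_size)
        return 0;  /* i_size disagrees with the stored string */
    return 1;
}

/* Build a demo inode whose embedded target is `len` copies of 'a'. */
static struct demo_inode demo_make_symlink(size_t len)
{
    struct demo_inode inode;

    assert(len <= sizeof(inode.i_block));
    memset(&inode, 0, sizeof(inode));
    memset(inode.i_block, 'a', len);
    inode.i_size = (uint32_t)len;
    return inode;
}
```

with these helpers, an inode built by demo_make_symlink(59) is accepted, while one built by demo_make_symlink(60) already fails the first test, which matches what i see for my 60-character targets.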

other sources are quite imprecise or even contradict each other:

http://book.opensourceproject.org.cn/kernel/kernel3rd/opensource/0596005652/understandlk-chp-18-sect-2.html
18.2.6.3. Symbolic link
As stated before, if the pathname of a symbolic link has up to 60 characters, it is stored in the i_block field of the inode, which consists of an array of 15 4-byte integers; no data block is therefore required. If the pathname is longer than 60 characters, however, a single data block is required.

http://www.nongnu.org/ext2-doc/ext2.html#DEF-SYMBOLIC-LINKS
Symbolic links are also filesystem objects with inodes. For all symlink shorter than 60 bytes long, the data is stored within the inode itself; it uses the fields which would normally be used to store the pointers to data blocks. This is a worthwhile optimisation as it we avoid allocating a full block for the symlink, and most symlinks are less than 60 characters long.

i need to know this because it decides which piece of software i want to fix: either the checker or the image generator tool.
either 60-character strings are valid for being stored in this block area or they are not. (the second option looks a little bit like wasted space to me.)

Discussion

  • Theodore Ts'o
    2011-07-19

    The answer is the null character is considered part of the length. So the kernel code will use an external block for a symlink which is 60 characters or longer.

     
  • thanks to Ted for the quick reply.

    i think it mostly worked because the numeric value that follows directly after the embedded data is always zero for symlinks, so the string still ends up terminated.

    whether the '\0' has to be counted as one of the characters may depend on the case.
    at least other sources state what is allowed in names within the directory structure.

    as strnlen() returns the length of a string without the trailing zero,
    the i_size value must follow the same scheme. -> the sane range for the embedded string length is 1-59.
    anything longer is stored elsewhere.

    (still wondering where the official spec is stored and maintained - i tend to prefer primary sources in such cases.)

     
  • the origin of my problem seems to be located in a tool called genext2fs.
    this tool has not seen much development recently - the last release (zip file only) is from 2007.

    https://sourceforge.net/tracker/?func=detail&aid=3218995&group_id=121652&atid=690992

    what about integrating it into this tools collection? does the licensing fit?
    (assuming it would be rebased onto the headers from this project, plus several other adaptations such as the build system)
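
going by Ted's answer above (the null character counts toward the 60 bytes), the decision on the image generator side could look roughly like the sketch below. this is an illustration under that assumption only: DEMO_I_BLOCK_BYTES, demo_fits_in_inode and demo_embed_target are hypothetical names and are not taken from genext2fs, e2fsprogs or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_I_BLOCK_BYTES 60u  /* 15 block pointers * 4 bytes */

/* 1 if a target of target_len characters plus its zero byte fits in i_block. */
static int demo_fits_in_inode(size_t target_len)
{
    return target_len + 1 <= DEMO_I_BLOCK_BYTES;
}

/*
 * Store the target in the 60-byte area and set i_size to the length
 * without the trailing zero. Returns 1 on success, or 0 if the target
 * is too long and would need an external data block instead.
 */
static int demo_embed_target(uint8_t i_block[DEMO_I_BLOCK_BYTES],
                             uint32_t *i_size, const char *target)
{
    size_t len;

    assert(i_block != NULL && i_size != NULL && target != NULL);
    len = strlen(target);
    if (!demo_fits_in_inode(len))
        return 0;
    memset(i_block, 0, DEMO_I_BLOCK_BYTES);  /* guarantees the trailing zero */
    memcpy(i_block, target, len);
    *i_size = (uint32_t)len;
    return 1;
}
```

with this rule a 59-character target is embedded and a 60-character target is sent to a data block, so the checker's tests on i_size and on the terminating zero would both be satisfied.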