How does fsck handle these cases?

2005-01-08
2012-11-28
  • FengCHeng Lu
    2005-01-08

    Hello all,
        When fsck checks a partition, how does it handle the following cases?
    1. All block pointers of an inode are NULL, but inode->i_size is NOT zero.
    2. inode->i_size is NOT zero, and some block pointers of the inode are NULL while other block pointers of the same inode are NOT NULL, for example:
        inode->i_block[0] is NULL;
        inode->i_block[1..4] is NOT NULL;
        inode->i_block[5, 6] is NULL;
        inode->i_block[7..10] is NOT NULL
        ....................

    Any answers are welcome!
    Thanks!

     
    • If you want to know what fsck actually does, you can find it in the fsck source code (I haven't looked, so I can't tell you).  So I'm going to answer a slightly different question, i.e. what SHOULD fsck do in those cases?

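      I believe both cases are perfectly legal: they represent sparse files.  The second case is certainly a legal sparse file.  You can create such a file like this (BLK_SIZE is the filesystem block size, buf holds at least 4 blocks of data):
      fd = open("/tmp/foo", O_CREAT | O_TRUNC | O_RDWR, 0644);
      lseek(fd, 1 * BLK_SIZE, SEEK_SET);   /* leave block 0 as a hole */
      write(fd, buf, 4 * BLK_SIZE);        /* fill blocks 1..4 */
      lseek(fd, 7 * BLK_SIZE, SEEK_SET);   /* leave blocks 5 and 6 as holes */
      write(fd, buf, 4 * BLK_SIZE);        /* fill blocks 7..10 */
      close(fd);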

      The first case may be somewhat questionable, but I think you can produce it like this:

      fd = open("/tmp/foo", O_CREAT | O_TRUNC | O_RDWR, 0644);
      ftruncate(fd, 1024*1024);
      close(fd);

      Maybe you'll have to insert the following two lines between open and ftruncate:

      lseek(fd, 1024*1024, SEEK_SET);
      write(fd, buf, sizeof(buf));

      If memory serves me right, POSIX states that an ftruncate to a larger length extends the file and that the extended area reads back as zeros; the file offset itself is not changed.  The only way to reflect that is to set the inode's i_size to 1024*1024 while all block pointers in the inode remain 0.
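The first case can be checked with a short experiment. This is a minimal sketch, assuming a POSIX system; the helper name `make_hole_only_file` is hypothetical and not part of any library. It extends an empty file with ftruncate and reports the resulting size, the allocated space, and the file offset afterward.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` as a file that consists only of a hole.
 * On success returns 0 and stores the logical size, the st_blocks count
 * (512-byte units on Linux) and the file offset as seen after ftruncate(). */
int make_hole_only_file(const char *path, off_t length,
                        off_t *size, long long *blocks512, off_t *offset)
{
    struct stat st;
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0644);

    if (fd < 0)
        return -1;

    /* Grow the empty file to `length` bytes without writing any data. */
    if (ftruncate(fd, length) != 0 || fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    *size = st.st_size;
    *blocks512 = (long long)st.st_blocks;
    *offset = lseek(fd, 0, SEEK_CUR);   /* query the current position only */

    close(fd);
    return 0;
}
```

Calling it with a length of 1024*1024 should report a size of 1 MB. On a filesystem that supports holes the block count is expected to stay at or near zero, although that part depends on the filesystem.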

       
    • FengCHeng Lu
      2005-01-11

          Thanks for your kind answer!
          I wrote an example to test the behaviour of the 2 cases. The partition block size is 1024.
          Example 1 code (run with the argument pairs 3,6; 8,50; 8,2048; 1024,2048):
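             #include <fcntl.h>
             #include <stdlib.h>
             #include <unistd.h>

             #define BLK_SIZE 1024

             int main(int argc, char* argv[])
             {
                 int fd;
                 int blkNbr1;
                 int blkNbr2;

                 if (argc != 3)
                     return 0;
                 blkNbr1 = atoi(argv[1]);
                 blkNbr2 = atoi(argv[2]);

                 fd = open("foo", O_CREAT | O_TRUNC | O_RDWR, 0644);
                 if (fd < 0)
                     return 1;

                 /* write 5 bytes at the start of block blkNbr1 */
                 lseek(fd, (off_t)blkNbr1 * BLK_SIZE, SEEK_SET);
                 write(fd, "hello", 5);

                 /* write 5 bytes at the start of block blkNbr2 */
                 lseek(fd, (off_t)blkNbr2 * BLK_SIZE, SEEK_SET);
                 write(fd, "world", 5);

                 close(fd);

                 return 0;
             }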
            The program creates the file foo; different parameters give foo a different size,
            and the space used by foo also differs. These cases are legal.
            But I cannot find a way to compute how much space foo will use.
            The test results are:
            Test  blkNbr1  blkNbr2  space used by foo
            1     3        6        7k / 7 blocks
            2     8        50       8k / 8 blocks
            3     8        2048     7k / 7 blocks
            4     1024     2048     8k / 8 blocks
           
            For test 1, only i_block[3] and i_block[6] are NOT zero; the other block pointers
            are all zero. So only 2 blocks should be used, but the test result is that 7 blocks are used.
            Why?

            Test 2 uses one more block than test 1; that block is for the block pointer table.

            Test 3 uses one block less than test 2. Test 4 uses one block more than test 3.
            Why?

            Could you tell me the reason?
           
            More test results:
            blkNbr1      blkNbr2     space used by foo     
             3            0           4k
             3            1           4k
             3            2           4k
             3            3           4k
             3            4           5k
             3            5           6k
             3            6           7k
             3            7           8k
             3            8           9k
             3            9           6k
             3            10          7k
             3            11          8k
             3            12          6k
             3            13          7k
             3            14          8k
             3            15          9k                    
             3            16          6k        
             3            17          7k        
             3            18          8k        
             3            19          9k        
             ...........................

       
      • You don't say how you determined the number of blocks used.  You also don't say what you did between the different test runs, i.e. did you "rm" the file, or did you simply count on O_TRUNC to truncate the file first?

        I'm not very familiar with ext3 block allocation policy, so it is possible that it tries to pre-allocate some space under some circumstances.  But this is just guessing on my part.

        I've seen various utilities such as ls, du, df, all return different space usage results (I don't know why).  In my opinion debugfs is the only reliable method to see what's really going on.  You'll probably have to unmount the filesystem to be sure that everything relevant has been flushed to disk before you look at it with debugfs.  You can use the debugfs "stat" and "mi" commands to see exactly what's going on with your file's inode, and which block pointers are being used.
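For a quick per-file check that does not require unmounting, the inode's own counters can be read with stat(2). This is a minimal sketch, assuming Linux, where st_blocks is counted in 512-byte units regardless of the filesystem block size; the helper name `file_usage` is hypothetical. As far as I know, on ext2/ext3 the count also includes indirect (pointer table) blocks, so it is not a pure data-block count, and debugfs remains the way to see which block pointers are set.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: report the logical size (i_size) and the space
 * actually allocated to `path`. Assumes st_blocks uses 512-byte units,
 * which is the Linux convention. Returns 0 on success, -1 on error. */
int file_usage(const char *path, long long *logical, long long *allocated)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    *logical = (long long)st.st_size;
    *allocated = (long long)st.st_blocks * 512;
    return 0;
}
```

For a sparse file, `allocated` is expected to be smaller than `logical`; for a small dense file it is usually larger, because space is handed out in whole blocks.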

         
    • FengCHeng Lu
      2005-01-12

      Hello
          I am sorry I didn't give you a detailed description. In yesterday's test, I used the df command to watch the space usage. After every test, I unmounted the partition and mounted it again, then checked the space usage. I created every file with the O_TRUNC option.

      Today I used debugfs and stat to run the same test. The test results:
      Test  blkNbr1  blkNbr2  blocksUsed  i_block[]           blocks used by pointer table
      1     3        6        7           0-6                 0
      2     8        50       8           8-11, 48-50         1
      3     8        2048     7           8-11, 2048          2
      4     8        2047     10          8-11, 2044-2047     2
      5     8        2046     9           8-11, 2044-2046     2
      6     1024     2048     8           1024-1027, 2048     3

      As you said, the filesystem tries to pre-allocate some space under some circumstances. But sometimes it does not pre-allocate anything. For example, in test 3 no pre-allocation happens when block 2048 is allocated.

      Do you know what policy is used for the pre-allocation?
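The pointer-table column in the table above follows from the classic ext2/ext3 block map: 12 direct pointers in the inode, then one single-, one double- and one triple-indirect pointer, with block_size / 4 pointers per indirect block. The sketch below counts the distinct indirect blocks needed to map a set of logical block numbers. The function and struct names are hypothetical, and the code assumes 32-bit block pointers and at most 16 input blocks; it does not check the upper limit of the triple-indirect range.

```c
#include <stddef.h>

#define NDIR_BLOCKS 12   /* direct pointers held in the inode itself */

/* One indirect block, identified by its role in the tree and its position. */
struct map_block {
    int kind;                   /* 1: single root, 2: double root, 3: double leaf,
                                   4: triple root, 5: triple middle, 6: triple leaf */
    unsigned long long index;   /* which block of that kind */
};

/* Add (kind, index) to the set unless it is already present; return new size. */
static size_t add_unique(struct map_block *set, size_t n,
                         int kind, unsigned long long index)
{
    for (size_t i = 0; i < n; i++)
        if (set[i].kind == kind && set[i].index == index)
            return n;
    set[n].kind = kind;
    set[n].index = index;
    return n + 1;
}

/* Count the distinct indirect blocks needed to map lblk[0..count-1].
 * Returns -1 if more than 16 logical blocks are passed in. */
int count_pointer_blocks(const unsigned long long *lblk, size_t count,
                         unsigned block_size)
{
    struct map_block set[48];                 /* at most 3 per logical block */
    size_t n = 0;
    unsigned long long p = block_size / 4;    /* pointers per indirect block */

    if (count > 16)
        return -1;

    for (size_t i = 0; i < count; i++) {
        unsigned long long b = lblk[i];

        if (b < NDIR_BLOCKS)                  /* direct: no extra block */
            continue;
        b -= NDIR_BLOCKS;
        if (b < p) {                          /* single indirect */
            n = add_unique(set, n, 1, 0);
            continue;
        }
        b -= p;
        if (b < p * p) {                      /* double indirect */
            n = add_unique(set, n, 2, 0);
            n = add_unique(set, n, 3, b / p);
            continue;
        }
        b -= p * p;                           /* triple indirect */
        n = add_unique(set, n, 4, 0);
        n = add_unique(set, n, 5, b / (p * p));
        n = add_unique(set, n, 6, b / p);
    }
    return (int)n;
}
```

With a 1024-byte block size this gives 0 for blocks {3, 6}, 1 for {8, 50}, 2 for {8, 2048} and 3 for {1024, 2048}, which matches the last column of the table. It explains only the pointer-table blocks, not the extra data blocks next to each written block.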

       
      • I don't know ext3 preallocation policy.  Sorry.

        However, I have a feeling that what you are observing has more to do with VM page alignment, rather than preallocation.  The size of your vm page is 4 Kbytes, while the size of your filesystem block is 1 Kbytes.  Placement of your buffers is not obvious since you don't malloc them, but rather you use string constants.  I suggest you malloc an 8 Kbyte buffer, and then use a 1-Kbyte subset of that buffer on various 1K boundaries to see how that affects block allocation at ext3 level.
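The suggested experiment could look roughly like the sketch below. The helper name `write_block_from_slot` and the idea of numbering the 1 KB pieces of the buffer as slots 0..7 are assumptions made for illustration. It writes one 1 KB piece of a malloc'ed 8 KB buffer to a chosen file block, so the same file block can be written from different positions inside the buffer.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK_SIZE 1024   /* filesystem block size used in the tests */

/* Hypothetical helper: write one BLK_SIZE piece, taken from slot
 * `buf_slot` (0..7) of a freshly malloc'ed 8 KB buffer, to block `blk`
 * of the open file `fd`. Returns 0 on success, -1 on error. */
int write_block_from_slot(int fd, long blk, int buf_slot)
{
    char *buf;
    ssize_t n;

    if (buf_slot < 0 || buf_slot > 7)
        return -1;

    buf = malloc(8 * BLK_SIZE);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', 8 * BLK_SIZE);

    if (lseek(fd, (off_t)blk * BLK_SIZE, SEEK_SET) == (off_t)-1) {
        free(buf);
        return -1;
    }

    /* The source address moves in 1 KB steps; the file offset does not. */
    n = write(fd, buf + (size_t)buf_slot * BLK_SIZE, BLK_SIZE);
    free(buf);
    return n == BLK_SIZE ? 0 : -1;
}
```

Running it for the same `blk` with each `buf_slot` from 0 to 7, and checking the inode with debugfs after each run, would show whether the position of the source buffer has any effect on which blocks get allocated.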

         
    • FengCHeng Lu
      2005-01-13

      I did the test as you said: I allocated an 8k buffer and wrote 1k of data. The test results are the same. But if I write 4k of data, there are no pre-allocated blocks.