#495 Properties attribute "size on disk" is incorrect

1.0
closed-fixed
PCMan
None
4
2012-05-22
2011-10-25
Brad Conte
No

The "properties" page for file(s) reports an incorrect total size for the property "Size on Disk". For example, a 513 byte file reports as being 8192 bytes on disk.

(Note that there is a similar defect already addressed in bug #2964807, but that was closed a year and a half ago.)

This is NOT a GIO bug.

Two functions compute "size on disk", both located in "job/fm-deep-count-job.c": one is POSIX-based (deep_count_posix) and one is GIO-based (deep_count_gio). On my system the POSIX function was always used and the GIO one never was. I don't know whether the GIO one has the same problem, since it was never executed in my tests, but I suspect it does because GIO was blamed in the previous defect.

The bug is in these lines of code:

job->total_size += (goffset)st.st_size;
job->total_block_size += (st.st_blocks * st.st_blksize);

The stat() system call populates the "st" structure with data from the file's inode. Two of the struct members are used here:
- st_blocks reports the number of 512-byte blocks allocated to the file.
- st_blksize reports the preferred block size for I/O on the filesystem that the file is on.

Those two values are multiplied to obtain the total size on disk, but they are unrelated to each other: st_blocks is counted in 512-byte units regardless of what st_blksize reports. You can confirm this via the stat man page: http://www.kernel.org/doc/man-pages/online/pages/man2/stat.2.html . This is why the size on disk is wrong: the block count is in 512-byte units, but most modern filesystems use 4096-byte blocks, so the product overstates the allocation.
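To make the unit mismatch concrete, here is a minimal sketch that contrasts the two calculations. The helper names (allocated_bytes, mixed_unit_product) and the ST_BLOCKS_UNIT constant are made up for illustration and are not from the PCManFM source. It assumes st_blocks is counted in 512-byte units, as the Linux man page states; other systems may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Assumption: st_blocks is counted in 512-byte units (true on Linux). */
#define ST_BLOCKS_UNIT 512LL

/* Bytes allocated to the file's data: block count times the fixed unit. */
static inline long long allocated_bytes(const struct stat *st)
{
    assert(st != NULL);
    return (long long)st->st_blocks * ST_BLOCKS_UNIT;
}

/* The calculation quoted in this report: it multiplies a count of
 * 512-byte units by the preferred I/O block size, which is unrelated. */
static inline long long mixed_unit_product(const struct stat *st)
{
    assert(st != NULL);
    return (long long)st->st_blocks * (long long)st->st_blksize;
}
```

For a struct stat with st_blocks = 8 and st_blksize = 4096, the first helper gives 4096 bytes while the second gives 32768, eight times too large.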

So for deep_count_posix, the bug is definitely in the PCManFM code. The fix would be to either:

a) find a system call that provides the total size of the file on disk (this is not available directly through stat()) - this seems like the "right" thing to do
b) manually round the file size up to the next multiple of the filesystem I/O block size - this seems like a potentially hackish thing to do

On my own system I made some changes using option (b) and the "properties" dialog showed a more reasonable value for "Size on Disk".
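Option (b) amounts to something like the sketch below. This is not the actual change I made, just the rounding idea written out; round_up_to_block is a hypothetical helper name. Note that it only estimates the allocation: sparse or preallocated files can occupy less or more space than the rounded size.

```c
#include <assert.h>

/* Round size up to the next multiple of block_size.
 * A size that is already a multiple is returned unchanged. */
static inline long long round_up_to_block(long long size, long long block_size)
{
    assert(size >= 0 && block_size > 0);
    return (size + block_size - 1) / block_size * block_size;
}
```

With the 513 byte file from the example and a 4096 byte block size, this yields 4096.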

For the cousin function, deep_count_gio, if it has the same bug, I would imagine that a change needs to be made there as well. Look at the GIO attributes it uses from the GFileInfo API (see here: http://developer.gnome.org/gio/2.29/GFileInfo.html#G-FILE-ATTRIBUTE-UNIX-BLOCK-SIZE:CAPS), and note that they look like plain wrappers around the same stat() attributes discussed earlier. I bet that's what they are. If so, the same logic applies, and for this method it isn't a GIO bug either; it's just GIO providing poorly documented data.

However, GIO provides another file attribute: G_FILE_ATTRIBUTE_STANDARD_ALLOCATED_SIZE. The description for this attribute states that it's for "getting the amount of disk space that is consumed by the file", which sounds right. I don't know how this works or if it's correct, but the documentation, albeit brief, sounds encouraging.

So for deep_count_gio, the bug is still in PCManFM. The fix would be to either:

a) use G_FILE_ATTRIBUTE_STANDARD_ALLOCATED_SIZE or a different GIO interface to get the size on disk - this seems like the "right" thing to do
b) manually round the file size up to the next multiple of the filesystem I/O block size reported by G_FILE_ATTRIBUTE_UNIX_BLOCK_SIZE - this seems like a potentially hackish thing to do

I wish I knew what other file managers were doing, but I haven't taken the time to investigate. The bottom line, though, is that this bug definitely lies in PCManFM.

Hopefully that's useful info.

Discussion

  • Brad Conte

    Brad Conte - 2011-10-25

    OK, I just did some tests using G_FILE_ATTRIBUTE_STANDARD_ALLOCATED_SIZE, and it appears to produce the same effect as rounding up to the next largest block_size. It seems like the right GIO call to use.

     
  • PCMan

    PCMan - 2011-11-07
    • priority: 5 --> 6
    • assigned_to: nobody --> pcmanx
     
  • Anonymous

    Anonymous - 2011-11-11

    First of all, the line:

    job->total_block_size += (st.st_blocks * st.st_blksize);

    is wrong and it should be something like:

    job->total_block_size += (st.st_blocks << 9);

    and then it would give the size in bytes allocated for the file's data in the filesystem (given the filesystem is returning the right values) and this is the largest value you can get from the filesystem about the file's size. It reflects the number of bytes allocated for the file's data.

    The other thing which takes space on a filesystem is the file's metadata, i.e. the inode's data, which is __internal__ and not available to query from userspace via normal calls. Space for inodes is usually preallocated when the filesystem is created, so it's not directly connected with actual files anyway. It usually consumes the same amount of space even if no files exist, unless the filesystem has some kind of dynamic allocation of inodes, but that would make an FS work slowly.
    In order to obtain this kind of value, you would have to use a tool for the specific filesystem in question. The filesystem, when queried for free space, should also subtract the inode table's size from the device's size first, so I honestly don't know why anyone would want to know these values.

     

    Last edit: Anonymous 2015-04-19
  • Brad Conte

    Brad Conte - 2011-11-11

    > job->total_block_size += (st.st_blocks * st.st_blksize);
    >
    >is wrong and it should be something like:
    >
    >job->total_block_size += (st.st_blocks << 9);

    It's not wrong; they are equivalent in the case that st.st_blksize = 512, which, as far as I know, is always. However, its advantage is that it's flexible should st.st_blksize ever not equal 512. It's the "right way" to do the calculation.

    > I honestly don't know why anyone would want to know them.

    I think you're right, I don't know of a file manager that counts the metadata with the file's size. Windows doesn't, and no other Linux FM I know of does.

     
  • PCMan

    PCMan - 2011-11-18

    this is a rather complicated issue. let's solve it later and not block 1.0 for this.

     
  • PCMan

    PCMan - 2011-11-18
    • priority: 6 --> 4
    • milestone: --> 2297538
    • status: open --> open-later
     
  • PCMan

    PCMan - 2012-05-22

    The bug was already fixed in git.

     
  • PCMan

    PCMan - 2012-05-22
    • milestone: 2297538 --> 1.0
    • status: open-later --> closed-fixed
     
