#1300 File size in MB is inconsistent when viewing identical copies of pdf in different directories

Milestone: Other
Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2019-06-04
Created: 2019-04-29
Creator: ricker
Private: No

When viewing identical copies of PDF files, the byte count is reliably the same, but the size in MB varies drastically. I think that if both files are located in the same directory, they display the same MB size, but if the files are in different directories, the sizes vary. I haven't been able to figure out whether it depends on differing directory path length, differing directory component count, some other deterministic factor, or whether it's random.

Discussion

  • Christiaan Hofman

    What do you mean by the size in MB? Which display? Can you describe a reproducible set of actions to take that reveals the problem?

     
  • Christiaan Hofman

    Also, are you sure they are identical files, or could it be that one has notes and the other doesn't?

     
  • ricker

    ricker - 2019-04-29

    identical in size and in byte-by-byte comparison. reproducible by just cp in.pdf /some/other/dir/out.pdf.
    edit: the attached images show a size difference of 0.6 MB though the byte count is identical. i have seen cases where the difference is around 10 MB (on a 100MB file).
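
The reproduction step above (copy the file into another directory, then compare) can be sketched in Python. This is only an illustration: the helper name `copy_and_compare` and the placeholder file contents are assumptions, not anything from the original report.

```python
import filecmp
import os
import shutil
import tempfile


def copy_and_compare(src, dst):
    """Copy src to dst, then report both byte counts and a content comparison."""
    shutil.copyfile(src, dst)
    return {
        "src_bytes": os.path.getsize(src),
        "dst_bytes": os.path.getsize(dst),
        # shallow=False compares file contents instead of trusting stat() data.
        "identical": filecmp.cmp(src, dst, shallow=False),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "in.pdf")
        other_dir = os.path.join(root, "some", "other", "dir")
        os.makedirs(other_dir)
        # Placeholder bytes standing in for a real PDF.
        with open(src, "wb") as f:
            f.write(b"%PDF-1.4\n" + b"\0" * 4096)
        print(copy_and_compare(src, os.path.join(other_dir, "out.pdf")))
```

If the byte counts match and `identical` is true, any remaining difference in a displayed MB figure has to come from how the size is measured or formatted, not from the file contents.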

     

    Last edit: ricker 2019-04-29
    • Christiaan Hofman

      Are they on the same volume, or perhaps some on a different kind of disk?

      We just show the sizes that we get from the system. Note, though, that the two numbers do not represent the same measure: one is the physical file size and the other is the logical file size.

       
      • ricker

        ricker - 2019-04-29

        The files are on the same disk/volume, just in different subdirectories. The size is consistently identical when opening both files in Preview.

         
        • Christiaan Hofman

          And they both have the same notes?

          And does Finder's info window report the same sizes?

          I don't know what sizes Preview reports.

          But I still don't see what is necessarily a bug. If the sizes are reported differently, they probably are different. A bit surprising that this is so large, but I don't know much about how efficient the file system is.

           
          • Christiaan Hofman

            Perhaps there are some permission issues with the directory containing the file that reports the lower size. If we cannot get the full size numbers, we fall back on less accurate methods to get an approximate size.

             
          • ricker

            ricker - 2019-04-29

            And they both have the same notes?

            files are identical bytewise (e.g. cp in.pdf out.pdf)

            And does Finder's info window report the same sizes?

            yes, all Apple tools show 4.6 MB for both files. ls -lh and the Firefox PDF viewer show 4.4 MB. That might be a mebibyte vs. megabyte thing though. (Edit: this is confirmed: 4.4 * 1.049 = 4.6) Nonetheless, every tool shows the same file size for both files.

            I don't know what sizes Preview reports.

            Preview reports the same size as Finder (4.6 MB)

            But I still don't see what is necessarily a bug. If the sizes are reported differently, they probably are different. A bit surprising that this is so large, but I don't know much about how efficient the file system is.

            If the files are bytewise identical and every other tool shows the identical file size...
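
The mebibyte vs. megabyte arithmetic mentioned above (4.4 * 1.049 = 4.6) can be checked with a short sketch. The function names are illustrative assumptions, and since the thread never states the exact byte count, the example starts from the 4.4 figure itself.

```python
MB = 1000 ** 2   # megabyte, decimal
MIB = 1024 ** 2  # mebibyte, binary


def bytes_to_mb(n_bytes):
    """Size in decimal megabytes."""
    return n_bytes / MB


def bytes_to_mib(n_bytes):
    """Size in binary mebibytes."""
    return n_bytes / MIB


if __name__ == "__main__":
    # One MiB is 1.048576 MB, which rounds to the 1.049 factor quoted above.
    print(round(MIB / MB, 3))
    # A file of 4.4 MiB expressed in decimal megabytes.
    print(round(bytes_to_mb(4.4 * MIB), 1))
```

The same byte count therefore reads about 5% higher when expressed in decimal MB, which accounts for the 4.4 vs. 4.6 gap between tools but not for a difference between two identical copies.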

             

            Last edit: ricker 2019-04-29
            • Christiaan Hofman

              You should realize that there is no such thing as "the file size". There are many ways to measure a file's size; I can get at least 4. We show two numbers representing two different ways to measure it. In particular, the MB size is not the bytewise size of the content; it is the physical size the file occupies on the disk (i.e. the space reserved for it), which can be larger. So the two don't need to be the same. And, again, I do not know which of these sizes Preview or Finder report. So I do not necessarily see a bug. Again, is there a difference between the configuration of the two files, like the permissions of the files and/or the containing folders?
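
The distinction between logical and physical size described above can be illustrated with POSIX `stat` data. This is a minimal sketch and not the API Skim uses: `file_sizes` is a hypothetical helper, and it assumes a Unix-like system where `st_blocks` counts 512-byte units.

```python
import os
import tempfile


def file_sizes(path):
    """Return (logical_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    logical = st.st_size            # length of the file's content in bytes
    allocated = st.st_blocks * 512  # space reserved on disk (Unix only)
    return logical, allocated


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 5000)
        path = f.name
    try:
        logical, allocated = file_sizes(path)
        print("logical:", logical, "allocated:", allocated)
    finally:
        os.remove(path)
```

The two numbers usually differ because space is allocated in whole blocks, and features such as sparse files, compression, or copy-on-write clones can push the allocated figure in either direction.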

               
  • ricker

    ricker - 2019-04-29

    but I don't know much about how efficient the file system is.

    Speaking of the filesystem: after upgrading to Mojave, my fs is now APFS. After a little googling I found this: https://forums.developer.apple.com/thread/103162

    which says that FSGetCatalogInfo, which is now deprecated, might not work properly with APFS.

     
    • Christiaan Hofman

      This is about the inode, and we don't use the inode. I don't directly see why the file sizes would be a problem; they certainly are not too big here. And the ones we use do not have a modern replacement AFAICS.

       
  • Christiaan Hofman

    • status: unread --> open
     
  • Christiaan Hofman

    Using a different API to calculate the file sizes.

     
  • Christiaan Hofman

    • Status: open --> closed-fixed
     
