Many small files in array?

Help
stelser
2014-08-23
2014-08-23
  • stelser

    stelser - 2014-08-23

    I recently received an error message during a sync attempt advising me that I have insufficient space for the parity file. My array is made up of multiple one-terabyte drives, and many of them are filled to within 1 gigabyte of capacity. One drive has over 60 gigabytes free, and when I added a few files to that drive I suddenly began receiving the error message.

    That confused me, since that drive still had much more free space than the others, until I realized that it is the only one holding many small files: ebooks and mp3 files. There are many tens of thousands of files on that drive. I realize that I need some additional space of 0.128 megabytes for each file on a drive, and that appears to be my problem.
    Can I solve it by spreading those files out over the other drives in the array? In other words, is that extra 0.128 MB of parity space needed for all files on all drives in the array, or only for the files on the drive with the most files?

    Any suggestions?

     
  • Leifi Plomeros

    Leifi Plomeros - 2014-08-23

    To be on the safe side, you should budget up to 0.256 MB of extra space for each file. Specifically, the waste is 256 KB minus the file size for each file smaller than 256 KB.

    If you need 60 GB on a single disk and spread the small files evenly over 4 disks instead, then you will need 15 GB of extra free space on each of the 4 disks.

    The ~0.128 MB average waste holds only for randomly sized files larger than 0.127 MB.
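To put rough numbers on the waste described above, here is a minimal sketch. It assumes a 256 KiB block size and that each file occupies a whole number of blocks, so the unused tail of its last block is lost. The function names and the zero-byte-file handling are illustrative assumptions, not taken from the tool itself.

```python
BLOCK_SIZE = 256 * 1024  # bytes; the 256 KB block size assumed in this thread


def tail_waste(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes left unused in the last block allocated to one file.

    Assumption: an empty file allocates no block and wastes nothing.
    """
    if file_size == 0:
        return 0
    remainder = file_size % block_size
    return 0 if remainder == 0 else block_size - remainder


def total_waste(file_sizes, block_size: int = BLOCK_SIZE) -> int:
    """Sum of the per-file tail waste for every file on one disk."""
    return sum(tail_waste(size, block_size) for size in file_sizes)


# Hypothetical example: 40,000 small files of 20 KiB each on one disk.
small_files = [20 * 1024] * 40_000
print(total_waste(small_files) / 1024**3, "GiB wasted on that disk")
```

Because the waste is counted per disk, dividing the same files evenly over four disks would leave each disk with about a quarter of the total, which matches the 60 GB to 15 GB example above.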

    Edit: The parity disk needs to have more room than the data stored on the data disks. You can solve that by leaving free space on the data disks OR by using a larger parity disk.

    An alternative solution is to use a smaller block size in the config file, but the memory required doubles each time you halve the block size. It would also require rebuilding the parity from scratch.
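The block size trade-off can be illustrated with a short sketch. It assumes that memory use grows roughly in proportion to the number of blocks being tracked, which is consistent with the doubling described above; the exact bytes per block are not stated in this thread, so only block counts are computed. `blocks_needed` is a hypothetical helper name.

```python
def blocks_needed(disk_bytes: int, block_kib: int) -> int:
    """Number of blocks needed to cover a disk of the given size."""
    block_bytes = block_kib * 1024
    return -(-disk_bytes // block_bytes)  # ceiling division


DISK_BYTES = 10**12  # one "1 TB" drive, as in the original post

# Halving the block size doubles the number of blocks to track.
for block_kib in (256, 128, 64):
    print(f"{block_kib:>3} KiB blocks -> {blocks_needed(DISK_BYTES, block_kib):,} blocks")
```

A smaller block size reduces the per-file tail waste, at the cost of proportionally more blocks to keep track of.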

     

    Last edit: Leifi Plomeros 2014-08-23
  • stelser

    stelser - 2014-08-23

    Thank you for the comments. Some of the files (a group of more than 40,000 files) are backups of a sort. They don't change and may never be needed. I suppose one solution would be to compress them into a single file. I could always extract them if ever needed. That drive has always made the sync very slow.
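Packing the static files into one archive could look roughly like the sketch below, which uses Python's standard `tarfile` module. The function name and the paths in the usage comment are hypothetical; any archiver that produces a single file would serve the same purpose.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir: str, archive_path: str) -> int:
    """Pack every file under src_dir into one gzip-compressed tar archive.

    Returns the number of files added. The file list is collected before
    the archive is opened, so the archive never includes itself.
    """
    src = Path(src_dir)
    files = sorted(p for p in src.rglob("*") if p.is_file())
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # Store paths relative to src_dir so extraction recreates the tree.
            tar.add(str(path), arcname=path.relative_to(src).as_posix())
    return len(files)


# Hypothetical usage (paths are placeholders):
# count = archive_directory("/mnt/disk3/old_backups", "/mnt/disk3/old_backups.tar.gz")
```

It is worth verifying that the archive can be read back before deleting the original files.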

     
