
#49 OHSM scrub utility

OHSMv1.3
open
4
2010-03-22
2010-01-16
No

OHSM needs a tool that periodically scrubs the filesystem to make sure that the blocks allocated to each file come from its correct tier. A file may end up with blocks from the wrong tier after a disable/enable sequence. OHSM scrub must have two versions:

1. Normal Scrub:
This just looks up the tier of the file and checks that its blocks are from that tier. If they are not, it may enforce relocation of that file.
We might optimize it further by relocating only the extent ranges that do not lie in the correct tier. But enforcing complete relocation also helps defrag, which is good, and I think it should be the easiest and the right way to go ahead.

2. Advanced Scrub:
This will be responsible for making sure that the home_tier is in accordance with the allocation policy on the file system. Normal Scrub is a subset of Advanced Scrub. Just put it in a cron job overnight or over the weekend and it will scrub the filesystem for you.
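
The Normal Scrub check described above can be sketched roughly as follows. This is only an illustration, not OHSM code: the `tier_range` and `file_extent` structures and both function names are hypothetical, and the sketch assumes a tier can be described by a single contiguous physical block range, which may not hold for a real OHSM topology.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: OHSM's real structures are not shown in this ticket. */
struct tier_range {
    uint64_t first_block;    /* first physical block of the tier (inclusive) */
    uint64_t last_block;     /* last physical block of the tier (inclusive) */
};

struct file_extent {
    uint64_t physical_block; /* first physical block of the extent */
    uint64_t block_count;    /* length in blocks, must be > 0 */
};

/* Returns 1 if the whole extent lies inside the tier, 0 otherwise. */
int extent_in_tier(const struct file_extent *e, const struct tier_range *t)
{
    assert(e->block_count > 0);
    uint64_t last = e->physical_block + e->block_count - 1;
    return e->physical_block >= t->first_block && last <= t->last_block;
}

/*
 * Normal Scrub check: count the extents that are not fully inside the
 * file's home tier. A non-zero result means the file is a candidate
 * for relocation.
 */
size_t count_misplaced_extents(const struct file_extent *ext, size_t n,
                               const struct tier_range *home)
{
    size_t misplaced = 0;
    for (size_t i = 0; i < n; i++) {
        if (!extent_in_tier(&ext[i], home))
            misplaced++;
    }
    return misplaced;
}
```

With full relocation as proposed above, the scrubber only needs to know whether the count is non-zero. The per-extent result would matter only for the optimized variant that relocates just the misplaced ranges.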

Bharati is gonna work on this.

Discussion

  • Sandeepksinha

    Sandeepksinha - 2010-01-16

    Bharati moving your way.....
    You might have to use FIEMAP and write a new ioctl to get the allocation policy and topology cache in userspace...
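
A minimal sketch of reading a file's extents from userspace with FIEMAP, assuming a Linux system with the `linux/fiemap.h` header. The function name `dump_extents` and the batch size are made up for illustration, and the new ioctl for the allocation policy and topology cache is not shown because its interface is not defined in this ticket.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define MAX_EXTENTS 64  /* arbitrary batch size for this sketch */

/*
 * Print up to MAX_EXTENTS extents of a file.
 * Returns the number of extents mapped, or -1 on error.
 */
int dump_extents(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t size = sizeof(struct fiemap) +
                  MAX_EXTENTS * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, size);
    if (fm == NULL) {
        close(fd);
        return -1;
    }

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;  /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data before mapping */
    fm->fm_extent_count = MAX_EXTENTS;

    int n = -1;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0) {
        n = (int)fm->fm_mapped_extents;
        for (int i = 0; i < n; i++) {
            /* fe_logical, fe_physical and fe_length are byte offsets */
            printf("logical=%llu physical=%llu length=%llu\n",
                   (unsigned long long)fm->fm_extents[i].fe_logical,
                   (unsigned long long)fm->fm_extents[i].fe_physical,
                   (unsigned long long)fm->fm_extents[i].fe_length);
        }
    }

    free(fm);
    close(fd);
    return n;
}
```

FIEMAP reports byte offsets, so a scrubber would divide `fe_physical` by the filesystem block size before comparing it with tier block ranges. A file with more than `MAX_EXTENTS` extents would need repeated calls with `fm_start` advanced past the last extent returned.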

     
  • Sandeepksinha

    Sandeepksinha - 2010-01-16
    • assigned_to: nobody --> bharatialatgi
     
  • Sandeepksinha

    Sandeepksinha - 2010-01-16
    • milestone: --> OHSMv1.3
     
  • Sandeepksinha

    Sandeepksinha - 2010-03-22

    Assigning it to Nidhi.....!!!
    One of the most interesting of all the features.

     
  • Sandeepksinha

    Sandeepksinha - 2010-03-22
    • assigned_to: bharatialatgi --> nidhihada
     
  • Greg

    Greg - 2010-03-29

    As to relocating the full file if even one extent is in the wrong tier:

    I haven't really looked at the ohsm scrub software, but since it is based on e4defrag, I assume it suffers from the same problem below:

    For e4defrag I'm pretty sure this is right:

    First, assume defrag of a non-sparse 1TB file.

    The current code will walk the extent tree and create a single extent
    group that covers the full 1TB, then call fallocate to try to get 1TB
    of donor blocks. It then compares the number of extents in the original
    and the donor. If the donor has fewer, it will swap in the donor
    blocks.

    It seems much smarter to work on extent-size chunks (or whatever best
    fits the kernel's block structure).

    i.e.

    for (start_block = 0; start_block < max_blocks;
         start_block += max_blocks_in_extent) {

        end_block = start_block + max_blocks_in_extent;

        // extents of the original file in this chunk
        current_extents = num_extents_in_block_range(start_block, end_block);

        // a single extent means this chunk is already contiguous
        if (current_extents == 1)
            continue;

        // allocate a sparse file with perfectly aligned donor blocks,
        // as currently required by the kernel
        fallocate(start_block * block_size, max_blocks_in_extent * block_size);

        // extents of the donor file in the same chunk
        donor_extents = num_extents_in_block_range(start_block, end_block);

        // swap only if the donor is less fragmented than the original
        if (donor_extents < current_extents)
            donate_donor_blocks_to_orig(start_block, end_block);
    }

    And in the case of a sparse file, it seems much easier to understand
    if the above is called on each logically contiguous set of data
    blocks. Seriously, why bother the kernel by making it accept
    a block range that has holes in it?

    It seems reasonable for the kernel to check the block range being
    passed in and, if the orig file has a hole in the middle of it,
    return an error.

    Back to e4defrag, even if the code is not greatly simplified, the
    above seems like it would use far less resources than the current
    code. Think about a large file that has the first 90% of the blocks
    defrag'ed. The above would cause just the tail to be defrag'ed, not
    the entire file.

    Greg
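
The chunk-by-chunk walk proposed in the comment above can be illustrated on an in-memory extent list. This is a sketch under stated assumptions rather than e4defrag code: the `lextent` structure, the five-argument form of `num_extents_in_block_range`, and `count_fragmented_chunks` are hypothetical, and the donor allocation and block swap are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one physically contiguous extent. */
struct lextent {
    uint64_t logical_block;  /* first logical block covered */
    uint64_t block_count;    /* number of blocks covered */
};

/* Count the extents that overlap the logical block range [start, end). */
size_t num_extents_in_block_range(const struct lextent *ext, size_t n,
                                  uint64_t start, uint64_t end)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t e_start = ext[i].logical_block;
        uint64_t e_end = e_start + ext[i].block_count;
        if (e_start < end && e_end > start)
            count++;
    }
    return count;
}

/*
 * Walk the file in fixed-size chunks and count the chunks that hold more
 * than one extent. Only those chunks would need donor blocks.
 */
size_t count_fragmented_chunks(const struct lextent *ext, size_t n,
                               uint64_t max_blocks, uint64_t chunk_blocks)
{
    assert(chunk_blocks > 0);
    size_t fragmented = 0;
    for (uint64_t start = 0; start < max_blocks; start += chunk_blocks) {
        uint64_t end = start + chunk_blocks;
        if (end > max_blocks)
            end = max_blocks;  /* the last chunk may be short */
        if (num_extents_in_block_range(ext, n, start, end) > 1)
            fragmented++;
    }
    return fragmented;
}
```

For a 400-block file whose first 300 blocks form one extent and whose tail is split into three extents, a 100-block chunk size flags only the last chunk, which matches the point about defragmenting just the tail.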

     
