Online Hierarchical Storage Manager / BUGS / #49 OHSM scrub utility

#49 OHSM scrub utility

Milestone: OHSMv1.3

Status: open

Owner: nidhi hada

Labels: Enhancements (20)

Priority: 4

Updated: 2010-03-22

Created: 2010-01-16

Creator: Sandeepksinha

Private: No

OHSM needs a tool to periodically scrub the filesystem and make sure that the blocks allocated to each file come from its correct tier. A file might contain blocks from the wrong tier after a disable/enable sequence. OHSM scrub must have two versions:

1. Normal Scrub:
This just extracts the tier of the file and makes sure that its blocks are from the correct tier. If they are not, relocation is enforced on that file.
We might optimize it further by relocating only the extent range that does not lie in the correct range. But enforcing complete relocation also helps defrag, which is good, and I think it is the easiest and the right way to go.

2. Advanced Scrub:
This will be responsible for making sure that the home_tier is in accordance with the allocation policy on the file system. Normal Scrub is a subset of Advanced Scrub. Just schedule a cron job overnight or over the weekend and it will scrub the filesystem for you.
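
The Normal Scrub check described above could be sketched roughly as follows. This is only an illustration, not OHSM code: `struct extent`, `struct tier_range`, `extent_in_tier` and `count_misplaced` are hypothetical names, and the sketch assumes a tier can be modeled as a single half-open range of physical blocks, which may not match how OHSM actually maps tiers to devices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one physical extent of a file, in filesystem blocks. */
struct extent {
    uint64_t physical; /* first physical block */
    uint64_t length;   /* number of blocks */
};

/* Hypothetical: a tier modeled as one half-open physical block range. */
struct tier_range {
    uint64_t start; /* first block of the tier */
    uint64_t end;   /* one past the last block of the tier */
};

/* Return 1 if the whole extent lies inside the tier, 0 otherwise. */
static int extent_in_tier(const struct extent *e, const struct tier_range *t)
{
    assert(t->start <= t->end);
    return e->physical >= t->start &&
           e->physical <= t->end &&
           e->length <= t->end - e->physical;
}

/* Count the extents that are not fully inside the file's tier. */
static size_t count_misplaced(const struct extent *ext, size_t n,
                              const struct tier_range *tier)
{
    size_t misplaced = 0;
    for (size_t i = 0; i < n; i++) {
        if (!extent_in_tier(&ext[i], tier))
            misplaced++;
    }
    return misplaced;
}
```

A non-zero result would mark the file as a relocation candidate. Whether to relocate the whole file or only the misplaced extents is the trade-off discussed above.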

Bharati is going to work on this.

Discussion

Sandeepksinha - 2010-01-16

Bharati, moving this your way.
You might have to use FIEMAP and write a new ioctl to get the allocation policy and topology cache in userspace.
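
For reference, a minimal sketch of querying a file's extent count with the existing FIEMAP ioctl is shown below. The helper names `fiemap_count_request` and `count_file_extents` are made up for illustration. The proposed new ioctl for the allocation policy and topology cache does not exist yet, so it is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>     /* FS_IOC_FIEMAP */
#include <linux/fiemap.h> /* struct fiemap, FIEMAP_FLAG_SYNC */

/*
 * Fill a FIEMAP request covering the whole file. With fm_extent_count
 * set to 0 the kernel returns no extent records and only reports the
 * number of extents in fm_mapped_extents.
 */
static void fiemap_count_request(struct fiemap *fm)
{
    assert(fm != NULL);
    memset(fm, 0, sizeof(*fm));
    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;
    fm->fm_flags = FIEMAP_FLAG_SYNC; /* sync the file before mapping */
    fm->fm_extent_count = 0;
}

/* Return the number of extents mapped for fd, or -1 if the ioctl fails. */
static long count_file_extents(int fd)
{
    struct fiemap fm;

    fiemap_count_request(&fm);
    if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0)
        return -1;
    return (long)fm.fm_mapped_extents;
}
```

To read the physical block addresses needed for a tier check, the same request would be repeated with a buffer large enough for `fm_mapped_extents` records of `struct fiemap_extent` and `fm_extent_count` set accordingly.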

Sandeepksinha - 2010-01-16

assigned_to: nobody --> bharatialatgi

Sandeepksinha - 2010-01-16

milestone: --> OHSMv1.3

Sandeepksinha - 2010-03-22

Assigning it to Nidhi.
One of the most interesting of all the features.

Sandeepksinha - 2010-03-22

assigned_to: bharatialatgi --> nidhihada

Greg - 2010-03-29

As to relocating the full file if even one extent is in the wrong tier:

I haven't really looked at the ohsm scrub software, but since it is based on e4defrag, I assume it suffers from the same problem below:

For e4defrag I'm pretty sure this is right:

First, assume defrag of a non-sparse 1TB file.

The current code will walk the extent tree and create a single extent
group that covers the full 1TB, then call fallocate to try to get 1TB
of donor blocks. It then compares the number of extents in the original
and the donor. If the donor has fewer, it will swap in the donor
blocks.

It seems much smarter to work on extent-size chunks (or whatever best
fits the kernel's block structure).

i.e.

for (start_block = 0; start_block < max_blocks;
     start_block += max_blocks_in_extent) {

    /* count the extents of the original file in this chunk */
    current_extents = num_extents_in_block_range(start_block,
        start_block + max_blocks_in_extent);

    /* already a single extent, nothing to gain */
    if (current_extents == 1) continue;

    // allocate a sparse file with perfectly aligned donor blocks, as
    // currently required by the kernel
    fallocate(start_block * block_size, max_blocks_in_extent * block_size);

    /* count the extents of the donor file in the same chunk */
    donor_extents = num_extents_in_block_range(start_block,
        start_block + max_blocks_in_extent);

    if (donor_extents < current_extents)
        donate_donor_blocks_to_orig(start_block,
            start_block + max_blocks_in_extent);

}

And in the case of a sparse file, it seems much easier to understand
if the above is called on each logically contiguous set of data
blocks. Seriously, why bother the kernel by making it able to accept
a block range that has holes in it?

It seems reasonable for the kernel to check the block range being
passed in and, if the orig file has a hole in the middle of it,
return an error.
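
As a rough illustration of the sparse-file idea, the sketch below coalesces a file's extents into logically contiguous, hole-free runs, so that each run could be handed to the kernel separately. The types `struct lextent` and `struct run` and the function `logical_runs` are hypothetical, and the extent list is assumed to be sorted by logical block, as FIEMAP returns it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one extent, described by its logical position in the file. */
struct lextent {
    uint64_t logical; /* first logical block */
    uint64_t length;  /* number of blocks */
};

/* Hypothetical: a logically contiguous range of data blocks (no holes). */
struct run {
    uint64_t start;
    uint64_t length;
};

/*
 * Coalesce extents (sorted by logical block) into hole-free runs.
 * Writes at most max_runs entries to out and returns how many were written.
 */
static size_t logical_runs(const struct lextent *ext, size_t n,
                           struct run *out, size_t max_runs)
{
    size_t nruns = 0;

    assert(out != NULL || max_runs == 0);
    for (size_t i = 0; i < n; i++) {
        if (nruns > 0 &&
            out[nruns - 1].start + out[nruns - 1].length == ext[i].logical) {
            /* extent starts right where the previous run ends: extend it */
            out[nruns - 1].length += ext[i].length;
        } else {
            /* a hole (or the first extent): start a new run */
            if (nruns == max_runs)
                break;
            out[nruns].start = ext[i].logical;
            out[nruns].length = ext[i].length;
            nruns++;
        }
    }
    return nruns;
}
```

With this split done in userspace, the kernel side would only ever see ranges without holes and could reject anything else with an error.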

Back to e4defrag, even if the code is not greatly simplified, the
above seems like it would use far less resources than the current
code. Think about a large file that has the first 90% of the blocks
defrag'ed. The above would cause just the tail to be defrag'ed, not
the entire file.
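
The "just the tail" claim can be checked with a small sketch of the chunk selection step. This is not e4defrag code: `num_extents_in_block_range` borrows its name from the pseudocode above but takes an explicit extent list here, and `struct lextent` and `chunks_to_defrag` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one extent, described by its logical position in the file. */
struct lextent {
    uint64_t logical; /* first logical block */
    uint64_t length;  /* number of blocks */
};

/* Count the extents that overlap the logical block range [start, end). */
static size_t num_extents_in_block_range(const struct lextent *ext, size_t n,
                                         uint64_t start, uint64_t end)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (ext[i].logical < end && ext[i].logical + ext[i].length > start)
            count++;
    }
    return count;
}

/* Count the chunks that hold more than one extent and so are worth defragging. */
static size_t chunks_to_defrag(const struct lextent *ext, size_t n,
                               uint64_t max_blocks, uint64_t chunk)
{
    size_t todo = 0;

    assert(chunk > 0);
    for (uint64_t start = 0; start < max_blocks; start += chunk) {
        uint64_t end = start + chunk < max_blocks ? start + chunk : max_blocks;
        if (num_extents_in_block_range(ext, n, start, end) > 1)
            todo++;
    }
    return todo;
}
```

For a 1000-block file with a chunk size of 100, where the first 900 blocks sit in nine aligned 100-block extents and the last 100 blocks are split into four extents, only the final chunk is selected.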

Greg
