From: Brian F. <bf...@re...> - 2012-09-05 18:09:40
(Modified CC list)

On 07/03/2012 11:53 AM, Pavel Emelyanov wrote:
> Hi everyone.

Hi Pavel,

First off, thanks for this work, and sorry for the late response
relative to your post. We had a chance to play around with this a bit
and throw some performance tests at a gluster volume, and the result was
positive (something near a 4x improvement in sequential 4k writes). This
patch set seems rather promising. Some questions from review and from
playing around with this...

...

> The writeback feature is per-connection and is explicitly configurable
> at the init stage (is it worth making it CAP_SOMETHING protected?)
> When the writeback is turned ON:
>
> * still copy writeback pages to a temporary buffer when sending a
>   writeback request and finish the page writeback immediately
>
> * make the kernel maintain the inode's i_size to avoid frequent i_size
>   synchronization with the user space

Would this pose a problem for a filesystem in which the size of the
inode can change remotely (i.e., not visible to the local instance of
fuse)? I haven't tested this, but it seems like it could be an issue
based on the implementation. Having a look at NFS, it appears to include
logic to update the kernel i_size from the remote side if the client
side has no pending writes. Perhaps we could do something similar here,
so that we account for cached writes but can also pick up remote changes
to the inode?

Unrelated to that, we noticed in some tests that fuse was issuing read
requests beyond EOF (presumably in write_begin via fuse_prepare_write())
on sub-page-size appending writes (e.g., dd if=/dev/zero of=file bs=1k).
This seems like a minor bug. Again, taking a look at NFS, it includes a
check of the page index against the file size to skip the read. Perhaps
something similar is necessary here as well?

Thanks again. We look forward to checking out the next version. :)

Brian

> * take NR_WRITEBACK_TEMP into account when making the
>   balance_dirty_pages decision.
> This protects us from having too many dirty pages on FUSE.
>
> The provided patchset survives the fsx test. Performance measurements
> are not yet all finished, but the mentioned copying of a huge file
> becomes noticeably faster even on machines with little RAM and doesn't
> make the system get stuck (the dirty-pages balancer does its work OK).
> Applies on top of v3.5-rc4.
>
> We are currently exploring this with our own distributed storage
> implementation, which is heavily oriented toward storing big blobs of
> data with extremely rare metadata updates (virtual machines' and
> containers' disk images). With the existing cache policy, a typical
> usage scenario -- copying a big VM disk into a cloud -- takes way too
> much time to complete, much longer than if it were simply scp-ed over
> the same network. The write-back policy (as I mentioned) noticeably
> improves this scenario. Kirill (in Cc) can share more details about
> the performance and the storage concepts if required.
>
> Thanks,
> Pavel
>
> _______________________________________________
> fuse-devel mailing list
> fus...@li...
> https://lists.sourceforge.net/lists/listinfo/fuse-devel