From: Maxim P. <mpa...@pa...> - 2013-05-29 14:08:00
|
Hi Brian, 05/29/2013 03:51 PM, Brian Foster wrote: > Hi guys, > > I noticed an issue with the following commit while testing the async dio > stuff and wanted to get some thoughts: > > https://git.kernel.org/cgit/linux/kernel/git/mszeredi/fuse.git/commit/fs/fuse/file.c?id=439ee5f0c5080d4fd15fda0c5bbee1fb3a57894e > > The problem is that direct I/O must be sector aligned (offset and size) > and this patch can effectively turn a valid direct I/O read request into > an invalid request by modifying the size value sent down to the client fs. This part of kernel-to-userspace FUSE protocol is not well documented, as far as I know. The request is 'invalid' only if you do have client fs and the file opened with O_DIRECT there. This is not the case for an abstract storage attached by a fuse daemon. > > I came across this via aio-dio-subblock-eof-read.c but the problem can > easily be reproduced by doing a direct read of a small unaligned file, i.e.: > > truncate --size=300 /mnt/file > dd if=/mnt/file of=/dev/null iflag=direct > > ... which results in an EINVAL error because the client fs (gluster over > XFS in this case) in response receives and sends an invalid request to > XFS (check out xfs_file_aio_read() to see an example of where this is > enforced). Note that this doesn't occur in fuse direct_io mode, because > the optimization lies in fuse_direct_IO(). > > My immediate inclination is to say that if this is purely an > optimization, we might want to drop it for now. I suppose there could be > alternative approaches, such as preserving alignment in the optimization > (though I'm not yet sure we can make that generic enough..?). Note, that FUSE never fully preserved alignment: 256K direct read is always split into two 128K fuse requests ;) > What > concerns me at the moment is that this currently affects behavior > whether FUSE_ASYNC_DIO is enabled or not. Thoughts? I'd vote for disabling that "short read optimization" completely if FUSE_ASYNC_DIO is disabled. And make the way how 'count' is adjusted (in fuse_direct_IO) a bit more intelligent if FUSE_ASYNC_DIO is enabled: if offset + count > i_size, let's calculate 'count' as the minimal multiple of 128K >= i_size - offset, but not greater than original value of 'count'. This should be enough to prevent "dd if=/fuse/f of=/dev/null bs=1G iflag=direct" from cluttering userspace with cascade of async read request beyond EOF. Thanks, Maxim |