From: Brian F. <bf...@re...> - 2013-05-29 14:39:06
|
On 05/29/2013 10:06 AM, Maxim Patlasov wrote: > Hi Brian, > > 05/29/2013 03:51 PM, Brian Foster wrote: >> Hi guys, >> >> I noticed an issue with the following commit while testing the async dio >> stuff and wanted to get some thoughts: >> >> https://git.kernel.org/cgit/linux/kernel/git/mszeredi/fuse.git/commit/fs/fuse/file.c?id=439ee5f0c5080d4fd15fda0c5bbee1fb3a57894e >> >> >> The problem is that direct I/O must be sector aligned (offset and size) >> and this patch can effectively turn a valid direct I/O read request into >> an invalid request by modifying the size value sent down to the client >> fs. > > This part of kernel-to-userspace FUSE protocol is not well documented, > as far as I know. The request is 'invalid' only if you do have client fs > and the file opened with O_DIRECT there. This is not the case for an > abstract storage attached by a fuse daemon. > Hi Maxim, Indeed. By invalid, I'm referring to the expectations of a traditional local filesystem. I'm not saying there aren't ways around it, but it seems to break the ability to pass through requests as is for those implementations that do such things. >> >> I came across this via aio-dio-subblock-eof-read.c but the problem can >> easily be reproduced by doing a direct read of a small unaligned file, >> i.e.: >> >> truncate --size=300 /mnt/file >> dd if=/mnt/file of=/dev/null iflag=direct >> >> ... which results in an EINVAL error because the client fs (gluster over >> XFS in this case) in response receives and sends an invalid request to >> XFS (check out xfs_file_aio_read() to see an example of where this is >> enforced). Note that this doesn't occur in fuse direct_io mode, because >> the optimization lies in fuse_direct_IO(). >> >> My immediate inclination is to say that if this is purely an >> optimization, we might want to drop it for now. I suppose there could be >> alternative approaches, such as preserving alignment in the optimization >> (though I'm not yet sure we can make that generic enough..?). > > Note, that FUSE never fully preserved alignment: 256K direct read is > always split into two 128K fuse requests ;) > Fair enough, but both are still sector aligned. :) >> What >> concerns me at the moment is that this currently affects behavior >> whether FUSE_ASYNC_DIO is enabled or not. Thoughts? > > I'd vote for disabling that "short read optimization" completely if > FUSE_ASYNC_DIO is disabled. And make the way how 'count' is adjusted (in > fuse_direct_IO) a bit more intelligent if FUSE_ASYNC_DIO is enabled: if > offset + count > i_size, let's calculate 'count' as the minimal multiple > of 128K >= i_size - offset, but not greater than original value of > 'count'. This should be enough to prevent "dd if=/fuse/f of=/dev/null > bs=1G iflag=direct" from cluttering userspace with cascade of async read > request beyond EOF. > That sounds reasonable to me. Rather than potentially realign the count, we limit the optimization in a way that preserves traditional behavior and still provides benefit. BTW, is that something you plan to change for this rc period? I could give it a whirl at some point, but I can't promise something for -rc at the moment. For that reason, I'd currently advocate a tactical revert so this doesn't get into the wild. Thoughts? Brian > Thanks, > Maxim |