From: Maxim P. <mpa...@pa...> - 2013-05-29 14:49:16
|
Hi, 05/29/2013 06:36 PM, Brian Foster пишет: > On 05/29/2013 10:06 AM, Maxim Patlasov wrote: >> Hi Brian, >> >> 05/29/2013 03:51 PM, Brian Foster wrote: >>> Hi guys, >>> >>> I noticed an issue with the following commit while testing the async dio >>> stuff and wanted to get some thoughts: >>> >>> https://git.kernel.org/cgit/linux/kernel/git/mszeredi/fuse.git/commit/fs/fuse/file.c?id=439ee5f0c5080d4fd15fda0c5bbee1fb3a57894e >>> >>> >>> The problem is that direct I/O must be sector aligned (offset and size) >>> and this patch can effectively turn a valid direct I/O read request into >>> an invalid request by modifying the size value sent down to the client >>> fs. >> This part of kernel-to-userspace FUSE protocol is not well documented, >> as far as I know. The request is 'invalid' only if you do have client fs >> and the file opened with O_DIRECT there. This is not the case for an >> abstract storage attached by a fuse daemon. >> > Hi Maxim, > > Indeed. By invalid, I'm referring to the expectations of a traditional > local filesystem. I'm not saying there aren't ways around it, but it > seems to break the ability to pass through requests as is for those > implementations that do such things. > >>> I came across this via aio-dio-subblock-eof-read.c but the problem can >>> easily be reproduced by doing a direct read of a small unaligned file, >>> i.e.: >>> >>> truncate --size=300 /mnt/file >>> dd if=/mnt/file of=/dev/null iflag=direct >>> >>> ... which results in an EINVAL error because the client fs (gluster over >>> XFS in this case) in response receives and sends an invalid request to >>> XFS (check out xfs_file_aio_read() to see an example of where this is >>> enforced). Note that this doesn't occur in fuse direct_io mode, because >>> the optimization lies in fuse_direct_IO(). >>> >>> My immediate inclination is to say that if this is purely an >>> optimization, we might want to drop it for now. I suppose there could be >>> alternative approaches, such as preserving alignment in the optimization >>> (though I'm not yet sure we can make that generic enough..?). >> Note, that FUSE never fully preserved alignment: 256K direct read is >> always split into two 128K fuse requests ;) >> > Fair enough, but both are still sector aligned. :) > >>> What >>> concerns me at the moment is that this currently affects behavior >>> whether FUSE_ASYNC_DIO is enabled or not. Thoughts? >> I'd vote for disabling that "short read optimization" completely if >> FUSE_ASYNC_DIO is disabled. And make the way how 'count' is adjusted (in >> fuse_direct_IO) a bit more intelligent if FUSE_ASYNC_DIO is enabled: if >> offset + count > i_size, let's calculate 'count' as the minimal multiple >> of 128K >= i_size - offset, but not greater than original value of >> 'count'. This should be enough to prevent "dd if=/fuse/f of=/dev/null >> bs=1G iflag=direct" from cluttering userspace with cascade of async read >> request beyond EOF. >> > That sounds reasonable to me. Rather than potentially realign the count, > we limit the optimization in a way that preserves traditional behavior > and still provides benefit. > > BTW, is that something you plan to change for this rc period? I could > give it a whirl at some point, but I can't promise something for -rc at > the moment. For that reason, I'd currently advocate a tactical revert so > this doesn't get into the wild. Thoughts? The fix seems to be pretty simple. I'll post a patch tomorrow. Thanks, Maxim |