From: Martin M. <ma...@si...> - 2012-08-21 21:09:15
|
Hi,

We're prototyping an NFS proxy by using a Fuse filesystem to bridge the Linux nfs kernel server with the standard Linux nfs client, and we've run into some performance issues. We've traced them to aspects of the fuse implementation, and wanted to understand why things were implemented the way they are, and get some guidance on our proposed changes.

First, with direct io, we've noticed both reads and writes are broken into serial, 4k operations. This is because the NFS server uses scatter/gather I/O, producing an iovec of 4k buffers, but fuse in direct mode doesn't implement aio_write/aio_read, only write/read. aio_write/aio_read take an iovec, but write does not, so vfs_writev/vfs_readv loops over each 4k page, turning it into an individual request.

So my question is: in direct mode, why doesn't fuse implement the aio interface? Is it because nobody has had a chance to work on it, or is it somehow inappropriate? Would you accept a patch which implements aio_read and aio_write in direct mode?

Second, fuse follows generic_file_aio_write in grabbing the VFS per-inode i_mutex. This has the effect of serializing writes to each file, and there is no option to turn this off. Is this just to be safe? If the only access to our fuse filesystem comes from the NFS server, can we simply not grab the mutex and allow parallel writes? What problems should we watch out for?

Thanks,
Martin
|
From: Miklos S. <mi...@sz...> - 2012-08-24 17:10:36
|
Martin Martin <ma...@si...> writes:

> First, with direct io, we've noticed both reads and writes are broken into
> serial, 4k operations. This is because the NFS server uses scatter/gather
> I/O, producing an iovec of 4k buffers, but fuse in direct mode doesn't
> implement aio_write/aio_read, only write/read. aio_write/aio_read take an
> iovec, but write does not. So vfs_writev/vfs_readv loops over each 4k
> page, turning it into an individual request.
>
> So my question is: in direct mode, why doesn't fuse implement the aio
> interface? Is it because nobody has had a chance to work on it, or is it
> somehow inappropriate? Would you accept a patch which implements aio_read
> and aio_write in direct mode?

About a month ago Maxim Patlasov posted a patchset doing just that. I had some nits, but the basic idea is fine. So yes, I'd love to have this.

> Second, fuse follows generic_file_aio_write in grabbing the VFS per-inode
> i_mutex. This has the effect of serializing writes to each file, and there
> is no option to turn this off. Is this just to be safe? If the only
> access to our fuse filesystem comes from the NFS server, can we simply not
> grab the mutex and allow parallel writes? What problems should we watch
> out for?

The i_mutex protects against various races, including write vs. truncate.

Thanks,
Miklos
|
From: Devesh A. <dag...@si...> - 2012-08-24 22:21:24
|
Hi,

We are interested in trying out the patch that implements true scatter/gather for direct IO.

My understanding is that it hasn't yet been incorporated in any of the branches in git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git and it only exists in the form of email patches on the lkml and fuse-devel mailing lists. (http://article.gmane.org/gmane.comp.file-systems.fuse.devel/11855) Am I right?

Also I am wondering if work is happening on those patches.

Thanks.

---------- Forwarded message ----------
From: Miklos Szeredi <mi...@sz...>
Date: Fri, Aug 24, 2012 at 1:11 PM
Subject: Re: [fuse-devel] Fuse performance with NFS
To: Martin Martin <ma...@si...>
Cc: fus...@li...

[...]

> So my question is: in direct mode, why doesn't fuse implement the aio
> interface? Is it because nobody has had a chance to work on it, or is it
> somehow inappropriate? Would you accept a patch which implements aio_read
> and aio_write in direct mode?

About a month ago Maxim Patlasov posted a patchset doing just that. I had some nits, but the basic idea is fine. So yes, I'd love to have this.

[...]

--
Thanks
Devesh
|
From: Miklos S. <mi...@sz...> - 2012-08-29 22:05:52
|
Devesh Agrawal <dag...@si...> writes:

> We are interested in trying out the patch that implements true
> scatter/gather for direct IO.
> [...]
> Also I am wondering if work is happening on those patches.

Please fix the problems I wrote about and resubmit the patch. As I said, I'd be very happy to accept it.

Thanks,
Miklos
|
From: Maxim P. <mpa...@pa...> - 2012-08-31 04:04:20
|
Devesh,

On 8/25/12 1:54 AM, Devesh Agrawal wrote:
> We are interested in trying out the patch that implements true
> scatter/gather for direct IO.
> [...]
> Also I am wondering if work is happening on those patches.

I started to work on it but was interrupted. I can hopefully make some progress next week.

BTW, I'm not sure that my patches will help you in your use-case. Alternatively, your analysis:

> First, with direct io, we've noticed both reads and writes are broken into
> serial, 4k operations. This is because the NFS server uses scatter/gather
> I/O, producing an iovec of 4k buffers, but fuse in direct mode doesn't
> implement aio_write/aio_read, only write/read. aio_write/aio_read take an
> iovec, but write does not. So vfs_writev/vfs_readv loops over each 4k
> page, turning it into an individual request.

is not quite correct. Of course, maybe I'm missing something obvious. Did you give my patches a try in your prototype?

Thanks,
Maxim
|
From: Maxim P. <mpa...@pa...> - 2012-08-31 04:03:45
|
Miklos,

On 8/30/12 2:07 AM, Miklos Szeredi wrote:
> Please fix the problems I wrote about and resubmit the patch. As I said
> I'd be very happy to accept it.

First of all, I apologize for the long delay. Unfortunately I was too busy and couldn't work on it much.

I was scrutinizing all the places where argpages is used. Sometimes we can predict how many pages have to be stashed in the fuse request, but that doesn't seem to be the case for fuse_retrieve(). Do you think we need something smarter than simply allocating an array of FUSE_MAX_PAGES_PER_REQ elements there? Alternatively, we could iterate through the mapping twice, the first time only to count the number of pages. Which would be the lesser evil?

Thanks,
Maxim
|
From: Miklos S. <mi...@sz...> - 2012-09-03 16:06:48
|
Maxim Patlasov <mpa...@pa...> writes:

> I was scrutinizing all the places where argpages is used. Sometimes we can
> predict how many pages have to be stashed in the fuse request, but that
> doesn't seem to be the case for fuse_retrieve(). Do you think we need
> something smarter than simply allocating an array of
> FUSE_MAX_PAGES_PER_REQ elements there?

It should be possible to calculate the number of pages:

    num_pages = (num + offset + PAGE_SIZE - 1) >> PAGE_SHIFT;

BTW, fuse_retrieve() seems to be broken if offset is not on a page boundary. The following (untested) patch should fix it.

Thanks,
Miklos

----
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 7df2b5e..f4246cf 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1576,6 +1576,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
 		req->pages[req->num_pages] = page;
 		req->num_pages++;
+		offset = 0;
 		num -= this_num;
 		total_len += this_num;
 		index++;
|