From: Anand A. <av...@re...> - 2013-02-05 18:54:51
On 02/05/2013 10:36 AM, Tejun Heo wrote:
> Hey, Miklos.
>
> On Tue, Feb 05, 2013 at 03:41:24PM +0100, Miklos Szeredi wrote:
>> So basically my patch did a custom client side mmap and Linus said
>> that that could be better implemented with shmem. Server side mmap
>> wasn't involved, that's the part I didn't like about your original
>> approach, so a server side data store/retrieve was added instead and
>> osspd modified to use that.
>>
>> I couldn't find anything wrong with what Linus said. Of course it may
>> turn out that there's a reason why it won't work, but it doesn't sound
>> too difficult to implement.
>
> Ah, okay, it's about the client side of mmap.
>
>>> So, I was talking with Avati about gluster the other day, which makes
>>> pretty heavy use of FUSE and reportedly hits bottleneck due to the
>>> double data copy on high bandwidth configurations. We were talking
>>> about ways to reduce data copy. The only obvious thing to do is
>>> mapping the page cache pages into the server address space for both
>>> page-cached backed and O_DIRECT files, so at least we do have a much
>>> stronger use case for mmapping pages into server address space. If
>>> direct mmapping is the way to go, I can dust off my old patches. What
>>> do you think?
>>
>> I think the idea to share pages through memory maps is fundamentally
>> flawed. Doing zero copy with splice is saner, I think, and there's a
>
> It is a bit cumbersome but I can't see why it would be fundamentally
> flawed. That said, yeap, I agree splice would be a lot nicer.

[at the risk of hijacking the original thread ...]

splice() would give us zero user-to-kernel copy, but still results in
memory copies, no? Even with that, splice() is incompatible with RDMA.
It would be awesome if a FUSE userspace filesystem could perform RDMA
read/write from/to the page cache directly by specifying a virtual
(mapped) address -- which is fundamentally impossible with splice().
I'm exploring options where the FUSE userspace filesystem can mmap the
file like any other process/app from the mount point, but use the
/dev/fuse channel to set/unset dirty flags -- effectively fulfilling IO
requests for other apps. Still at a very early stage -- it might end up
as just a wacko idea :-)

Avati

>> lot of infrastructure already there in fuse. And in theory nothing
>> prevents improving that and achieving true zero copy with direct-IO
>> and single-copy with buffered-IO. BTW you couldn't do server side
>> mapping of O_DIRECT anyway since no page cache is involved with that.
>
> IIRC, the server side created a file serving as backing store for the
> direct mmap area. quake could play sound through it, so it worked.
> One way or the other, we need to provide an address_space.
>
> Thanks.