From: David S. <da...@da...> - 2013-02-18 02:19:17
|
The high-level API currently renames unlinked files to ".fuse_hiddenXXX" and manages them using the standard file access callbacks provided by the FUSE-based file system. This creates poor performance for network-based FUSE file systems that already open file descriptors against files managed in a local cache. It's unnecessary to ever write such files up to a server.

For such file systems (including our recent work on FuseDAV), it would be better to just invoke unlink(), even while the file is still open. The hard_remove option is close, but it breaks further access by those other open handles. The low-level API allows managing references directly, but it also requires rewriting to an API that maps more poorly to WebDAV and -- according to the NTFS3g benchmarks -- runs slower.

Would a patch be welcome to add a never_hide option? I may roll one for internal use, regardless.

-- David Strauss | da...@da... | +1 512 577 5827 [mobile] |
From: David S. <da...@da...> - 2013-02-18 02:24:09
|
On Sun, Feb 17, 2013 at 5:50 PM, David Strauss <da...@da...> wrote:
> Would a patch be welcome to add a never_hide option? I may roll one
> for internal use, regardless.

On some deeper inspection of the library code, it looks like this wouldn't be so trivial. Still, would there be interest in something like the ability to implement a high-level "hide" callback with the same parameters as "rename" that gets called instead of "rename" when implemented?

-- David Strauss | da...@da... | +1 512 577 5827 [mobile] |
From: Miklos S. <mi...@sz...> - 2013-02-18 05:24:00
|
On Mon, Feb 18, 2013 at 2:56 AM, David Strauss <da...@da...> wrote:
> On Sun, Feb 17, 2013 at 5:50 PM, David Strauss <da...@da...> wrote:
>> Would a patch be welcome to add a never_hide option? I may roll one
>> for internal use, regardless.
>
> On some deeper inspection of the library code, it looks like this
> wouldn't be so trivial. Still, would there be interest in something
> like the ability to implement a high-level "hide" callback with the
> same parameters as "rename" that gets called instead of "rename" when
> implemented?
>

-ohard_remove,nopath should work.

Ah, 'nopath' is undocumented it seems. I'll fix that.

Thanks,
Miklos |
From: David S. <da...@da...> - 2013-02-18 06:09:54
|
On Sun, Feb 17, 2013 at 9:23 PM, Miklos Szeredi <mi...@sz...> wrote:
> Ah, 'nopath' is undocumented it seems. I'll fix that.

Does that still provide the updated path on write and release when a rename has happened? I'd still like the following to work:

    f = open("a.txt", O_RDWR);
    write(f, "123", 3);
    rename("a.txt", "b.txt");
    write(f, "123", 3);  // Writes to b.txt from this point.
    close(f);

I just want a clean way to handle not writing to the server on the write/release here:

    f = open("a.txt", O_RDWR);
    unlink("a.txt");
    write(f, "123", 3);  // Writes only locally.
    close(f);

Opening a temp file and then unlinking it is a common pattern we'd like to optimize.

-- David Strauss | da...@da... | +1 512 577 5827 [mobile] |
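[Editor's note] The open-then-unlink sequence in the second snippet above can be tried with plain POSIX calls, independent of FUSE. This is only a sketch: the helper name `write_after_unlink` and the `/tmp` scratch location are illustrative assumptions, not part of FuseDAV or libfuse.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a scratch file, unlink it while it is still
 * open, then write and read back through the surviving descriptor.
 * Returns the number of bytes read back, or -1 on error.
 */
ssize_t write_after_unlink(const char *data, char *out, size_t out_len)
{
    assert(data != NULL && out != NULL);

    char path[] = "/tmp/unlink_demo_XXXXXX";  /* assumed scratch location */
    int fd = mkstemp(path);                   /* creates and opens the file */
    if (fd < 0)
        return -1;

    /* The name disappears here; the open descriptor keeps the data alive. */
    if (unlink(path) != 0) {
        close(fd);
        return -1;
    }

    size_t len = strlen(data);
    ssize_t n = -1;
    if (write(fd, data, len) == (ssize_t)len && lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, out, out_len);

    close(fd);  /* last reference dropped; the kernel reclaims the storage */
    return n;
}
```

On a local file system the data never needs a name again after the unlink, which is the behavior a caching network file system would like to mirror instead of uploading a ".fuse_hiddenXXX" file.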
From: Miklos S. <mi...@sz...> - 2013-02-18 09:51:36
|
On Mon, Feb 18, 2013 at 7:09 AM, David Strauss <da...@da...> wrote:
> On Sun, Feb 17, 2013 at 9:23 PM, Miklos Szeredi <mi...@sz...> wrote:
>> Ah, 'nopath' is undocumented it seems. I'll fix that.
>
> Does that still provide the updated path on write and release when a
> rename has happened? I'd still like the following to work:

Okay, what you really want is fuse_op->flag_nullpath_ok = 1 and 'hard_remove'. That will provide the path when available but give NULL if the file has been unlinked.

Note, your caching optimization will break if there are other hard links to the file. E.g.

    fd = open("x");
    link("x", "y");
    unlink("x");
    write(fd, ...);
    close(fd);

will not work as expected.

Thanks,
Miklos |
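[Editor's note] A minimal sketch of how a caching file system might act on the NULL path described above. It deliberately avoids fuse.h so that it compiles on its own; `struct cached_handle`, `handle_write`, and `upload_on_release` are hypothetical names. In a real libfuse 2.x file system this logic would presumably sit inside the `write` and `release` callbacks, with `flag_nullpath_ok` set in `struct fuse_operations` and the `hard_remove` option given at mount time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-open-file state, e.g. what fi->fh could point at. */
struct cached_handle {
    size_t bytes_cached;  /* bytes written into the local cache file */
    bool   dirty;         /* cache holds data not yet on the server */
};

/* Models the write path: data always lands in the local cache first. */
int handle_write(struct cached_handle *h, size_t size)
{
    assert(h != NULL);
    h->bytes_cached += size;
    h->dirty = true;
    return (int)size;
}

/*
 * Models the release path. With flag_nullpath_ok and hard_remove, the
 * path is NULL once the file has been unlinked, so there is nothing on
 * the server left to update; after a rename the new path is passed in.
 */
bool upload_on_release(const char *path, const struct cached_handle *h)
{
    assert(h != NULL);
    return path != NULL && h->dirty;
}
```

As noted in the message above, this shortcut assumes the unlinked name was the only link; with other hard links a NULL path does not mean the data can be discarded.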
From: David S. <da...@da...> - 2013-02-18 16:33:44
|
On Mon, Feb 18, 2013 at 1:51 AM, Miklos Szeredi <mi...@sz...> wrote:
> Note, your caching optimization
> will break if there are other hard links to the file.

Perfectly understandable (and, really, unavoidable with WebDAV). Thank you for the help.

-- David Strauss | da...@da... | +1 512 577 5827 [mobile] |
From: Goswin v. B. <gos...@we...> - 2013-02-21 10:10:29
|
On Mon, Feb 18, 2013 at 10:51:28AM +0100, Miklos Szeredi wrote:
> On Mon, Feb 18, 2013 at 7:09 AM, David Strauss <da...@da...> wrote:
> > On Sun, Feb 17, 2013 at 9:23 PM, Miklos Szeredi <mi...@sz...> wrote:
> >> Ah, 'nopath' is undocumented it seems. I'll fix that.
> >
> > Does that still provide the updated path on write and release when a
> > rename has happened? I'd still like the following to work:
>
> Okay, what you really want is fuse_op->flag_nullpath_ok = 1 and
> 'hard_remove'. That will provide the path when available but give
> NULL if the file has been unlinked. Note, your caching optimization
> will break if there are other hard links to the file. E.g.
>
> fd = open("x");
> link("x", "y");
> unlink("x");
> write(fd, ...);
> close(fd);
>
> will not work as expected.
>
> Thanks,
> Miklos

Or if 2 clients have the file open before it gets deleted. Not sure if WebDAV allows clients to detect that other clients access the same file.

Also, any write would have to be locally cached. Will you keep that data in RAM or open a local file for it? Will you read in the whole file before deleting it on the server, or only delete it on the server when the FD is closed?

Seems to me like you should be doing the rename magic yourself and on the server instead of in libfuse, and keep track of the link and open count in the filehandle you create in open().

MfG
Goswin |
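[Editor's note] The bookkeeping suggested above (tracking link and open counts) could look roughly like the following. All names here are hypothetical, and the sketch assumes the counts live in one shared per-file record that each open handle points to; it says nothing about how the counts would be kept in sync with a WebDAV server or with other clients.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared per-file record referenced by every open handle. */
struct node_state {
    unsigned open_count;  /* handles currently open on the file */
    unsigned link_count;  /* names (hard links) that still refer to it */
};

void node_open(struct node_state *n)   { n->open_count++; }
void node_link(struct node_state *n)   { n->link_count++; }

void node_unlink(struct node_state *n)
{
    assert(n->link_count > 0);
    n->link_count--;
}

/*
 * Called when a handle is closed. Returns true only when this was the
 * last open handle AND no name is left, i.e. the cached data can be
 * dropped without ever being written to the server.
 */
bool node_release(struct node_state *n)
{
    assert(n->open_count > 0);
    n->open_count--;
    return n->open_count == 0 && n->link_count == 0;
}
```

With this in place, the hard-link example quoted above (open "x", link to "y", unlink "x") leaves `link_count` at 1, so the release does not discard the data.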
From: Nikolaus R. <Nik...@ra...> - 2013-02-19 04:22:22
|
David Strauss <dav...@pu...> writes:
> other open handles. The low-level API allows managing references
> directly, but it also requires rewriting to an API that maps more
> poorly to WebDAV and -- according to the NTFS3g benchmarks -- runs
> slower.

That doesn't make sense. The high level API is implemented on top of the low level API, so it cannot possibly be faster. It could be the case that NTFS3g isn't as clever as the FUSE library that implements the high level API when using the low level API, but that's something that would be specific to NTFS3g and probably not too hard to fix either.

Best,
-Nikolaus

--
»Time flies like an arrow, fruit flies like a Banana.«
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C |
From: David S. <da...@da...> - 2013-02-20 19:46:38
|
On Mon, Feb 18, 2013 at 8:22 PM, Nikolaus Rath <Nik...@ra...> wrote:
> That doesn't make sense. The high level API is implemented on top of the
> low level API, so it cannot possibly be faster. It could be the case
> that NTFS3g isn't as clever as the FUSE library that implements the high
> level API when using the low level API, but that's something that would
> be specific to NTFS3g and probably not too hard to fix either.

Totally possible. I didn't investigate their claim [1] further, but here it is:

"The high level interface performs generally better because several system calls are grouped to form a single file system call [which] can generally be processed in the file system with a couple of inode openings (for the requested file and parent directory), whereas at low level more file system calls are needed causing reopenings of inodes."

I had initially read it to mean that the calls happened at a higher level from the kernel and resulted in fewer user-space requests. I now understand that that's not the case.

[1] http://b.andre.pagesperso-orange.fr/fuse-interfaces.html

-- David Strauss | da...@da... | +1 512 577 5827 [mobile] |
From: Nikolaus R. <Nik...@ra...> - 2013-02-22 01:21:54
|
David Strauss <dav...@pu...> writes:
> On Mon, Feb 18, 2013 at 8:22 PM, Nikolaus Rath <Nik...@pu...> wrote:
>> That doesn't make sense. The high level API is implemented on top of the
>> low level API, so it cannot possibly be faster. It could be the case
>> that NTFS3g isn't as clever as the FUSE library that implements the high
>> level API when using the low level API, but that's something that would
>> be specific to NTFS3g and probably not too hard to fix either.
>
> Totally possible. I didn't investigate their claim [1] further, but here it is:
>
> "The high level interface performs generally better because several
> system calls are grouped to form a single file system call [which] can
> generally be processed in the file system with a couple of inode
> openings (for the requested file and parent directory), whereas at low
> level more file system calls are needed causing reopenings of inodes."

I'm going to be bold here and claim that this claim is total nonsense. The number of system calls for the high and low level API is the same.

Best,
-Nikolaus

--
»Time flies like an arrow, fruit flies like a Banana.«
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C |
From: Jean-Pierre <jea...@wa...> - 2013-02-22 07:35:30
|
Nikolaus Rath wrote:
> David Strauss<da...@da...> writes:
>> On Mon, Feb 18, 2013 at 8:22 PM, Nikolaus Rath<Nik...@ra...> wrote:
>>> That doesn't make sense. The high level API is implemented on top of the
>>> low level API, so it cannot possibly be faster. It could be the case
>>> that NTFS3g isn't as clever as the FUSE library that implements the high
>>> level API when using the low level API, but that's something that would
>>> be specific to NTFS3g and probably not too hard to fix either.
>>
>> Totally possible. I didn't investigate their claim [1] further, but here it is:
>>
>> "The high level interface performs generally better because several
>> system calls are grouped to form a single file system call [which] can
>> generally be processed in the file system with a couple of inode
>> openings (for the requested file and parent directory), whereas at low
>> level more file system calls are needed causing reopenings of inodes."
>
> I'm going to be bold here and claim that this claim is total nonsense.
> The number of system calls for the high and low level API is the same.
>

Right, but the claim was about calls received by the user-space file system. At low level, lookups are sent to the file system, whereas at high level they are not.

Regards
Jean-Pierre |
From: Nikolaus R. <Nik...@ra...> - 2013-02-23 19:44:00
|
Jean-Pierre <jea...@pu...> writes:
> Nikolaus Rath wrote:
>> David Strauss<dav...@pu...> writes:
>>> On Mon, Feb 18, 2013 at 8:22 PM, Nikolaus Rath<Nik...@pu...> wrote:
>>>> That doesn't make sense. The high level API is implemented on top of the
>>>> low level API, so it cannot possibly be faster. It could be the case
>>>> that NTFS3g isn't as clever as the FUSE library that implements the high
>>>> level API when using the low level API, but that's something that would
>>>> be specific to NTFS3g and probably not too hard to fix either.
>>>
>>> Totally possible. I didn't investigate their claim [1] further, but here it is:
>>>
>>> "The high level interface performs generally better because several
>>> system calls are grouped to form a single file system call [which] can
>>> generally be processed in the file system with a couple of inode
>>> openings (for the requested file and parent directory), whereas at low
>>> level more file system calls are needed causing reopenings of inodes."
>>
>> I'm going to be bold here and claim that this claim is total nonsense.
>> The number of system calls for the high and low level API is the same.
>
> Right, but the claim was about calls received by the user-space
> file system. At low level, lookups are sent to the file system,
> whereas at high level they are not.

Are you making a distinction between the part of the user space file system that you implemented, and the part providing the high level API that's linked in from FUSE?

They live in the same process and address space, so nothing is being "sent" between those two components (or rather, things are passed around between functions just as they are within your code). The high level API implementation is part of the user-space file system.

Best,
-Nikolaus

--
»Time flies like an arrow, fruit flies like a Banana.«
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C |
From: David S. <da...@da...> - 2013-02-25 12:31:48
|
On Sat, Feb 23, 2013 at 11:43 AM, Nikolaus Rath <Nik...@ra...> wrote:
> They live in the same process and address space, so nothing is being
> "sent" between those two components (or rather, things are passed around
> between functions just as they are within your code). The high level API
> implementation is part of the user-space file system.

Thanks for being explicit about this. I gradually came to that conclusion, as it's the only thing fitting with the other posts to this thread (and, of course, the source).

Features like readdir_plus, OTOH, will actually reduce kernel/user-space traffic. I definitely look forward to that landing. :-)

-- David Strauss | da...@da... | +1 512 577 5827 [mobile] |