From: Valient G. <vg...@po...> - 2004-06-22 11:51:14
I am having trouble supporting the Evolution mail reader from within my FUSE-based filesystem. The problem is that Evolution makes use of delete-while-open and rename-over-open file states, which are perfectly valid and documented for Unix but are not supported in FUSE.

delete-while-open: a file is opened, then deleted (while still open). The file is now invisible, and its name could be opened a second time, resulting in a new version of the file with the same name. The problem supporting this under FUSE is that libfuse passes the filename as the unique key instead of an inode, so the userspace program can't disambiguate which file is being accessed.

rename-over-open: very similar. Open "foo", then rename "newfoo" -> "foo" and open "foo" again. Now there are two open file descriptors both referencing a file named "foo", but one is an invisible file.

Evolution uses this type of operation with its Outbox when sending mail, and possibly with its Drafts folder.

There is at least one possibility which would not require much of an interface change to libfuse. Since the invisible files are only accessible via file descriptors and cannot be found in the filesystem, when an open file is deleted or replaced by another file, libfuse could emit an internal rename operation. Since the filename is no longer important, it could be renamed to any internal value -- just so long as it is unique within libfuse. That way hidden files could still be referenced in an unambiguous way.

So, for example:

system opens "foo", gets file descriptor 1
system opens "bar", gets FD 2
system renames "bar" -> "foo"
fuse sends internal-rename "foo" -> "foo#1" (some random name)
fuse sends rename "bar" -> "foo", as it does today
system reads from FD 1
fuse sends read on "foo#1"
system reads from FD 2
fuse sends read on "foo"
system deletes "foo"
fuse sends internal-rename "foo" -> "foo#2"
...

Does that sound reasonable?

regards,
Valient
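For readers who want to see the delete-while-open state described above, here is a minimal sketch in Python. It is an illustration on an ordinary local POSIX filesystem, not on FUSE and not taken from any patch in this thread; the function name and file contents are made up for the example.

```python
import os
import tempfile


def delete_while_open_demo(directory):
    """Return (old_bytes, new_bytes) read via two descriptors that used the same name."""
    path = os.path.join(directory, "foo")

    # Create "foo" and keep a descriptor open on it.
    fd_old = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    os.write(fd_old, b"old contents")

    # Delete the name while the descriptor is still open.
    # The file is now invisible but still reachable through fd_old.
    os.unlink(path)

    # The name is free again, so a second, unrelated "foo" can be created.
    fd_new = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    os.write(fd_new, b"new contents")

    try:
        # Each descriptor still refers to its own file.
        old = os.pread(fd_old, 100, 0)
        new = os.pread(fd_new, 100, 0)
    finally:
        os.close(fd_old)
        os.close(fd_new)
        os.unlink(path)
    return old, new
```

Calling `delete_while_open_demo()` on a temporary directory should return the two different contents on a POSIX filesystem. A path-keyed layer that sees only the name "foo" for both descriptors has no way to tell these two files apart, which is the problem being reported.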
From: Miklos S. <msz...@in...> - 2004-06-22 12:49:57
> system opens "foo", gets file descriptor 1
> system opens "bar", gets FD 2
> system renames "bar" -> "foo"
> fuse sends internal-rename "foo" -> "foo#1" (some random name)
> fuse sends rename "bar" -> "foo", as it does today
> system reads from FD 1
> fuse sends read on "foo#1"
> system reads from FD 2
> fuse sends read on "foo"
> system deletes "foo"
> fuse sends internal-rename "foo" -> "foo#2"
> ...
>
> Does that sound reasonable?

NFS does something very similar. I think it is possible to do this only in the userspace library, which is what I'd prefer instead of a kernel-based solution.

I added it to my todo list. If somebody wants to have a go at it, then please go ahead!

Miklos
From: Valient G. <vg...@po...> - 2004-06-23 11:25:50
Attachments:
fuse-1.2-hide.diff
How about something like this (patch against the fuse 1.2 release attached)?

It works for me so far - Evolution runs much better now under my filesystem.

If you are OK with the general approach, one place I would clean up more is the generation of new filenames. Right now it uses the l64a() function, which is not reentrant.

Valient

On Tue, 2004-06-22 at 14:49, Miklos Szeredi wrote:
> > system opens "foo", gets file descriptor 1
> > system opens "bar", gets FD 2
> > system renames "bar" -> "foo"
> > fuse sends internal-rename "foo" -> "foo#1" (some random name)
> > fuse sends rename "bar" -> "foo", as it does today
> > system reads from FD 1
> > fuse sends read on "foo#1"
> > system reads from FD 2
> > fuse sends read on "foo"
> > system deletes "foo"
> > fuse sends internal-rename "foo" -> "foo#2"
> > ...
> >
> > Does that sound reasonable?
>
> NFS does something very similar. I think it is possible to do this
> only in the userspace library, which is what I'd prefer instead of a
> kernel based solution.
>
> I added it to my todo list. If somebody wants to have a go at it,
> then please go ahead!
>
> Miklos
From: Miklos S. <msz...@in...> - 2004-06-23 12:28:16
> How about something like this (patch against fuse 1.2 release attached).
>
> It works for me so far - Evolution runs much better now under my
> filesystem.
>
> If you are ok with the general approach, one place I would clean up more
> is the generation of new filenames. Right now it uses l64a() function
> which is not reentrant..

Looks OK. Is there a reason not to use rename() instead of hidefile()? That way it would work on all filesystems without modification, and "hiding" can be done by prepending a '.' to the hidden filename. That's how NFS does it.

Miklos
From: Valient G. <vg...@po...> - 2004-06-23 16:11:12
On Wed, 2004-06-23 at 14:27, Miklos Szeredi wrote:
> Looks OK. Is there a reason not to use rename() instead of
> hidefile()? That way it would work on all filesystems without
> modification, and "hiding" can be done by prepending a '.' to the
> hidden filename. That's how NFS does it.

Hmm. Well, it would be a little less efficient for the filesystem, but probably not enough to worry about. It is less efficient because, in the case of a pass-through filesystem, a rename has to be propagated to the real filesystem, whereas a 'hide' is just libfuse telling the filesystem how it will refer to the file in the future.

But yeah, it is probably useful to have it work on all filesystems with less effort for the filesystem writer.

So the comparison of operations looks like:

delete-while-open (name):
- using hide: hide(name, hidden name), unlink(name)
- using rename: rename(name, hidden name), unlink(hidden name)

rename-over-open (old name -> new name):
- using hide: hide(new name, hidden name), rename(old name, new name)
- using rename: rename(new name, hidden name), unlink(hidden name), rename(old name, new name)

For renaming, you aren't saying that NFS only prepends '.' and nothing else? I'm worried about generating collisions with real files. This is especially a concern if using a real rename() operation instead of 'hide'. But OK, it makes sense to prepend a '.' to make the file less visible to programs.

Also, as I wrote rename-over-open using rename above, the unlink cannot be done immediately, otherwise filesystems which do not cache open files will lose access to the file. So we'd have to leave the dangling hidden files around and delete them only when there are no more references.

I'm mostly concerned with the garbage collection -- getting rid of the 'hidden' files properly. If there was a way for the filesystem to tell what was going on, I could just delete them immediately in my own filesystem, since I cache the file descriptors. But I wouldn't be able to tell the difference by only looking at rename operations.

I'll look into it some more.

thanks,
Valient
From: Miklos S. <msz...@in...> - 2004-06-23 18:04:54
> For renaming, you aren't saying that NFS only prepends '.' and nothing
> else?

No, actually it generates a random-looking filename like this:

.nfs002f297900000020

> Also, as I wrote rename-over-open using rename above, the unlink cannot
> be done immediately, otherwise filesystems which do not cache open files
> will lose access to the file.

True. I'd do it this way: have a flag for each node that indicates whether it is hidden (i.e. deleted) or not. In the release operation, if the node is hidden and the open_count is zero, then call the unlink() method.

Miklos
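The bookkeeping proposed above (a per-node hidden flag plus an open count, with the unlink deferred to the last release) can be sketched as a small Python model. This is only an illustration of the idea, not libfuse code: `MemoryFS`, `Node`, `NodeTable` and the `.hidden%08x` naming scheme are all hypothetical, and a real implementation would work on the library's own node structures.

```python
class MemoryFS:
    """Stand-in for the filesystem callbacks: a dict of path -> contents."""

    def __init__(self, files):
        self.files = dict(files)

    def rename(self, old, new):
        self.files[new] = self.files.pop(old)

    def unlink(self, path):
        del self.files[path]

    def exists(self, path):
        return path in self.files


class Node:
    def __init__(self, path):
        self.path = path       # name currently used to address the file
        self.open_count = 0    # opens that have not been released yet
        self.hidden = False    # True once the visible name was deleted


class NodeTable:
    """Path-keyed table that hides open files instead of deleting them."""

    def __init__(self, fs):
        self.fs = fs
        self.nodes = {}
        self.counter = 0

    def _hidden_name(self, path):
        # Hypothetical scheme: dot prefix plus a counter, same directory.
        directory = path.rsplit("/", 1)[0]
        while True:
            self.counter += 1
            candidate = "%s/.hidden%08x" % (directory, self.counter)
            # Avoid collisions with tracked nodes and with real files.
            if candidate not in self.nodes and not self.fs.exists(candidate):
                return candidate

    def _hide(self, path):
        node = self.nodes.pop(path)
        hidden = self._hidden_name(path)
        self.fs.rename(path, hidden)
        node.path, node.hidden = hidden, True
        self.nodes[hidden] = node

    def open(self, path):
        node = self.nodes.setdefault(path, Node(path))
        node.open_count += 1
        return node

    def release(self, node):
        node.open_count -= 1
        if node.open_count == 0:
            del self.nodes[node.path]
            if node.hidden:
                # Last reference is gone: now the hidden file can be removed.
                self.fs.unlink(node.path)

    def unlink(self, path):
        node = self.nodes.get(path)
        if node is not None and node.open_count > 0:
            self._hide(path)   # defer the real unlink to release()
        else:
            self.fs.unlink(path)

    def rename(self, old, new):
        target = self.nodes.get(new)
        if target is not None and target.open_count > 0:
            self._hide(new)    # keep the overwritten file reachable
        self.fs.rename(old, new)
        node = self.nodes.pop(old, None)
        if node is not None:
            node.path = new
            self.nodes[new] = node
```

Replaying the "foo"/"bar" sequence from the start of the thread against this model, the descriptor opened on the old "foo" ends up addressing a hidden name, and the hidden file disappears only when that descriptor is released. Because the model uses only rename() and unlink(), the underlying filesystem needs no special support, which is the point of the rename-based approach.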
From: Valient G. <vg...@po...> - 2004-06-23 21:00:45
On Wed, 2004-06-23 at 20:04, Miklos Szeredi wrote:
> True. I'd do it this way: have a flag for each node that indicates
> whether it is hidden (i.e. deleted) or not. In the release operation
> if the node is hidden and the open_count is zero, then call the
> unlink() method.

I've implemented this, which works fine (as long as the filesystem also implements unlink, but that would be normal).

How do you feel about having both? I've made it so that if hidefile (or perhaps you prefer another name -- rename_internal, rename_reference?) is not set, then it uses rename + unlink on close. Unless you won't consider this for inclusion, I'll submit it.

The reason is that in my own filesystem I'd prefer to handle the internal-rename without treating it as a real rename, because with a rename I potentially have to do a bunch of extra work (one of the encryption options allows data to be dependent on the filename, so renaming a file on disk means having to rewrite part of the file).

Valient
From: Cody P. <co...@hp...> - 2004-06-23 21:17:22
Implementation details aside, does this really belong in the base libfuse? It seems to me it belongs in individual filesystem implementations, as it can be easily built from the primitives libfuse already provides.

Maybe it could be included with the fuse source as an example.

Just a thought,

-Cody

On Wed, Jun 23, 2004 at 11:00:40PM +0200, Valient Gough wrote:
> On Wed, 2004-06-23 at 20:04, Miklos Szeredi wrote:
> >
> > True. I'd do it this way: have a flag for each node that indicates
> > whether it is hidden (i.e. deleted) or not. In the release operation
> > if the node is hidden and the open_count is zero, then call the
> > unlink() method.
>
> I've implemented this, which works fine (as long as the filesystem also
> implements unlink, but that would be normal).
>
> How do you feel about having both? I've made it so that if hidefile (or
> perhaps you prefer another name -- rename_internal, rename_reference?)
> is not set, then use rename + unlink on close. Unless you won't
> consider this for inclusion, I'll submit it.
>
> The reason is that in my own filesystem I'd prefer to handle the
> internal-rename without treating it as a real rename because with a
> rename I have to potentially do a bunch of extra work (one of the
> encryption options allows data to be dependent on the filename, so
> renaming a file on disk means having to rewrite part of the file)..
>
> Valient

--
Cody Pisto <co...@nm...>
Chief Architect | New Mexico Software, Inc.
http://www.nmxs.com/
From: Valient G. <vg...@po...> - 2004-06-23 21:32:25
The current libfuse behavior can lead to data loss or corruption for programs expecting Unix behavior of the filesystem. So I think leaving it as-is is not a good option.

The reason for including an implementation using rename + unlink on close was an argument from Miklos that I agree with: all filesystems would pick up support without having to do any state tracking of files.

There has to be some amount of support in libfuse, because the implementing filesystem can't do it by itself.

Another, less visible support method would be to have a function in libfuse which renames an internal libfuse node. Then the implementing filesystem would have to detect what was going on and ask libfuse to rename its internal node. That is possible, but I don't think it is any cleaner than having the hidefile() interface, and it also causes more duplicated work for filesystem implementors than what we're talking about.

With my current patch, if you implement the hidefile() function and return <0, then the operation will not be allowed. This is still much safer than the current libfuse functionality, because at least an error is then returned to the program which attempted the operation.

Valient

On Wed, 2004-06-23 at 23:17, Cody Pisto wrote:
> Implementation details aside, does this really belong on the base
> libfuse? It seems to me it belongs in individual filesystem
> implementations as it can be easily built from the primitives libfuse
> already provides.
>
> Maybe included with the fuse source as an example..
>
> Just a thought,
>
> -Cody
From: Cody P. <co...@hp...> - 2004-06-23 21:39:14
I know this has been discussed briefly before, but maybe now is a better time to bring it up:

Do problems like this simply vanish if libfuse provides an alternate interface that uses inodes instead of paths?

The kernel interface supports this easily; it would just mean reimplementing the userland portion.

-Cody

On Wed, Jun 23, 2004 at 11:32:21PM +0200, Valient Gough wrote:
>
> The current libfuse behavior can lead to data loss or corruption for
> programs expecting Unix behavior of the filesystem. So I think leaving
> it as-is is not a good option.
>
> The reason for including an implementation using rename + unlink on
> close was an argument from Miklos that I agree with. The reasoning
> being that all filesystems would pick up support without having to do
> any state tracking of files.
>
> There has to be some amount of support in libfuse, because the
> implementing filesystem can't do it by itself.
>
> Another less visible support method would be to have a function in
> libfuse which renames an internal libfuse node. Then the implementing
> filesystem would have to detect what was going on and ask libfuse to
> rename its internal node. That is possible, but I don't think that is
> any cleaner than having the hidefile() interface, and also causes more
> duplicated work for filesystem implementors than what we're talking
> about.
>
> With my current patch if you implement the hidefile() function and
> return <0, then the operation will not be allowed. This is still much
> safer than the current libfuse functionality, because at least then an
> error is returned to the program which attempted the operation.
>
> Valient
>
> On Wed, 2004-06-23 at 23:17, Cody Pisto wrote:
>
> > Implementation details aside, does this really belong on the base
> > libfuse? It seems to me it belongs in individual filesystem
> > implementations as it can be easily built from the primitives libfuse
> > already provides.
> >
> > Maybe included with the fuse source as an example..
> >
> > Just a thought,
> >
> > -Cody
From: Miklos S. <msz...@in...> - 2004-06-24 06:42:55
> Do problems like this simply vanish if libfuse provides an alternate
> interface that uses inodes instead of paths?

Yes. And some filesystems which just use the path as a key to a hash table will be simpler and faster from not having to convert inode numbers to paths and back.

> The kernel interface supports this easily, it would just mean
> reimplementing the userland portion.

Exactly. The hard part (or maybe not that hard, only I'm lazy) is developing that API.

Miklos
From: Valient G. <vg...@po...> - 2004-06-24 10:34:23
Attachments:
fuse-1.2-hide2.diff
encfs-opentest.pl
Attached is a new version of the patch. This implements what I described - it uses op.hidefile if it exists, otherwise it uses rename() and unlinks the hidden file when it is no longer needed.

I've tested both cases using my filesystem. Also, I'll attach a simple perl script that I used as a sanity check against the delete-while-open and rename-while-open cases.

Valient
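The attached perl script is not reproduced in this archive. As a rough idea of what a sanity check for the rename case might look like, here is a minimal Python sketch. It is not a translation of the attachment; the function name and file contents are made up, and it only checks plain POSIX behaviour in whatever directory it is pointed at.

```python
import os
import tempfile


def rename_over_open_check(directory):
    """Return (bytes via the old descriptor, bytes via a fresh open of "foo")."""
    foo = os.path.join(directory, "foo")
    newfoo = os.path.join(directory, "newfoo")

    with open(foo, "wb") as f:
        f.write(b"original")
    with open(newfoo, "wb") as f:
        f.write(b"replacement")

    # Hold "foo" open, then replace it by renaming "newfoo" over it.
    fd_old = os.open(foo, os.O_RDONLY)
    os.rename(newfoo, foo)

    # A fresh open of "foo" must see the replacement file.
    fd_new = os.open(foo, os.O_RDONLY)
    try:
        return os.read(fd_old, 100), os.read(fd_new, 100)
    finally:
        os.close(fd_old)
        os.close(fd_new)
        os.unlink(foo)
```

On a filesystem with Unix semantics the old descriptor should still return the original contents while the fresh open returns the replacement. Pointing `directory` at a mounted FUSE filesystem would be one way to check whether the hiding logic keeps the two apart.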
From: Valient G. <vg...@po...> - 2004-06-23 11:33:12
I forgot to add an overview of what the patch does.

I added open count tracking to the fuse node structure, which is incremented on each open and decremented on a release.

When unlink is called on a file, it checks the open count, and if the file is open, then it hides the file first. Oops.. Actually there is a bug in the patch I sent - it doesn't actually delete the file after 'hiding' it. First thing to fix..

The other case is during a rename: if the new filename already exists and is open, then it hides that file first.

Hiding consists simply of creating a new name for the file and calling op.hidefile(oldname, newname) to tell the filesystem implementation that the file has a new name. libfuse also changes its idea of the name, so that future read/write/close calls are sent using the new virtual name.

When I run a program like Evolution from within the filesystem and print out calls to hidename() from my filesystem, I see things like this:

13:25:39 (encfs.cpp:304) hiding /evolution/local/Outbox/mbox -> /evolution/local/Outbox/mbox###7Hwd//
13:25:41 (encfs.cpp:304) hiding /evolution/local/Contacts/addressbook.db.summary -> /evolution/local/Contacts/addressbook.db.summary###LAO5m/

Valient
From: Miklos S. <msz...@in...> - 2004-06-25 07:00:02
> Attached is a new version of the patch. This implements what I
> described - uses op.hidefile if it exists, otherwise uses rename() and
> unlinks the hidden file when it is no longer needed.
>
> I've tested both cases using my filesystem. Also I'll attach a simple
> perl script that I used as a sanity check against the delete-while-open
> and rename-while-open cases..

Thanks!

I integrated it into CVS, with the following modifications:

- I left out the hidefile method. Not because I'm opposed to it, but because I can't test it, and anyway it's better to start off with the simpler approach.

- I made the hidden file name generation similar to the one in NFS. Including the original filename in the hidden filename has the drawback that the new name might become too long.

Miklos

BTW, your perl scripts remind me that it would be very good to have some automated testing for FUSE. Does anyone know of some good frameworks for this?
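The name-length drawback mentioned above is easy to see with a small sketch. The snippet below is only an illustration: the `###` suffix scheme mirrors the log output shown earlier in the thread, while `NAME_MAX = 255`, the `.hidden` prefix and the two numeric fields are assumptions made for the example, not the names that were committed to CVS.

```python
# Typical per-component filename limit on Linux filesystems (an assumption here).
NAME_MAX = 255


def suffixed_name(original, tag):
    """Hidden name built from the original name: grows with the original."""
    return "%s###%s" % (original, tag)


def fixed_length_name(node_id, counter):
    """Hidden name built from two 32-bit numbers: always the same length."""
    return ".hidden%08x%08x" % (node_id & 0xFFFFFFFF, counter & 0xFFFFFFFF)
```

With a 250-character original name, `suffixed_name()` already exceeds the assumed limit, while `fixed_length_name()` always yields 23 characters regardless of what the file was called. The leading dot also keeps the file out of ordinary directory listings.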
From: Cody P. <co...@hp...> - 2004-06-25 11:52:39
PyUnit is a fantastic unit test framework if you like python.
http://pyunit.sourceforge.net/

I'd be happy to write some unit tests for fuse

-Cody

On Jun 25, 2004, at 12:59 AM, Miklos Szeredi wrote:

>> Attached is a new version of the patch. This implements what I
>> described - uses op.hidefile if it exists, otherwise uses rename() and
>> unlinks the hidden file when it is no longer needed.
>>
>> I've tested both cases using my filesystem. Also I'll attach a simple
>> perl script that I used as a sanity check against the
>> delete-while-open
>> and rename-while-open cases..
>
> Thanks!
>
> I integrated it into CVS, with the following modifications:
>
> - I left out the hidefile method. Not because I'm opposed to it, but
> because I can't test it, and anyway it's better to start off with
> the simpler approach.
>
> - I made the hidden file name generation similar to the one in NFS.
> Including the original filename in the hidden filename has the
> drawback, that the new name might become too long.
>
> Miklos
>
> BTW. Your perl scripts remind me, that it would be very good to have
> some automated testing for FUSE. Anyone know of some good frameworks
> for this?
From: Miklos S. <msz...@in...> - 2004-06-25 12:45:36
> PyUnit is a fantastic unit test framework if you like python.
> http://pyunit.sourceforge.net/
>
> I'd be happy to write some unit tests for fuse

That would be cool!

I think there are two ways of testing fuse:

1) through the filesystem interface only: e.g. write something to a file, then read it back, and check if they are equal, etc...

2) by providing a "test" filesystem through which it can be tested that doing some filesystem operation will call the right filesystem operation with the right parameters.

The second is probably better for testing the FUSE kernel module and the library. The first is easier to do and could be used for testing arbitrary FUSE-based filesystems.

Any thoughts?

Miklos
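The first approach (testing through the filesystem interface only) might look roughly like the sketch below, written with Python's `unittest` module, which is what PyUnit became in the standard library. Everything specific here is an assumption for illustration: the `FUSE_TEST_DIR` environment variable is a made-up way of pointing the tests at a mount point, and the two test cases are just examples.

```python
import os
import tempfile
import unittest


class RoundTripTest(unittest.TestCase):
    """Exercise a directory purely through ordinary file operations."""

    def setUp(self):
        # FUSE_TEST_DIR is a hypothetical variable naming a mounted
        # filesystem; without it the system temp directory is used.
        base = os.environ.get("FUSE_TEST_DIR")
        self.dir = tempfile.mkdtemp(dir=base)

    def tearDown(self):
        for name in os.listdir(self.dir):
            os.unlink(os.path.join(self.dir, name))
        os.rmdir(self.dir)

    def test_write_then_read(self):
        path = os.path.join(self.dir, "data")
        payload = b"hello fuse\n" * 100
        with open(path, "wb") as f:
            f.write(payload)
        with open(path, "rb") as f:
            self.assertEqual(f.read(), payload)

    def test_unlink_removes_name(self):
        path = os.path.join(self.dir, "gone")
        with open(path, "wb") as f:
            f.write(b"x")
        os.unlink(path)
        self.assertFalse(os.path.exists(path))


def run_suite():
    """Run the tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RoundTripTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` runs both cases against whatever directory is selected. Because the tests only use ordinary file operations, the same suite could be pointed at any FUSE-based filesystem. The second approach would instead need a purpose-built test filesystem that records which operations it receives.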
From: Valient G. <vg...@po...> - 2004-06-26 10:44:20
On Fri, 2004-06-25 at 08:59, Miklos Szeredi wrote:
>
> I integrated it into CVS, with the following modifications:

Thanks!

I was backporting your change to a diff against 1.2 and noticed that you may be missing an "if(path) free(path)" at the end of do_open...

Valient
From: Miklos S. <msz...@in...> - 2004-06-26 20:59:52
> I was backporting your change to a diff against 1.2 and noticed that you
> may be missing a "if(path) free(path)" at the end of do_open...

Good spotting.

Thanks,
Miklos