From: Ralf H. <ra...@bo...> - 2005-01-17 17:26:38
|
Hi, I have several questions regarding avfs and I hope to find some answers here. I want to use AVFS for my filemanager (www.boomerangsworld.de/worker). I already tried to use the VFS from MC but as it is not thread safe it is not an option. There are some other VFS but I think avfs is the system which best fits my needs. I recently checked out the avfs CVS code and found the shared library. I already use it and it works fine so far. However, I have a few questions and hints. First of all I miss some functions. I don't know if the avfs project want to implement the fopen/fread/fclose... function class but this is not a real problem. But there is no virt_remove function. I know it is only a wrapper for unlink/rmdir but according to the man-page the remove function is mentioned in ANSI C, POSIX and other standards so it is perhaps a good idea to implement it. A real problem is the init()/destroy() stuff. AVFS is thread safe but not fork safe. In the MC VFS code there are the functions vfs_init and vfs_shut which need to be called by the programmer. AVFS takes care of this itself which is a good thing but it cause problems when forking. If a forked child quits, the destroy function is called which will delete the temporary directory (/tmp/.avfs...) The original process thinks this directory still exists and the VFS does not work properly anymore. I workaround this problem for my filemanager by using _exit for the forked childs but I think there should be a possibility to prevent the destroy call (for example explicitly remove destroy from atexit list or use a lock file in the avfs temp dir for some kind of reference counting). BTW, can avfs handle shared tmp files between forked processes at all? A solution could be the usage of pthread_atfork to recreate temporary files for the child. Then there is serious problem with the character # inside filenames. I'm aware that filenames containing for example #utar cannot be accessed but atleast files with a # and other following characters not matching any vfs extensions should be accessible. For example for a directory containing a file "test#test" readdir will return "test##test" but lstat on this name fails. The directory /#avfsstat/cache has the inode 0 but inode 0 & 1 should not be used (if "http://www.gsp.com/cgi-bin/man.cgi?section=5&topic=inode" is true for all filesystems). A function "isLocal()" would really help me as I use the filecontent for filetype recognition and it is not always a good idea to read the files in virtual directories (slow ftp or ssh). This way I can skip such files. The last release is 3 years old and I don't know about bugfixes. But is there any reason I should not use AVFS for my filemanager? Do you mind if I use the latest CVS code and put it on my homepage so the user can use this? I can implement some of the changes myself so if you accept patches I will send them to the list. After all the criticism let me say that I think avfs is a great virtual filesystem especially since it is very easy to add to a program which uses normal file operations. Best Regards, Ralf Hoffmann -- Homepage: http://www.boomerangsworld.de E-Mail: Ralf Hoffmann <ra...@bo...> english or german |
From: Miklos S. <mi...@sz...> - 2005-01-17 19:45:59
|
> I have several questions regarding avfs and I hope to find some answers > here. I want to use AVFS for my filemanager > (www.boomerangsworld.de/worker). I already tried to use the VFS from MC > but as it is not thread safe it is not an option. There are some other > VFS but I think avfs is the system which best fits my needs. > I recently checked out the avfs CVS code and found the shared library. I > already use it and it works fine so far. However, I have a few questions > and hints. > First of all I miss some functions. I don't know if the avfs project > want to implement the fopen/fread/fclose... function class but this is > not a real problem. But there is no virt_remove function. I know it is > only a wrapper for unlink/rmdir but according to the man-page the remove > function is mentioned in ANSI C, POSIX and other standards so it is > perhaps a good idea to implement it. It's easy to implement: int virt_remove(const char *path) { struct stat buf; virt_stat(path, &buf); if (S_ISDIR(buf.st_mode)) return virt_rmdir(path); else return virt_unlink(path); } I'm sure I havent thought of everything, but it shouldn't be much more complex. > A real problem is the init()/destroy() stuff. AVFS is thread safe but > not fork safe. In the MC VFS code there are the functions vfs_init and > vfs_shut which need to be called by the programmer. AVFS takes care of > this itself which is a good thing but it cause problems when forking. If > a forked child quits, the destroy function is called which will delete > the temporary directory (/tmp/.avfs...) The original process thinks this > directory still exists and the VFS does not work properly anymore. I > workaround this problem for my filemanager by using _exit for the forked > childs but I think there should be a possibility to prevent the destroy > call (for example explicitly remove destroy from atexit list or use a > lock file in the avfs temp dir for some kind of reference counting). > BTW, can avfs handle shared tmp files between forked processes at all? A > solution could be the usage of pthread_atfork to recreate temporary > files for the child. It's not designed to be fork safe unfortunately. So if it works it's purely by chance. > Then there is serious problem with the character # inside filenames. I'm > aware that filenames containing for example #utar cannot be accessed but > atleast files with a # and other following characters not matching any > vfs extensions should be accessible. For example for a directory > containing a file "test#test" readdir will return "test##test" but lstat > on this name fails. Hmm. In theory it should work. If it doesn't thats a bug. > The directory /#avfsstat/cache has the inode 0 but inode 0 & 1 should > not be used (if > "http://www.gsp.com/cgi-bin/man.cgi?section=5&topic=inode" is true for > all filesystems). > > A function "isLocal()" would really help me as I use the filecontent for > filetype recognition and it is not always a good idea to read the files > in virtual directories (slow ftp or ssh). This way I can skip such files. Something like this should work: int is_local(const char *path) { int isloc = 1; ventry *ve; res = av_get_ventry(path, 1, &ve); if (res == 0) { if (ve->mnt->base != NULL) isloc = 0; av_free_ventry(ve); } return isloc; } > The last release is 3 years old and I don't know about bugfixes. But is > there any reason I should not use AVFS for my filemanager? I didn't really have time to keep up AVFS work, but I still haven't lost the hope, that someday I can still update it. > Do you mind if I use the latest CVS code and put it on my homepage > so the user can use this? Not at all. > I can implement some of the changes myself so if you accept patches I > will send them to the list. OK, thanks. Miklos |
From: Ralf H. <ra...@bo...> - 2005-02-08 21:54:11
Attachments:
avfs-cvs-patch1.diff
|
Hi, I attached a patch which fixes some of the problems I mentioned. I describe my changes below. On 17-Jan-2005 Miklos Szeredi wrote: >> not a real problem. But there is no virt_remove function. I know it >> is >> only a wrapper for unlink/rmdir but according to the man-page the >> remove >> function is mentioned in ANSI C, POSIX and other standards so it is >> perhaps a good idea to implement it. > > It's easy to implement: > > int virt_remove(const char *path) > { > struct stat buf; > virt_stat(path, &buf); > if (S_ISDIR(buf.st_mode)) > return virt_rmdir(path); > else > return virt_unlink(path); > } I have added this to virtual.c (but by using lstat instead of stat). > It's not designed to be fork safe unfortunately. So if it works it's > purely by chance. Okay, good to know. I have some ideas to solve this or atleast workaround but I don't need this currently. >> Then there is serious problem with the character # inside filenames. > > Hmm. In theory it should work. If it doesn't thats a bug. The problem was that preprocess_name is only called for non-local files in lookup_segment but escape_magic is called for every file in av_fd_readdir. Anyway I don't like the method to escape the magic character. The user would need to enter double "#" to add a single "#" to the filename. I came up with a different solution. In parse_path I firstly ignore the magic character for each segment and test for a local file. If this fails it is repeated with the magic character taken into account. With this solution it is possible to access any local file while still be able to access most virtual files (filenames in archives with the magic character are not accessable but they are not accessable at all without avfs on the other hand :-) ). A possible solution could be the use of the stat function of the corresponding avfs struct but this could slowdown the parse_path operation. >> A function "isLocal()" would really help me as I use the filecontent >> for >> filetype recognition and it is not always a good idea to read the >> files >> in virtual directories (slow ftp or ssh). This way I can skip such >> files. > > Something like this should work: > > int is_local(const char *path) > { [snip} > } Okay, I also added this function to virtual.c I hope you can find some time to look at my changes and tell me your thoughts about it. Thanks. Best Regards, Ralf Hoffmann -- Homepage: http://www.boomerangsworld.de E-Mail: Ralf Hoffmann <ra...@bo...> english or german |
From: Miklos S. <mi...@sz...> - 2005-02-09 09:19:52
|
> I have added this to virtual.c (but by using lstat instead of stat). Yes, lstat is the right thing to use. > >> Then there is serious problem with the character # inside filenames. > > > > Hmm. In theory it should work. If it doesn't thats a bug. > > The problem was that preprocess_name is only called for non-local files > in lookup_segment but escape_magic is called for every file in > av_fd_readdir. Oh, yes. I see the problem now. It wasn't noticed before because both avfscoda with redir and ld_preload bypass libavfs for local files. > Anyway I don't like the method to escape the magic character. The user > would need to enter double "#" to add a single "#" to the filename. > I came up with a different solution. In parse_path I firstly ignore the > magic character for each segment and test for a local file. If this > fails it is repeated with the magic character taken into account. Sounds good. > With this solution it is possible to access any local file while still > be able to access most virtual files (filenames in archives with the > magic character are not accessable but they are not accessable at all > without avfs on the other hand :-) ). > A possible solution could be the use of the stat function of the > corresponding avfs struct but this could slowdown the parse_path > operation. But only if magic char is present in path. And anyway stat() is a very fast operation. However I don't understand why you do this in such a complex way in parse_path(), instead of just doing the lstat() in av_get_ventry() to determine once and for all if the path is local or not. > Okay, I also added this function to virtual.c > > I hope you can find some time to look at my changes and tell me your > thoughts about it. Seems OK. One general note: you shuld follow the coding style of the original code: - indent by 4 spaces - use if () xxx; else yyy; instead of if () xxx; else yyy; - use if (xxx) instead of if ( xxx ) I know it's hard to follow other's style but it makes readig code much harder if it is written in mixed style. Thanks, Miklos |
From: Ralf H. <ra...@bo...> - 2005-02-09 17:12:31
|
Hi, On 02/09/2005 10:19 AM, Miklos Szeredi wrote: >>With this solution it is possible to access any local file while still >>be able to access most virtual files (filenames in archives with the >>magic character are not accessable but they are not accessable at all >>without avfs on the other hand :-) ). >>A possible solution could be the use of the stat function of the >>corresponding avfs struct but this could slowdown the parse_path >>operation. > > > But only if magic char is present in path. And anyway stat() is a > very fast operation. However I don't understand why you do this in > such a complex way in parse_path(), instead of just doing the lstat() > in av_get_ventry() to determine once and for all if the path is local > or not. This was my first solution. But this way it is not possible to access an archive in a path containing the magic char (/tmp/test#/test.zip#/). lstat fails of course but the "test#" directory will break parse_path. > Seems OK. One general note: you shuld follow the coding style of the > original code: [snip] > I know it's hard to follow other's style but it makes readig code much > harder if it is written in mixed style. Good point. I will change my patch according this coding style. Thanks for your fast reply. Best Regards, Ralf Hoffmann -- Homepage: http://www.boomerangsworld.de E-Mail: Ralf Hoffmann <ra...@bo...> english or german |
From: Miklos S. <mi...@sz...> - 2005-02-09 21:02:06
|
> This was my first solution. But this way it is not possible to access an > archive in a path containing the magic char (/tmp/test#/test.zip#/). > lstat fails of course but the "test#" directory will break parse_path. OK. How about doing the lstat() up to the path component containing the magic char, and if it exists iterating until a nonexisting one is found or no more magic char is present? Thanks, Miklos |
From: Ralf H. <ra...@bo...> - 2005-02-10 10:31:41
|
Hi, On 02/09/2005 10:01 PM, Miklos Szeredi wrote: > OK. How about doing the lstat() up to the path component containing > the magic char, and if it exists iterating until a nonexisting one is > found or no more magic char is present? If I understand you correctly, basically my code does this already. Until a path component contains the magic char, no lstat is executed. For this component the code will check the existence and continue to the next component containing the magic char (again no lstat is used for "clean" segments). For the more common case of accessing local files without a magic char, lstat is not used at all. If you mean to do this test outside parse_path than I think this should be also okay. Something like this (pseudocode) could work: name = ps->path for(;;) { while ( *name && *name != '#' ) name++; while ( *name && *name != '/' ) name++; *name = 0; if ( lstat( name ) != 0 ) break; } initial_seglen = name - ps->path; return parse_path( ps, initial_seglen ) But on the other hand I don't see any benefit from this solution. Also my code can be extented to allow for supporting files containing # inside archives if the lstat of the corresponding avfs struct is used (ps->mnt->avfs->lstat()?). BTW, I'm subscribed to the list so there is no need to CC me. Best Regards, Ralf Hoffmann -- Homepage: http://www.boomerangsworld.de E-Mail: Ralf Hoffmann <ra...@bo...> english or german |
From: Miklos S. <mi...@sz...> - 2005-02-10 17:08:55
|
> If you mean to do this test outside parse_path than I think this should > be also okay. > Something like this (pseudocode) could work: > > name = ps->path > for(;;) { > while ( *name && *name != '#' ) name++; > while ( *name && *name != '/' ) name++; > *name = 0; > if ( lstat( name ) != 0 ) > break; > } > initial_seglen = name - ps->path; > return parse_path( ps, initial_seglen ) > > But on the other hand I don't see any benefit from this solution. It's simpler, which is a great benifit :) > Also my code can be extented to allow for supporting files > containing # inside archives if the lstat of the corresponding avfs > struct is used (ps->mnt->avfs->lstat()?). Inside archives the escaping will work. I agree that escaping has it's drawbacks, on the other hand # is not a very often used charater in filenames (I know emacs and CVS use it for intrenal purposes). Thanks, Miklos |
From: Ralf H. <ra...@bo...> - 2005-02-10 21:45:49
|
Hi, On 10-Feb-2005 Miklos Szeredi wrote: >> But on the other hand I don't see any benefit from this solution. >=20 > It's simpler, which is a great benifit :) Agreed. I implemented this and it looks like it also works. Although it is simplier, this code does not allow for later extension to test virtual files the same way (example: /tmp/test#/test.zip#/test#/x) And it is not much simplier (10 lines less code ignoring comments). I'm currently not sure which one I like more. >> Also my code can be extented to allow for supporting files >> containing # inside archives if the lstat of the corresponding avfs >> struct is used (ps->mnt->avfs->lstat()?). >=20 > Inside archives the escaping will work. I agree that escaping has > it's drawbacks, on the other hand # is not a very often used charater > in filenames (I know emacs and CVS use it for intrenal purposes). My patch deactivate escaping at all (there is a ifdef ESCAPE_MAGIC to trigger this)! It would be harder to only escape magic chars for virtual files (/tmp/test#/test.zip#/test##/x). By extenting my solution to use stat also for virtual file, this could be solved but from my point of view it is acceptable not to be able to access such virtual files. Best Regards, Ralf Hoffmann --=20 Homepage: http://www.boomerangsworld.de E-Mail: Ralf Hoffmann <ra...@bo...> english or german |
From: Miklos S. <mi...@sz...> - 2005-02-12 07:57:09
|
> >> But on the other hand I don't see any benefit from this solution. > > > > It's simpler, which is a great benifit :) > > Agreed. I implemented this and it looks like it also works. Although it > is simplier, this code does not allow for later extension to test > virtual files the same way (example: /tmp/test#/test.zip#/test#/x) I'm not against getting rid of escaping. But then let's get rid of the ifdefs and make it work for all cases. > And it is not much simplier (10 lines less code ignoring comments). > I'm currently not sure which one I like more. > > >> Also my code can be extented to allow for supporting files > >> containing # inside archives if the lstat of the corresponding avfs > >> struct is used (ps->mnt->avfs->lstat()?). > > > > Inside archives the escaping will work. I agree that escaping has > > it's drawbacks, on the other hand # is not a very often used charater > > in filenames (I know emacs and CVS use it for intrenal purposes). > > My patch deactivate escaping at all (there is a ifdef ESCAPE_MAGIC to > trigger this)! It would be harder to only escape magic chars for > virtual files (/tmp/test#/test.zip#/test##/x). Yes, that's rather ugly, though avfscoda, and LD_PRELOAD did this, by completely bypassing avfs for local files. > By extenting my solution to use stat also for virtual file, this > could be solved but from my point of view it is acceptable not to be > able to access such virtual files. I understand. Please do as you see fit, but I'd rather only include a proper solution to this problem (with no ifdefs). Some specific details on how you can improve the readability of detecting files with magic char inside: - make this a separate function (skip_existing_magic(), whatever) - don't try to optimize away memory allocation for small buffers, it's really not worth the complexity. - in fact I think you can get away without a temporary buffer, since the original path buffer can be modified (insert a null charater at the right position). My feeling is that making it work for local and virtual files should make it less complex not more so. But of course I may be wrong. Thanks, Miklos |
From: Ralf H. <ra...@bo...> - 2005-02-14 20:42:37
|
Hi, On 12-Feb-2005 Miklos Szeredi wrote: > My feeling is that making it work for local and virtual files should > make it less complex not more so. But of course I may be wrong. I changed the code so segment_len will now check with another function is the next segment describes a local file. This way parse_path is almost untouched. I currently trying to add the test to virtual files to only enter the next avfs hierarchy if a file containing the magic char is not present in the current avfs mount. Is there a easy way to call av_file_getattr? It looks like I need a vfile which itself needs a ventry which the function currently tries to build. If it is possible to call some stat function for the current avfs (ps->ve->mnt->avfs) and only giving the rest of ps->path then it is easy to add this test to the new function segment_islocal. If this is not possible then I would go with your solution and just testing at the beginning of parse_path for local files. Although we cannot test for symlinks (remember followsymlinks code in parse_path) the lstat will fail for the first symlink to a virtual file. Of course this way we cannot support the magic char in virtual files. MC itself also doesn't support magic characters in virtual filename although it requires complete vfs name (#utar... instead of # shortcut). If we can find a way to call avfs->stat we can support this but it wouldn't hurt much if not. Best Regards, Ralf Hoffmann --=20 Homepage: http://www.boomerangsworld.de E-Mail: Ralf Hoffmann <ra...@bo...> english or german |
From: Miklos S. <mi...@sz...> - 2005-02-15 10:15:07
|
> I changed the code so segment_len will now check with another function > is the next segment describes a local file. This way parse_path is > almost untouched. I currently trying to add the test to virtual files > to only enter the next avfs hierarchy if a file containing the magic > char is not present in the current avfs mount. > Is there a easy way to call av_file_getattr? It looks like I need a > vfile which itself needs a ventry which the function currently tries to > build. Tricky. How about leaving the test till the next segment: so instead of blindly accepting the magic char, at the beginning of the segment, do a stat on the file which is the current ps->ve plus from the magic char up to the next slash. This way you already hold the correct ventry. This means that you'd have to modify lookup_segment() instead of segment_len(). Does this sound solvable? > If this is not possible then I would go with your solution and just > testing at the beginning of parse_path for local files. Although we > cannot test for symlinks (remember followsymlinks code in parse_path) > the lstat will fail for the first symlink to a virtual file. Hmm. Not following links to virtual files was one of the shortcomings of avfscoda and ld_preload, so yes it would be nice if this would also work. > If we can find a way to call avfs->stat we can support this but it > wouldn't hurt much if not. Agreed. Thanks, Miklos |
From: Ralf H. <ra...@bo...> - 2005-02-15 21:00:00
|
Hi, On 15-Feb-2005 Miklos Szeredi wrote: >> I changed the code so segment_len will now check with another >> function >> is the next segment describes a local file. This way parse_path is >> almost untouched. I currently trying to add the test to virtual >> files >> to only enter the next avfs hierarchy if a file containing the magic >> char is not present in the current avfs mount. >> Is there a easy way to call av_file_getattr? It looks like I need a >> vfile which itself needs a ventry which the function currently tries >> to >> build. >=20 > Tricky. How about leaving the test till the next segment: so instead > of blindly accepting the magic char, at the beginning of the segment, > do a stat on the file which is the current ps->ve plus from the magic > char up to the next slash. This way you already hold the correct > ventry. This means that you'd have to modify lookup_segment() > instead > of segment_len(). Does this sound solvable? Do I understand you correctly that you want to test for real file in lookup_segment before calling lookup_avfs? This could be a good solution. For the example "/tmp/test#test/x" we have a valid ventry for "/tmp/test" and now trying to lookup "#test". We need to get "#test" into the ventry. With this ventry I could call av_file_open. I think the result should be enough information, stat is not really needed. If it works there is a local file and we know to ignore the magic. This also works when testing for such special files in virtual files. On the other hand this wouldn't work for "test.tar.gz#ugz#something". lookup_segment wouldn't find a valid file for "test.tar.gz#ugz" so it would call lookup_avfs for #ugz. But we could test just before lookup_segment() in parse_path() using the rest of ps->path up to the slash and if it fails we could continue as before. To temporary add a segment to the ventry, can I copy the ventry and call lookup_virtual, test the existence (using av_file_open), then delete the copy and repeat with taking the magic into account? My current thought is something like this: l =3D segment_len(ignore_magic =3D 1) ps->path[l] =3D 0 av_copy_ventry(ps->ve, &tempve) ps->ve =3D tempve; lookup_virtual( ps ); if(!av_file_open(&vf, ps->ve,...)) { free_ventry(ps->ve), ps->ve =3D oldve; l =3D segment_len( ignore_magic =3D 0 ) } else { av_file_close() delete oldve (or redo the same with the original ve) } [continue as before] This code doesn't even care about local or virtual files. It just test for existence before entering the next level of avfs. If you think this should work, I will try to implement it. Best Regards, Ralf Hoffmann --=20 Homepage: http://www.boomerangsworld.de E-Mail: Ralf Hoffmann <ra...@bo...> english or german |
From: Miklos S. <mi...@sz...> - 2005-02-18 10:11:25
|
> Do I understand you correctly that you want to test for real file in > lookup_segment before calling lookup_avfs? This could be a good > solution. For the example "/tmp/test#test/x" we have a valid ventry for > "/tmp/test" and now trying to lookup "#test". We need to get "#test" > into the ventry. With this ventry I could call av_file_open. I think the > result should be enough information, stat is not really needed. If it > works there is a local file and we know to ignore the magic. This also > works when testing for such special files in virtual files. > > On the other hand this wouldn't work for "test.tar.gz#ugz#something". > lookup_segment wouldn't find a valid file for "test.tar.gz#ugz" so it > would call lookup_avfs for #ugz. The really general solution would be to try appending the segments until a '/' is found (in which case it's virtual) or until one file did exist. In fact to be perfectly correct, the testing should start from the whole path segement up to the slash, and if it doesn't exist go backwards removing #foo until either no more # is found or one file did exist. Does this sound too difficult? > l = segment_len(ignore_magic = 1) > ps->path[l] = 0 > av_copy_ventry(ps->ve, &tempve) > ps->ve = tempve; > lookup_virtual( ps ); > if(!av_file_open(&vf, ps->ve,...)) { > free_ventry(ps->ve), > ps->ve = oldve; > l = segment_len( ignore_magic = 0 ) > } else { > av_file_close() > delete oldve (or redo the same with the original ve) > } > [continue as before] > > This code doesn't even care about local or virtual files. It just test > for existence before entering the next level of avfs. > > If you think this should work, I will try to implement it. Something like that should work IMO. Thanks, Miklos |
From: Ralf H. <ra...@bo...> - 2005-02-18 12:33:03
|
Hi, On 02/18/2005 11:11 AM, Miklos Szeredi wrote: >>Do I understand you correctly that you want to test for real file in >>lookup_segment before calling lookup_avfs? This could be a good >>solution. For the example "/tmp/test#test/x" we have a valid ventry for >>"/tmp/test" and now trying to lookup "#test". We need to get "#test" >>into the ventry. With this ventry I could call av_file_open. I think the >>result should be enough information, stat is not really needed. If it >>works there is a local file and we know to ignore the magic. This also >>works when testing for such special files in virtual files. >> >>On the other hand this wouldn't work for "test.tar.gz#ugz#something". >>lookup_segment wouldn't find a valid file for "test.tar.gz#ugz" so it >>would call lookup_avfs for #ugz. > > > The really general solution would be to try appending the segments > until a '/' is found (in which case it's virtual) or until one file > did exist. In fact to be perfectly correct, the testing should start > from the whole path segement up to the slash, and if it doesn't exist > go backwards removing #foo until either no more # is found or one file > did exist. Does this sound too difficult? I already implemented the code I mentioned and it looks good. I still need some tests to be sure everything is okay especially for the undo step. But atleast I can access files like "test#" inside archives. I will think about your idea to go back until an existing file is found, it shouldn't be hard to add it to the code. Although I think it is acceptable to not be able to enter archives with the magic char in the filename, a general solution is much better. Best Regards, Ralf Hoffmann -- Homepage: http://www.boomerangsworld.de E-Mail: Ralf Hoffmann <ra...@bo...> english or german |