Tcl's path management code has a
fundamental feature that each path
belongs to exactly one filesystem.
The way this is implemented is that
Tcl iterates over all registered filesystems
in reverse order of their registration and
asks each one "does this file belong to you?"
(via a call to the FS's pathInFilesystemProc).
Each filesystem is allowed to answer YES
or NO, and the first one that answers YES
is declared the unique owner of that path.
The native filesystem is registered first, so
it is queried last, and always answers YES,
so the fallback filesystem for unclaimed path
values is the native filesystem.
All operations on a path will be passed
to its owning filesystem, and after any
shimmering, that owning filesystem will
be the one that gets to create the refreshed
internal rep, suitable only for that filesystem.
This feature is what makes the public
routine Tcl_FSGetFileSystemForPath(pathPtr)
make sense. You pass in a path, and get back
the unique filesystem that owns that path.
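The lookup described above can be modeled in a few lines. This is only a toy Python model, not Tcl's C implementation: the Filesystem class, the claims callback (standing in for pathInFilesystemProc), and the "http" filesystem are all made up for illustration.

```python
# Toy model of the ownership lookup; not Tcl's actual implementation.
class Filesystem:
    def __init__(self, name, claims):
        self.name = name
        self.claims = claims  # stand-in for pathInFilesystemProc


registered = []  # kept in registration order


def register(fs):
    registered.append(fs)


def owner_of(path):
    # The most recently registered filesystem is asked first.
    for fs in reversed(registered):
        if fs.claims(path):
            return fs
    raise LookupError(path)  # not reached while "native" is registered


# The native filesystem is registered first and claims everything.
register(Filesystem("native", lambda path: True))
# Hypothetical second filesystem that claims http://-rooted paths.
register(Filesystem("http", lambda path: path.startswith("http://")))

print(owner_of("http://example.invalid/index.html").name)
print(owner_of("/tmp/foo").name)
```

In this model every path gets exactly one owner, and which one it is depends only on the answers and the registration order.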
This feature imposes several limitations
that are unattractive.
1) Impossible to implement "stacked" filesystems.
One could imagine a filesystem that claimed all
paths, did some kind of logging, or other filtering
on the operations, and then passed along to an
appropriate "real" filesystem to do the actual work.
It's simple enough to create a new filesystem that
claims all paths and does the filtering work, but
things would break as soon as it attempted to pass
the path along to the real filesystem. When the
real filesystem tried to do operations on the path,
the path would be recognized as belonging to the
stacked filesystem, so operations would get passed
back up to the filtering layer again. The real filesystem
would not be able to create/fetch its own internal rep
for the path, because the filtering filesystem's claim
on the path would keep producing the filtering
filesystem's internal rep instead.
One might imagine the filtering filesystem releasing
its claim on the path before it passes it down to the
real filesystem (change internal state, so the filtering
filesystem no longer says YES to this particular path),
but there doesn't seem to be any way to do that in
a thread-safe manner. If the filtering filesystem
in Thread A abandons its claim on a path to pass it
on to the real filesystem, then in the meantime operations
on the same path in Thread B bypass the filtering operation.
Another difficulty is that a filtering filesystem could
only work if it were registered after the filesystem(s)
it is filtering. For filtering the native filesystem, this
might work, as long as the rule that the native filesystem
is registered first persists, but filtering of other
filesystems would not be robust, as the order of filesystem
registration is essentially impossible to control
in a multi-threaded program.
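The loop that defeats a filtering filesystem can be reproduced in a toy model. This is a Python sketch under stated assumptions, not Tcl code: Native, Logger, file_stat, and owner_of are hypothetical names, and file_stat stands in for any operation that Tcl dispatches to a path's owner.

```python
registered = []  # registration order; asked in reverse


def owner_of(path):
    for fs in reversed(registered):
        if fs.claims(path):
            return fs


def file_stat(path):
    # Every operation is dispatched to the path's single owner.
    return owner_of(path).stat(path)


class Native:
    def claims(self, path):
        return True

    def stat(self, path):
        return "native stat of " + path


class Logger:
    """Hypothetical filtering filesystem that claims all paths."""

    def __init__(self):
        self.log = []

    def claims(self, path):
        return True

    def stat(self, path):
        self.log.append(path)
        # Intended to reach the "real" filesystem, but the lookup
        # hands the path straight back to this filesystem.
        return file_stat(path)


native, logger = Native(), Logger()
registered.extend([native, logger])

try:
    result = file_stat("/tmp/foo")
except RecursionError:
    result = None  # the native filesystem was never reached

print(owner_of("/tmp/foo") is logger)
print(result)
```

The filter logs the same path over and over until the interpreter gives up, because nothing in the lookup lets it name the layer beneath it.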
2) Impossible to implement mountable archive files.
Imagine an archive file in some filesystem:
/path/to/archive.ar
[file system /path/to/archive.ar] will return the
filesystem in which that archive file is stored and
[file type /path/to/archive.ar] will return "file". Then
imagine mounting that archive with the mount command
appropriate for the filesystem in that archive:
fs::mount /path/to/archive.ar
The desire is that [file system /path/to/archive.ar] will
return "fs" and [file type /path/to/archive.ar] will return
"directory", and it will now be possible to access the
contents of the archive as virtual files, as in:
open /path/to/archive.ar/internal/foo.bar
As in the first case, this will founder because it
depends on the single path /path/to/archive.ar
being able to belong to two filesystems at once.
The internal operations of filesystem "fs" will
need to be defined as operations of the original
filesystem on the archive file, but Tcl will keep
insisting that "fs" is the only filesystem that can
operate on that path.
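A toy model makes the change of ownership visible. This is a Python sketch, not Tcl code: owner_of and the claim functions are hypothetical, and appending the "fs" entry stands in for whatever fs::mount would do.

```python
registered = []  # registration order; asked in reverse


def owner_of(path):
    for name, claims in reversed(registered):
        if claims(path):
            return name


ARCHIVE = "/path/to/archive.ar"

registered.append(("native", lambda path: True))
before = owner_of(ARCHIVE)

# Assumed effect of the mount: register a filesystem that claims
# the archive path and everything beneath it.
registered.append(
    ("fs", lambda path: path == ARCHIVE or path.startswith(ARCHIVE + "/"))
)
after = owner_of(ARCHIVE)
inside = owner_of(ARCHIVE + "/internal/foo.bar")

print(before, after, inside)
```

Once the archive path itself resolves to "fs", any attempt by "fs" to read the archive's bytes through the same lookup is handed back to "fs" rather than to the filesystem that stores the file.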
"But wait!" you say. Doesn't Tclkit/Starkit/mkfs do
this? Well, yes it does, but it manages to do it
by having its access to the underlying filesystem not
pass back through Tcl. (Correct me if I'm wrong)
By not going back to Tcl to do the lower level
operations, it can avoid Tcl's insistent assignment
of the archive path to the vfs layer. However, this
also means that only native files that are accessible
without going back through Tcl are able to be archive
files. This means no nesting of archives, and it means
that the illusion that virtual files and native files are
equivalent is incomplete. I'm pretty sure (again,
someone explain if I'm mistaken) that this means
it's not possible to mount such an archive remotely
(within an HTTP or FTP virtual filesystem).
3) Impossible to mount one FS within another.
I think this is probably the most significant limitation.
Each filesystem gets one chance to say YES or NO
to owning a path. If the path points to an existing
file in the filesystem, then YES is the easy answer.
If the (normalized) path is completely outside a
filesystem's mount points, then NO is an equally
easy answer. The remaining cases are tricky.
Say that the path
/foo/bar/soom/example
is to be tested, and my filesystem holds the
mountpoint /foo/bar . Should my filesystem say YES?
Several cases:
/foo/bar/soom/example exists in my FS -> YES
(modulo the mountable archive problem already noted)
/foo/bar/soom is not a directory in my FS -> NO
Otherwise -> ????
In the final case /foo/bar/soom is my directory, but
/foo/bar/soom/example does not (yet) exist in my
FS. It seems my FS should answer YES, because
if that file were to be created, it would involve
writing to a directory which is mine.
However, it's also possible that
/foo/bar/soom/example is a mountpoint for
another filesystem. If I answer YES, then that
filesystem will never get its chance to claim its
mountpoint. On the other hand, if I answer NO,
and there is no such other filesystem that claims
that mountpoint, then attempts to create that
path will be routed to the native FS, when they
ought to be routed to me. The only way to avoid
this dilemma is to not allow nested FS mounting.
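The dilemma can be played out in a toy model. This is a Python sketch under stated assumptions: "outer" is my filesystem holding /foo/bar, "inner" is a hypothetical filesystem mounted at /foo/bar/soom/example, and answer_for_missing is the answer "outer" gives in the "????" case.

```python
def owner_of(registered, path):
    # registered is in registration order; ask in reverse.
    for name, claims in reversed(registered):
        if claims(path):
            return name


def outer(answer_for_missing):
    existing = {"/foo/bar", "/foo/bar/soom"}  # directories in "outer"

    def claims(path):
        if path in existing:
            return True
        parent = path.rsplit("/", 1)[0]
        if parent in existing:
            return answer_for_missing  # the "????" case
        return False

    return ("outer", claims)


native = ("native", lambda path: True)
inner = ("inner", lambda path: path.startswith("/foo/bar/soom/example"))
path = "/foo/bar/soom/example"

# YES, and "outer" registered after "inner": "inner" is shut out.
yes_outer_last = owner_of([native, inner, outer(True)], path)
# YES, and "outer" registered before "inner": works, by luck of order.
yes_outer_first = owner_of([native, outer(True), inner], path)
# NO, and no "inner" exists: the path falls through to native.
no_without_inner = owner_of([native, outer(False)], path)

print(yes_outer_last, yes_outer_first, no_without_inner)
```

Neither answer is right in every configuration, and the one arrangement that works depends on registration order.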
As noted before, filesystem registration order
cannot really be controlled to avoid this issue.
The exception is the native filesystem, which is
always queried last. This means the most common case,
adding mounts within the native filesystem, works,
even though as a general operation it's fundamentally
broken. Another common case, adding mount points
outside of any existing filesystem, like http:// -rooted
paths, also works fine.
I'm assigning this report first to Andreas since
he expressed interest in these issues. It should
get passed along to Vince Darley as well. I'm
registering this as a bug, even though it's arguably
a feature request, because if these limitations are
to continue, they should be more clearly spelled
out in the documentation.
Just to record a few half-thoughts on an alternative,
it seems that mountpoints are the special cases that
get in the way of the current scheme. Perhaps if
the (pathInFilesystemProc)s had a richer set of
answers, things could be improved. Rather than a
simple YES (return TCL_OK) or NO (return -1),
some other answers might be "THAT'S MY MOUNT"
or "I COULD TAKE THAT IF NO ONE ELSE DOES".
The current docs clearly leave other return values
undefined, so they could be used for such an
expansion. The stacked filesystems problem seems
to be more difficult.
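One way the richer answers might be resolved is sketched below. Everything here is an assumption made for illustration and is not an existing Tcl API: the Answer values, the rule that a definite claim beats a tentative one, and the choice to have the native filesystem answer tentatively instead of YES.

```python
from enum import Enum


class Answer(Enum):
    NO = 0
    YES = 1
    MY_MOUNT = 2       # "THAT'S MY MOUNT"
    IF_UNCLAIMED = 3   # "I COULD TAKE THAT IF NO ONE ELSE DOES"


def owner_of(registered, path):
    # Assumed rule: the first definite claim wins; otherwise the first
    # tentative claim seen (most recently registered) is the fallback.
    fallback = None
    for name, answer in reversed(registered):
        reply = answer(path)
        if reply in (Answer.YES, Answer.MY_MOUNT):
            return name
        if reply is Answer.IF_UNCLAIMED and fallback is None:
            fallback = name
    return fallback


MOUNT = "/foo/bar/soom/example"
native = ("native", lambda p: Answer.IF_UNCLAIMED)
# Simplification: "outer" treats every path under /foo/bar/ as
# not yet existing, so it only ever answers tentatively.
outer = ("outer", lambda p: Answer.IF_UNCLAIMED
         if p.startswith("/foo/bar/") else Answer.NO)
inner = ("inner", lambda p: Answer.MY_MOUNT if p == MOUNT else Answer.NO)

print(owner_of([native, inner, outer], MOUNT))
print(owner_of([native, outer, inner], MOUNT))
print(owner_of([native, outer], MOUNT))
```

Under these assumptions the nested mountpoint goes to "inner" whichever of the two non-native filesystems registered first, and to "outer" when no "inner" exists. It does nothing for the stacked filesystems problem.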
Andreas Kupries
37. File System
obsolete: 8.5a2
Public
Date: 2004-06-03 17:30
Date: 2004-05-04 21:01
Date: 2004-04-26 15:05
| Field | Old Value | Date | By |
|---|---|---|---|
| summary | path<->FS function limitations | 2004-06-03 17:30 | dgp |