relfs-devel Mailing List for Relational Filesystem (Page 3)
Status: Pre-Alpha
Brought to you by:
applejack
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-14 09:13:36
|
There are two issues with the way one should create data on the database:

1. If I record attributes of a file on the db, the user is (not without reason) tempted to change these attributes directly on the db, e.g. to fix a typo in the "author" tag of hundreds of mp3s.
2. If a plugin uses attributes of another plugin, they must be called in the right order.

A possible solution, which pushes hard towards the decision of having a library to create data structures on the database instead of allowing plugins to create tables (and this makes one think that having a separate user at least for groups of plugins is a good idea), is to create triggers on the database for attributes "owned" by plugins, both on read and on write:

- Read triggers: when anybody tries to read an attribute, the database should check conditions, declared by plugins and chosen from a limited range (like "when the first 100 bytes of the file have changed"), which might have invalidated the attribute, and in that case run the appropriate indexer. This solves the dependency issue, but NOT if there are circular dependencies, which are a bad thing anyway. Several ways to avoid circular dependencies come to mind, but none is free from trouble.
- Write triggers: when anybody tries to write an attribute (except the owner of the attribute, which makes the choice "a db user for each plugin" an interesting concept), the database should check whether there is a plugin which has declared itself able to write that attribute, or else return an error.

Conclusion: we will have to implement a library, possibly easy to call from C and so "SWIG"able, which allows one to declare "external types". Types should have:

- attributes with "DB primitive" types, which are translated to columns of a table with the same name as the type;
- attributes with "external" types, which are translated using foreign keys;
- declarations of writability for attributes, and what to do in case of a write.
Types should be inserted into suitable database schemas to avoid naming conflicts, and which db schema to use might be chosen either by letting the plugin declare it, or, as beerfun said, using a fixed set of rules (like "what mimetypes do you match", but not limited to mimetypes) that group plugins together in a more automated way.

A last note concerns "equalization" of attributes, i.e. what if two plugins use different names for the same thing, and the user notices this? She should be able to "equalize" the names, but I am unsure of the implications of equalization, even if the semantics of name passing is one of my favourite topics :) However, we should see if this can be worked on easily, because that would be a great advantage.

Someone willing to criticize this architecture? This will be implemented in September because I'll be on vacation in the next two or three weeks.

Bye
Vincenzo |
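A minimal sketch of what such a type-declaration library might look like in C++ (every name here is hypothetical; nothing in the relfs tree is assumed): a plugin declares an "external type" as a list of attributes, each either a DB primitive or a reference to another declared type, and the library renders the corresponding CREATE TABLE statement with foreign keys.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: a declared attribute is either a DB primitive
// (rendered as a plain column) or a reference to another external type
// (rendered as a foreign key into that type's table).
struct Attribute {
    std::string name;
    std::string type;  // e.g. "TEXT", "INTEGER", or another type's name
    bool external;     // true -> foreign key to table `type`
};

struct ExternalType {
    std::string name;             // becomes the table name
    std::vector<Attribute> attrs;

    // Render the CREATE TABLE statement this declaration maps to.
    std::string to_sql() const {
        std::ostringstream sql;
        sql << "CREATE TABLE " << name << " (id SERIAL PRIMARY KEY";
        for (const Attribute& a : attrs) {
            if (a.external)
                sql << ", " << a.name << " INTEGER REFERENCES "
                    << a.type << "(id)";
            else
                sql << ", " << a.name << " " << a.type;
        }
        sql << ");";
        return sql.str();
    }
};
```

An mp3 plugin declaring a "song" type with an external "album" attribute would thus get a table with a foreign key into the album table, without ever writing SQL itself.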
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-13 21:39:33
|
On Friday 13 August 2004 08:09, co...@li... wrote:

> this project came to my mind as we are working on pretty much the same idea. Well, not completely. So I thought that our projects could stay in contact, get some ideas from each other and help out a little bit.
>
> JDBFS will be a java based (100% java) indexing system with an embeddable database to store rich metainformation from mp3, ogg, movies, pdfs, images and other content-storing filetypes. We have working plugins (from GPL projects) for mp3 and ogg. PDF is on the hook right now.

Where is your project? If it is the one at the sun site, is it free software? I can't access the page (I even registered). However, if you have a plugin interface and have chosen a data model, I don't see why in principle I couldn't use the same plugin format and the same data model - how are you storing data? Is your embedded DB a relational database? In that case, maybe we both could abstract database access from plugins enough to have your plugins work in relfs, and maybe the contrary (but would you like plugins written in languages other than java?). Do you have documentation about your data model and plugin architecture?

Bye
Vincenzo |
|
From: <co...@li...> - 2004-08-13 08:09:50
|
Hi,

this project came to my mind as we are working on pretty much the same idea. Well, not completely. So I thought that our projects could stay in contact, get some ideas from each other and help out a little bit.

JDBFS will be a java based (100% java) indexing system with an embeddable database to store rich metainformation from mp3, ogg, movies, pdfs, images and other content-storing filetypes. We have working plugins (from GPL projects) for mp3 and ogg. PDF is on the hook right now.

The goal is to help the user organise his data, so that he can just work with his data, instead of organising it.

regards,
Bernd |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-12 21:59:22
|
Just a couple of questions about the data model:

1. beerfun at users dot sf dot net has suggested, if I don't misinterpret her, using a db schema where plugins can find their data somewhat by the type of data they address: a classification could be made allowing them to declare which mimetypes, from a standard list, they index, and to get in exchange from the system access to tables specific to that mimetype. The idea could be to exploit the multiple schema feature of postgres to have a namespace for each used mimetype, a public namespace to share common information and a private namespace for each plugin. Would this be overkill for queries?

2. It might be interesting to provide applications with an interface to create types on the database, where each type is a tuple of either basic sql types or other types, and is translated to a table with columns for attributes of basic types and foreign keys for attributes of other user-defined types. But perhaps it's just simpler and more powerful to leave the raw db interface to the application. Someone willing to suggest tools and libraries to type a database which can be used with C++ and are relatively easy to use? I don't mean a full object-relational mapping with object persistence, just a typed way to declare tables.

3. And what about data protection? Would it be worthwhile to have a DB user for each plugin, with ACLs set right to access shared tables, or is it more natural to allow each plugin to access and modify the whole data hierarchy?

Bye
Vincenzo |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-08 16:26:22
|
We now have plugins working in another thread, a threaded fuse mainloop, and doxygen documentation for all headers, for developers' pleasure.

RelFS - Relational Filesystem
-----------------------------

* sf.net project description: RelFS is a linux userspace "shadow" (file data remains on disk) relational filesystem using fuse and an SQL database to store metadata. Directories can represent queries, and powerful features (e.g. bayesian classification) are added through plugins.

We already have a working prototype available, and are in active development. The website for development and releases is http://www.sourceforge.net/projects/relfs while the homepage, where you can find the FAQs, is at http://relfs.sourceforge.net

At the moment we don't have user releases, since we are at an early stage of development. We are looking for developers and ideas on how to use the beast. You can subscribe to the mailing list that you will find on the sf.net project page. Be sure to read the FAQ at http://relfs.sourceforge.net/FAQ.html which includes motivation, goals and differences from existing projects. And, quoting from the FAQ:

# This is a very ambitious project, you won't even get to beta!

I would like to use an incremental approach, in which we first stabilize over a small goal, then improve the performance of the program, then we start working on another goal. The first goal, for example, is to obtain a working indexing filesystem with an easy to use plugin architecture. We are already close to that. See TODO.txt, section "RELEASE PLAN/MILESTONES".

Bye
Vincenzo Ciancia |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-03 21:33:13
|
I have carefully audited the source code of RelFS.cpp. It should be exception-safe now, but who can be sure - BTW I also checked the LOCAL macro, so Guido should be happy now :P (it couldn't be path+1 because of "/" being only 1 char long, but it's better now). Could someone have a look at the file?

A question: would it be wise to catch posix signals in the fuse module? And how should I handle them? By setting a global variable, and checking it at the end of each filesystem function?

I committed changes to the code yesterday; today I added the FAQs for the project to the homepage. The next thing to do, IMHO, is putting the MainIndexer singleton in a separate thread - objections, anyone? After that I plan to document the existing code using doxygen.

Bye
Vincenzo |
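The global-variable approach floated in that question might be sketched like this (a hypothetical outline, not relfs code): the handler only stores into a `volatile sig_atomic_t` flag, which each filesystem operation checks on its way out.

```cpp
#include <csignal>

// Hypothetical sketch of the "set a flag, check it later" scheme:
// a signal handler may only safely touch a volatile sig_atomic_t,
// so it records the signal and the main code polls the flag.
volatile std::sig_atomic_t g_pending_signal = 0;

extern "C" void handle_signal(int signo) {
    g_pending_signal = signo; // async-signal-safe: one atomic store
}

// Checked at the end of each filesystem function.
bool shutdown_requested() {
    return g_pending_signal == SIGINT || g_pending_signal == SIGTERM;
}

void install_handlers() {
    std::signal(SIGINT, handle_signal);
    std::signal(SIGTERM, handle_signal);
}
```

The filesystem functions stay signal-free; only the polling point decides when to unwind, which sidesteps the question of what happens if a signal lands mid-query.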
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-03 21:22:09
|
On Tuesday 03 August 2004 00:09, Guido Villa wrote:

> The straightforward way is to treat "rename a b" as "unlink b; rename a b", but are we sure it is the preferred meaning?
>
> Maybe, after choosing a meaning for rename, we could implement the other two in separate programs...

That's a very good question. I would like consistent behaviour when files outside the filesystem are renamed to files inside it, so I suppose that a good answer would be that metadata of the old file is not attached to the new file. This is the right thing when metadata depends only on the contents of files - unfortunately this condition will not always hold.

I suppose that a good solution is to consider the file path only an attribute of files, which have an identity property of their own. In this case, we can both have metadata attached to a file name (such as comments attached by a user to /etc/shadow to remember what it is for, which should stay attached to that file even when it is zeroed or squashed with another file by the user's brother/sister) and metadata attached to a file's contents, where appropriate. Usability concerns arise, but I guess that we should use this feature only when it's obvious that we need to. However, considering that the target file in a rename operation would be deleted, and that this would mean "move to trash", metadata attached to the destination file's contents would still be preserved.

> BTW: while you implement renaming, please bear in mind that if the target exists no unlink system call is made, so you have to stat the target before renaming, and then manually manage its deletion side effects. It's a thing one might easily overlook.

I will take care of it, thanks.

Vincenzo
--
First they ignore you, then they laugh at you, then they fight you, then you win. [Gandhi] |
|
From: Guido V. <gu...@vi...> - 2004-08-02 22:09:54
|
I want to share a question that came to my attention only a few days ago. This is unrelated to what you want to do with the files; it is more related to the meaning that renaming a file has in filesystems that try to manage data somehow.

I am not speaking of the trivial case of simply changing a file name, but of the trickier one where, by changing a file name, you delete another file. In this case, what has to happen to the metadata (whatever it is, entries in a database or symbolic links pointing to that file)? Maybe the user wants to change the file's content but not the metadata (as if "rename a b" were interpreted as "cat a > b; unlink a"), maybe it's the contrary (as if "rename a b" were interpreted as "unlink b; rename a b"). Maybe he would like to have the metadata from the two files merged... What do you think? The straightforward way is to treat "rename a b" as "unlink b; rename a b", but are we sure it is the preferred meaning?

Maybe, after choosing a meaning for rename, we could implement the other two in separate programs...

BTW: while you implement renaming, please bear in mind that if the target exists no unlink system call is made, so you have to stat the target before renaming, and then manually manage its deletion side effects. It's a thing one might easily overlook.

bye
Guido |
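The missing-unlink pitfall can be sketched as follows (hypothetical code, not the relfs implementation; `on_file_destroyed` is an invented metadata hook): stat the target before calling rename(2), and if it existed, handle its metadata explicitly, since no unlink call will ever reach the filesystem layer.

```cpp
#include <cstdio>
#include <string>
#include <sys/stat.h>

// Hypothetical metadata hook: whatever the filesystem decides rename
// means for the metadata of a file it implicitly destroys.
void on_file_destroyed(const std::string& path) {
    // e.g. move its db metadata to trash instead of deleting it
    std::printf("metadata: %s destroyed by rename\n", path.c_str());
}

int do_rename(const std::string& from, const std::string& to) {
    struct stat st;
    // rename(2) silently replaces an existing target; no unlink
    // system call is made, so we must detect the case ourselves.
    bool target_existed = (::stat(to.c_str(), &st) == 0);
    if (::rename(from.c_str(), to.c_str()) != 0)
        return -1;
    if (target_existed)
        on_file_destroyed(to); // manage the deletion side effects
    return 0;
}
```

Whichever of the three meanings of rename is chosen, the stat-before-rename step is the same; only the body of the hook changes.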
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-02 19:48:26
|
On Monday 02 August 2004 10:41, Guido Villa wrote:

> >> I hope we can keep your and my project linked somehow, because I like your ideas for future expansions, but I think I prefer mine on the core functionalities :)
> >
> > Have you got a site for your project? BTW I forgot to mention the third
>
> not yet. I want to clean up the source code a bit before putting it on the net.

Can you explain briefly what the fundamental difference between the two approaches is? You told me that it's about using links to files instead of storing them. I plan to have a distinction between "internal" files, stored in the filesystem, and "external" files which the filesystem has somehow seen and indexed, but whose data is not available, e.g. files on a cdrom or in an http proxy cache - the filesystem should keep a URI as a reference. If the URI has "file:" as its protocol, the filesystem could show, in directories that represent queries, links to these files. Maybe your ideas can fit better into this model.

V. |
|
From: Guido V. <gu...@vi...> - 2004-08-02 08:41:28
|
Vincenzo aka Nick Name writes:

> On Monday 02 August 2004 00:58, Guido Villa wrote:
> >> I know that a suggestion like mine is of no use at this stage of the project, but I was looking at differences in implementation between your FS and mine, and the LOCAL macro was the first thing I noticed :)
> >>
> >> I hope we can keep your and my project linked somehow, because I like your ideas for future expansions, but I think I prefer mine on the core functionalities :)
> >
> > Have you got a site for your project? BTW I forgot to mention the third

not yet. I want to clean up the source code a bit before putting it on the net.

> performance improvement: keeping files open until they are "released", using a hash table of path<->file descriptor mappings.

Absolutely. The implementation I am using now for this is rather ugly, but it works. If/when you need it, please tell me.

> V.

bye
Guido
--
Guido Villa       Always remember that you are unique...
gu...@vi...       ...just like everybody else. |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-02 07:37:52
|
On Monday 02 August 2004 00:58, Guido Villa wrote:

> I know that a suggestion like mine is of no use at this stage of the project, but I was looking at differences in implementation between your FS and mine, and the LOCAL macro was the first thing I noticed :)
>
> I hope we can keep your and my project linked somehow, because I like your ideas for future expansions, but I think I prefer mine on the core functionalities :)

Have you got a site for your project? BTW I forgot to mention the third performance improvement: keeping files open until they are "released", using a hash table of path<->file descriptor mappings.

V. |
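The path<->file descriptor cache mentioned here could look roughly like this (a hypothetical outline; neither project's actual code is assumed): open(2) once per path, hand back the cached descriptor on later calls, and close it only when fuse reports the file released.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <string>
#include <unordered_map>

// Hypothetical sketch of the path<->fd cache: descriptors stay open
// across filesystem calls and are closed only on release().
class FdCache {
    std::unordered_map<std::string, int> fds_;
public:
    // Return a cached descriptor, opening the file on first use.
    int get(const std::string& path) {
        auto it = fds_.find(path);
        if (it != fds_.end())
            return it->second;
        int fd = ::open(path.c_str(), O_RDONLY);
        if (fd >= 0)
            fds_.emplace(path, fd);
        return fd;
    }
    // Called when fuse tells us the file has been released.
    void release(const std::string& path) {
        auto it = fds_.find(path);
        if (it != fds_.end()) {
            ::close(it->second);
            fds_.erase(it);
        }
    }
    std::size_t size() const { return fds_.size(); }
};
```

The win is that a read/write sequence on the same file costs one open(2) instead of one per call; the cost is bookkeeping on release and a cap on open descriptors that a real implementation would have to enforce.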
|
From: Guido V. <gu...@vi...> - 2004-08-01 22:58:25
|
Vincenzo aka Nick Name writes:

[...]

> 2. asynchronous plugins. This is not as easy as it might seem, since we
> can't just fork a process or a thread for each indexing operation, or
> the system will almost freeze with tons of processes reading from disk.
> It would be better to have a queue of indexers to run (and to remove
> them from the queue if the file is removed in the meantime). Also, a

yes, I too was thinking of a queue.

[...]

> Oh, not only can they, but they _will_ be moved out of sync asap :) The
> most important things for now are a stable index plugin interface and
> a good data model. I will think about the data model in the coming days.
> After these two steps, we will focus on getting maximum speed and
> stability out of relfs.cpp, including a separate thread or process for
> plugins. If other people agree, of course! I am no chief here, even if
> I am the only coder right now :)
>
> I would be glad to see patches to the code, but I recommend not taking
> care of details at the moment, because it's highly probable that entire
> files will be removed in this phase. It's better to focus on how
> features can be implemented (and to see IF certain features can be
> implemented at all using the current design).

I know that a suggestion like mine is of no use at this stage of the project, but I was looking at differences in implementation between your FS and mine, and the LOCAL macro was the first thing I noticed :)

I hope we can keep your and my project linked somehow, because I like your ideas for future expansions, but I think I prefer mine on the core functionalities :)

> bye
>
> Vincenzo

bye
Guido |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-01 22:38:33
|
On Sunday 01 August 2004 22:37, Guido Villa wrote:
> I'm giving a quick look at the source code...
>
> Why not write:
>
> #define LOCAL(path) (path +1)
>
> instead of:
>
> #define LOCAL(path) (string(".")+string(path))
>
First of all, for now we are developing a prototype. C++ is not a good
language for rapid prototyping, and neither is C (I usually choose ocaml
when I am in a hurry, and I have many segfault-free programs written in
a couple of weeks :) ), but I had to choose between cross-language
interoperability and rapidity of development. Since I want a stable
interface for plugins, and the ability to write plugins in almost any
language, I chose C at first. But not having a rich standard library
convinced me to switch to C++, and here we are.

Since I am prototyping, I don't care about performance too much [and I
don't care too much about how I code - I don't usually code that
badly :P ].

My idea is that to make a good design you need to have implemented the
thing; in other words, there's something good in that "I feel like
rewriting everything" urge. I want to write a _very small_ and modular
program, and then audit, and eventually rewrite, some parts, module by
module. But the plugin interface, and the data model, must be stable,
so that people can write plugins while we correct and/or rewrite the
core, confident that their plugins will keep on working.

That said, RelFS.cpp is almost plain C, and it will stay that way or
become even more C-like, so feel free to audit and correct it. Even the
class relfs is useless and might be removed. In the new code, however,
there is a single parameter for all plugin operations (for ease of
development), of type "class File", so a bit of memory allocation is
required.

The rest of the program will remain in C++, I guess, if not translated
into a more usable language, because of the need for high level
libraries.

In fact I had defined LOCAL exactly as you suggested, but then I needed
strings in the rest of the program and so I opted for using strings
everywhere. I am not sure why I chose to add an initial "./"; feel free
to experiment with removing it to discover why :) However, the most
important speedups will come from:

1. using the threaded fuse mainloop. This is necessary, since we can't
block the whole filesystem waiting for a cdrom read. For now this would
mean implementing either a thread waiting for SQL queries and sending
them over a protected connection, or making as many SQL connections as
there are queries in the program, since libpqxx doesn't accept multiple
threads on a single opened connection. I would opt for solution 1.

2. asynchronous plugins. This is not as easy as it might seem, since we
can't just fork a process or a thread for each indexing operation, or
the system will almost freeze with tons of processes reading from disk.
It would be better to have a queue of indexers to run (and to remove
them from the queue if the file is removed in the meantime). Also, a
good idea would be to gather more information about what exactly
plugins will read from a file: if three different indexers will read a
huge file from beginning to end, it could be better to allow them to
declare it, and to pass the same data to all of them. Caching mitigates
this problem, but that doesn't mean we can't improve on it.
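The indexer queue described in point 2 might look something like this (a hypothetical sketch under the assumption of a single worker thread; all names are invented): pending files wait in a queue, duplicate requests are collapsed, and an entry can be cancelled if the file disappears before its turn.

```cpp
#include <deque>
#include <set>
#include <string>

// Hypothetical sketch of the indexing queue from point 2: one queue of
// pending paths instead of one thread per indexing operation.
class IndexQueue {
    std::deque<std::string> queue_;
    std::set<std::string> pending_; // collapse duplicate requests
public:
    // Schedule a file for (re)indexing; no-op if already queued.
    void push(const std::string& path) {
        if (pending_.insert(path).second)
            queue_.push_back(path);
    }
    // Drop a file that was removed before being indexed.
    void cancel(const std::string& path) { pending_.erase(path); }

    // Next path still worth indexing; "" when the queue is empty.
    std::string pop() {
        while (!queue_.empty()) {
            std::string path = queue_.front();
            queue_.pop_front();
            if (pending_.erase(path) == 1)
                return path; // not cancelled in the meantime
        }
        return "";
    }
};
```

A single worker draining this queue bounds the number of concurrent disk readers at one, which is exactly the "don't freeze the system with tons of indexing processes" property the message asks for.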
>
> I like the plugin proxy idea, because plugin calls (which could be
> slow) can be moved out of sync from the main process.
Oh, not only can they, but they _will_ be moved out of sync asap :) The
most important things for now are a stable index plugin interface and
a good data model. I will think about the data model in the coming days.
After these two steps, we will focus on getting maximum speed and
stability out of relfs.cpp, including a separate thread or process for
plugins. If other people agree, of course! I am no chief here, even if
I am the only coder right now :)

I would be glad to see patches to the code, but I recommend not taking
care of details at the moment, because it's highly probable that entire
files will be removed in this phase. It's better to focus on how
features can be implemented (and to see IF certain features can be
implemented at all using the current design).
bye
Vincenzo
--
I was dressed for success,
but success it never comes
[Pavement - Here]
|
|
From: Guido V. <gu...@vi...> - 2004-08-01 20:37:19
|
I'm giving a quick look at the source code...
Why not write:
#define LOCAL(path) (path +1)
instead of:
#define LOCAL(path) (string(".")+string(path))
?
You wouldn't need to use the string class everywhere, which would be better
(performance-wise), because you do not need all those memory allocations.
I think that the core of the filesystem implementation should be as fast as
possible, and then still a bit more: for example, I would suggest using C
instead of C++.
Filesystem calls are very very frequent, so they should be very small and
very optimised.
I like the plugin proxy idea, because plugin calls (which could be slow) can
be moved out of sync from the main process.
Just my two cents.
Regards,
Guido
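For what it's worth, the practical difference between the two LOCAL variants shows up on the root path (a standalone illustration, not code from either project): with `path + 1` the fuse root "/" becomes the empty string, whereas prefixing "." always yields a usable relative path.

```cpp
#include <string>

// Standalone illustration of the two LOCAL variants being compared.
// Fuse hands over paths rooted at "/", which must be mapped to paths
// relative to the backing directory.
std::string local_ptr(const char* path) { return std::string(path + 1); }
std::string local_str(const char* path) { return std::string(".") + path; }
```

local_ptr("/") yields "", which is not a valid argument for open(2) or stat(2), while local_str("/") yields "./"; that is the corner case that makes the allocation-free version less straightforward than it looks.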
|
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-01 17:46:45
|
When the cvs commit I just made becomes available, there will be a file named "PLUGIN_HOWTO.txt" in the top directory. There is an implementation for plugins, with a base class called "Index", defined in "IndexPlugin.h". Each operation of this class takes a "File" as an argument. "File" represents a cache of information we already know about files.

Things will change a lot. For now I would like to focus on the plugin interface. Once that is settled, an asynchronous proxy for plugins, a dlopen interface and all the rest can be written (since I realize that for now these are just source-code plugins, and cannot really be "plugged in"). I would like opinions on the interface.

The protected method "getParent" has a "watch_OPERATION" for each operation, which takes an "Index" as an argument. This allows a plugin to add other plugins to the indexing chain (e.g. when it's a proxy). The protected method "getConnection" is also provided. It can be overloaded; this has been done so that an "asynchronous proxy plugin" can be implemented which does not allow direct access to the main db connection (libpqxx does not allow concurrent access to a connection), but creates a new connection in the overloaded "getConnection" method.

Through the use of preprocessor macros (ARGH!) code size is greatly reduced, at the expense of readability if one does not read the "FOR_ALL" macros like the one at the top of "IndexPlugin.h".

The next step is to implement some useful plugins (those already present were implemented quick & dirty, and do not work properly; they must be regarded as examples).

Bye
Vincenzo |
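From the description above, the shape of the interface might be roughly as follows (a reconstruction for discussion only, reusing names from the message; the real IndexPlugin.h may differ in every detail): a base class `Index` whose operations each take a `File`, plus an overridable `getConnection` hook so an asynchronous proxy can hand out its own db connection.

```cpp
#include <memory>
#include <string>
#include <vector>

// Rough reconstruction of the interface described in the message;
// the actual IndexPlugin.h may differ in every detail.
struct File {            // cache of what we already know about a file
    std::string path;
    std::string mimetype;
};

struct Connection {      // stand-in for the libpqxx connection wrapper
    virtual ~Connection() = default;
};

class Index {
public:
    virtual ~Index() = default;
    // One operation per filesystem event, each taking a File.
    virtual void on_create(const File& f) = 0;
    virtual void on_write(const File& f) = 0;
    virtual void on_remove(const File& f) = 0;
protected:
    // Overridable so an asynchronous proxy plugin can create a fresh
    // connection instead of exposing the shared one (libpqxx does not
    // allow concurrent access to a single connection).
    virtual std::shared_ptr<Connection> getConnection() { return shared_; }
    // A proxy plugin can append other plugins to the indexing chain.
    void watch(std::shared_ptr<Index> next) { chain_.push_back(next); }
private:
    std::shared_ptr<Connection> shared_ = std::make_shared<Connection>();
    std::vector<std::shared_ptr<Index>> chain_;
};

// Example plugin (illustration only): counts the files it has indexed.
class CountingIndex : public Index {
public:
    int indexed = 0;
    void on_create(const File&) override { ++indexed; }
    void on_write(const File& f) override { on_create(f); }
    void on_remove(const File&) override { --indexed; }
};
```

The point of routing everything through `getConnection` is that plugin code never holds a raw connection, so moving plugins to another thread later only requires overriding one method.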
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-07-29 17:29:01
|
On Thursday 29 July 2004 16:13, Fabio Tranchitella wrote:

> I didn't try any database-based filesystem, but AFAIK Apple is developing a similar solution. I'm sure we need next generation filesystems, with automatic handling of file types, automatic indexing and a powerful search engine.

Like anybody willing to write something "different", I know I will have to face, sooner or later, the challenge of proving that I am not copying apple :) Well, the author of storage already had to do the same thing, see http://www.gnome.org/~seth/blog/document-indexing - basically apple's new jewel is an indexing system and not a complete filesystem.

> You answered me that mssql server is not lightweight. Yes, I agree, but PostgreSQL too. In general, a database-based filesystem will be slower (and slower) than a traditional tree-based filesystem.

Postgres is not going to eat that much memory, but in general what we should try to achieve is asynchronous indexing, in order to keep storage speed comparable to the native one.

> IMHO this type of filesystem can't replace traditional filesystems. I'd prefer to use a traditional fs for my debian system, and I think this is the best choice. A relational filesystem can be useful only for document and personal user folders. Do you agree?

Well, I think that there are many files in a unix filesystem which would live better if stored and retrieved through a database (this is not in contrast with the principle of keeping data on disk, but maybe that principle is too restrictive anyway). For example, linux distributions have huge package databases, which are often queried with grep or awk, and would fit perfectly into a database model (they _are_ databases in fact). Also, the ability to synthesize a file on the fly would allow for automatic maintenance of things like /etc/modutils/ -> /etc/modules, and so on. There are many kinds of files which would benefit from db indexing - if you have ever typed "find /usr/lib -exec nm "{}" ";" | grep something" you know what I mean.

> > Regarding storage, I have written to seth but received no reply. Either he is busy, or he does not trust an open development process. Or he just missed my message; I will try to contact him again, and on some mailing list, too.
>
> I think we have to avoid code duplication. I've not yet tried storage.. Did you do this? Is it usable? The best thing would be to extend storage, adding some features that are actually missing... Maybe relfs could be integrated in storage, as you suggested.

I still can't compile storage, but I read on seth's page that the version in gnome-cvs is not recent code. So our first task seems to be "look better at storage". I'll have a second look, if you can look at it yourself. In the meanwhile I have written a second e-mail to Seth asking about recent code.

> > (e.g. I would tend to keep real data on the hard drive, to ensure data visibility in case of the hard drive being mounted from a PC without postgresql - Maybe psql can do this by itself, I don't know yet)
>
> And in the case you used to mount a relfs filesystem "without postgresql" and removed, added or modified some documents, how could the metadata be updated? How do you imagine this case?

I am still unsure about how files should be _stored_, but regarding indexing: if plugins have the proof obligation that their effects on the db are the same whenever the data stored in a file is the same, we can reindex modified files (sigh, looking at the modification time or upon user request...). It's not an ideal solution, but I can't think of anything better right now without totally replacing the filesystem with the database (and that way destroying any hope of native-like speed, I suppose, not to mention disk usage).

Metadata can be stored, as a backup, in XFS/patched ext2/ext3 extended attributes (but will not be updated on external modifications). The main reason to store data on the disk, besides speed, is that users need to trust the system. Nobody would use a system which makes user files "disappear" from the disk.

Bye
Vincenzo |
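The modification-time fallback mentioned for reindexing could be as simple as this (a hypothetical sketch; the stored mtime would really come from the database): compare the mtime recorded when the file was last indexed against the file's current mtime.

```cpp
#include <ctime>
#include <string>
#include <sys/stat.h>

// Hypothetical sketch: decide whether a file must be reindexed by
// comparing its current mtime with the one recorded at indexing time
// (here passed in directly; relfs would read it from the database).
bool needs_reindex(const std::string& path, std::time_t indexed_mtime) {
    struct stat st;
    if (::stat(path.c_str(), &st) != 0)
        return false; // file gone; removal is handled elsewhere
    return st.st_mtime > indexed_mtime;
}
```

This only works because of the proof obligation stated above: if a plugin's effect on the db depends only on the file's contents, then "contents unchanged since last index" implies "db rows still valid", and mtime is a cheap (if imperfect) proxy for "contents unchanged".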
|
From: Fabio T. <ko...@ko...> - 2004-07-29 14:13:33
|
On Thu, Jul 29, 2004 at 03:24:25PM +0200, Vincenzo aka Nick Name wrote:

> To summarize what I think about winfs: "I am not used to believing microsoft rumours until I see the product" and "I won't buy a 1gb memory addon for my laptop just to open a desktop folder" :) I have been using ms sql server at work, and sure it is not lightweight. Moreover, I don't expect too many features to be included in the first release of winfs, or extensibility through plugins.

I didn't try any database-based filesystem, but AFAIK Apple is developing a similar solution. I'm sure we need next generation filesystems, with automatic handling of file types, automatic indexing and a powerful search engine.

You answered me that mssql server is not lightweight. Yes, I agree, but PostgreSQL too. In general, a database-based filesystem will be slower (and slower) than a traditional tree-based filesystem.

IMHO this type of filesystem can't replace traditional filesystems. I'd prefer to use a traditional fs for my debian system, and I think this is the best choice. A relational filesystem can be useful only for document and personal user folders. Do you agree?

> Regarding storage, I have written to seth but received no reply. Either he is busy, or he does not trust an open development process. Or he just missed my message; I will try to contact him again, and on some mailing list, too.

I think we have to avoid code duplication. I've not yet tried storage.. Did you do this? Is it usable? The best thing would be to extend storage, adding some features that are actually missing... Maybe relfs could be integrated in storage, as you suggested.

> RelFS and Storage are partially overlapping projects: both have a compatibility layer for other applications (that of storage is gnomevfs, but I don't think there could be any objection to a kernel interface), and both extract information from files in real time, storing it into an SQL database, so the best thing to do would be to use relfs as the lower-level layer of storage, which could then focus on its higher level goals, such as the natural language query interface. We'll see, but consider that there are differences between the two approaches (e.g. I would tend to keep real data on the hard drive, to ensure data visibility in case of the hard drive being mounted from a PC without postgresql - Maybe psql can do this by itself, I don't know yet). Moreover, storage has a very simple data model; I am going to use a richer data model.

And in the case you used to mount a relfs filesystem "without postgresql" and removed, added or modified some documents, how could the metadata be updated? How do you imagine this case?

> I encourage people to download the CVS version and try it, then decide what you would like to do next, and start a thread on this mailing list to discuss the details.

I'll do this ASAP.

Talk to you later,
Fabio.

--
Fabio Tranchitella <!> kobold.it, Turin, Italy - Free is better!
-----------------------------------------------------------------------
<http://www.kobold.it>, <ko...@ko...>, <ko...@ja...>
-----------------------------------------------------------------------
GPG Key fingerprint: 5465 6E69 E559 6466 BF3D 9F01 2BF8 EE2B 7F96 1564 |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-07-29 13:24:28
|
On Thursday 29 July 2004 11:27, Fabio Tranchitella wrote:
> Hi, I'm really interested in developing a relational filesystem.
> I've heard about WinFS and Storage, but I haven't tried either of them.
> Here are some questions about relfs:

Welcome, you are the very first :)

To summarize what I think about WinFS: "I am not inclined to believe Microsoft rumours until I see the product" and "I won't buy a 1 GB memory upgrade for my laptop just to open a desktop folder" :) I have been using MS SQL Server at work, and it is certainly not lightweight. Moreover, I don't expect too many features to be included in the first release of WinFS, or extensibility through plugins.

> 1. Why not simply contribute to Storage? And how much does relfs
> differ from Storage?

Regarding Storage, I have written to Seth but received no reply. Either he is busy, or he does not trust an open development process. Or he just missed my message; I will try to contact him again, and on some mailing list too.

RelFS and Storage are partially overlapping projects: both have a compatibility layer for other applications (Storage's is gnomevfs, but I don't think there could be any objection to a kernel interface), and both extract information from files in real time, storing it into an SQL database. So the best thing to do would be to use relfs as the lower-level layer of Storage, which could then focus on its higher-level goals, such as the natural language query interface. We'll see, but consider that there are differences between the two approaches (e.g. I would tend to keep the real data on the hard drive, to ensure data visibility in case the drive is mounted from a PC without PostgreSQL - maybe psql can do this by itself, I don't know yet). Moreover, Storage has a very simple data model; I am going to use a richer one.

> 2. Which features do you plan to add?
> I'd like to see a good implementation of trash, undo and document
> history.

The very first thing to do is a plugin architecture, so that the fuse daemon does not grow beyond 3-4k lines of code. A user should be able to attach plugins at runtime.

Then, I would like to take advantage of such a powerful system in any way we can think of: we are already spending CPU cycles and IPC latency, so at least we should get some benefit from it. The coolest features that come to mind are directories which represent queries; the ability to update fields in the SQL database, which then update files in real time using triggers; and directories which automatically do Bayesian classification of all the user's files, using as input the files a user adds to or removes from them. But there are many other ideas.

I wrote a detailed TODO.txt, but as you can see... the second thing to do is a trash implementation :)

> 3. How can I contribute? :-)

It depends on how comfortable you feel with the following subjects:
- C++ coding
- database design
- system security
- postgres programming, or ODBC programming in case you are willing to try that route (I don't have time to do it)

I accept contributions in any area, and will not refuse CVS write access to people in the future. For now we should focus on:

1. detailing what we expect from the system
2. deciding the "next" feature to add to the codebase, in order to satisfy the requirements chosen in 1
3. implementing the feature
4. testing it, possibly with unit tests too, then going back to 1/2.

In parallel, we should decide on a good data model for the database.

I encourage people to download the CVS version and try it, then decide what you would like to do next, and start a thread on this mailing list to discuss the details.

Bye
Vincenzo

[Will add a FAQ section on the website with part of this e-mail] |
|
From: Fabio T. <ko...@ko...> - 2004-07-29 09:27:59
|
Hi, I'm really interested in developing a relational filesystem.
I've heard about WinFS and Storage, but I haven't tried either of them.
Here are some questions about relfs:
1. Why not simply contribute to Storage? And how much does relfs differ
from Storage?
2. Which features do you plan to add? I'd like to see a good
implementation of trash, undo and document history.
3. How can I contribute? :-)
Thanks for working on it,
Talk to you later.
--
Fabio Tranchitella
<!> kobold.it, Turin, Italy - Free is better!
-----------------------------------------------------------------------
<http://www.kobold.it>, <ko...@ko...>, <ko...@ja...>
-----------------------------------------------------------------------
GPG Key fingerprint: 5465 6E69 E559 6466 BF3D 9F01 2BF8 EE2B 7F96 1564
|
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-07-28 11:42:13
|
[ ] Yes it is [ ] No it's not |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-07-28 11:34:57
|
-- Microsoft is trying to patent virtual desktops. I didn't hate Microsoft before this, but now I do. The full story: http://yro.slashdot.org/yro/04/02/25/1346201.shtml