relfs-devel Mailing List for Relational Filesystem (Page 3)
Status: Pre-Alpha
Brought to you by:
applejack
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-14 09:13:36
|
There are two issues with the way one should create data on the database:

1. If I record attributes of a file on the db, the user is (not without reason) tempted to change these attributes directly on the db, e.g. to fix a typo in the "author" tag of hundreds of mp3s.
2. If a plugin uses attributes of another plugin, they must be called in the right order.

A possible solution, which pushes hard towards the decision of having a library to create data structures on the database instead of allowing plugins to create tables (and this makes one think that having a separate user at least for groups of plugins is a good idea), is to create triggers on the database for attributes "owned" by plugins, both on read and on write:

- Read triggers: when anybody tries to read an attribute, the database should check conditions, declared by plugins and chosen from a limited range (like "when the first 100 bytes of the file have changed"), which might have invalidated the attribute, and in that case run the appropriate indexer. This solves the dependency issue, but NOT if there are circular dependencies, which are a bad thing anyway. Several ways to avoid circular dependencies come to mind, but none is free from trouble.
- Write triggers: when anybody tries to write an attribute (except the owner of the attribute, which makes the choice "a db user for each plugin" an interesting concept), the database should check whether there is a plugin which has declared itself able to write that attribute, or else return an error.

Conclusion: we will have to implement a library, possibly easy to call from C and so "SWIG"able, which allows one to declare "external types". Types should have:

- attributes with "DB primitive" types, which are translated to columns of a table with the same name as the type;
- attributes with "external" types, which are translated using foreign keys;
- declarations of writability for attributes, and what to do in case of a write.
Types should be inserted into suitable database schemas to avoid naming conflicts, and which db schema to use might be chosen either by letting the plugin declare it, or, as beerfun said, using a fixed set of rules (like "what mimetypes do you match", but not limited to mimetypes) that group plugins together in a more automated way.

A last note concerns "equalization" of attributes, i.e. what if two plugins use different names for the same thing, and the user notices this? She should be able to "equalize" the names, but I am unsure of the implications of equalization, even if the semantics of name passing is one of my favourite topics :) However, we should see if this can be worked on easily, because that would be a great advantage.

Someone willing to criticize this architecture? This will be implemented in September because I'll be on vacation in the next two or three weeks.

Bye
Vincenzo |
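A minimal sketch of what such a type-declaration library might look like in C++ (every name here is hypothetical; nothing in the relfs tree is assumed): a plugin declares an "external type" as a list of attributes, each either a DB primitive or a reference to another declared type, and the library renders the corresponding CREATE TABLE statement with foreign keys.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: a declared attribute is either a DB primitive
// (rendered as a plain column) or a reference to another external type
// (rendered as a foreign key into that type's table).
struct Attribute {
    std::string name;
    std::string type;  // e.g. "TEXT", "INTEGER", or another type's name
    bool external;     // true -> foreign key to table `type`
};

struct ExternalType {
    std::string name;             // becomes the table name
    std::vector<Attribute> attrs;

    // Render the CREATE TABLE statement this declaration maps to.
    std::string to_sql() const {
        std::ostringstream sql;
        sql << "CREATE TABLE " << name << " (id SERIAL PRIMARY KEY";
        for (const Attribute& a : attrs) {
            if (a.external)
                sql << ", " << a.name << " INTEGER REFERENCES "
                    << a.type << "(id)";
            else
                sql << ", " << a.name << " " << a.type;
        }
        sql << ");";
        return sql.str();
    }
};
```

An mp3 plugin declaring a "song" type with an external "album" attribute would thus get a table with a foreign key into the album table, without ever writing SQL itself.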
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-13 21:39:33
|
On Friday 13 August 2004 08:09, co...@li... wrote:

> this project came to my mind as we are working on pretty much the same idea. Well, not completely. So I thought that our projects could stay in contact, get some ideas from each other and help out a little bit.
>
> JDBFS will be a java based (100% java) indexing system with an embeddable database to store rich metainformation from mp3, ogg, movies, pdfs, images and other content-storing filetypes. We have working plugins (from GPL projects) for mp3 and ogg. PDF is on the hook right now.

Where is your project? If it is the one at the sun site, is it free software? I can't access the page (I even registered). However, if you have a plugin interface and have chosen a data model, I don't see why in principle I couldn't use the same plugin format and the same data model - how are you storing data? Is your embedded DB a relational database? In that case, maybe we both could abstract database access from plugins enough to have your plugins work in relfs, and maybe the contrary (but would you like plugins written in languages other than java?). Do you have documentation about your data model and plugin architecture?

Bye
Vincenzo |
|
From: <co...@li...> - 2004-08-13 08:09:50
|
Hi,

this project came to my mind as we are working on pretty much the same idea. Well, not completely. So I thought that our projects could stay in contact, get some ideas from each other and help out a little bit.

JDBFS will be a java based (100% java) indexing system with an embeddable database to store rich metainformation from mp3, ogg, movies, pdfs, images and other content-storing filetypes. We have working plugins (from GPL projects) for mp3 and ogg. PDF is on the hook right now.

The goal is to help the user organise his data, so that he can just work with his data, instead of organising it.

regards,
Bernd |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-12 21:59:22
|
Just a couple of questions about the data model:

1. beerfun at users dot sf dot net has suggested, if I don't misinterpret her, using a db schema where plugins can find their data somewhat by the type of data they address: a classification could be made allowing them to declare which mimetypes, from a standard list, they index, and to get in exchange from the system access to tables specific to that mimetype. The idea could be to exploit the multiple schema feature of postgres to have a namespace for each used mimetype, a public namespace to share common information and a private namespace for each plugin. Would this be overkill for queries?

2. It might be interesting to provide applications with an interface to create types on the database, where each type is a tuple of either basic sql types or other types, and is translated to a table with columns for attributes of basic types and foreign keys for attributes of other user-defined types. But perhaps it's just simpler and more powerful to leave the raw db interface to the application. Someone willing to suggest tools and libraries to type a database which can be used with C++ and are relatively easy to use? I don't mean a full object-relational mapping with object persistence, just a typed way to declare tables.

3. And what about data protection? Would it be worthwhile to have a DB user for each plugin, with ACLs set right to access shared tables, or is it more natural to allow each plugin to access and modify the whole data hierarchy?

Bye
Vincenzo |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-08 16:26:22
|
We now have plugins working in another thread, a threaded fuse mainloop, and doxygen documentation for all headers, for developers' pleasure.

RelFS - Relational Filesystem
-----------------------------

* sf.net project description: RelFS is a linux userspace "shadow" (file data remains on disk) relational filesystem using fuse and an SQL database to store metadata. Directories can represent queries, and powerful features (e.g. bayesian classification) are added through plugins.

We already have a working prototype available, and are in active development. The website for development and releases is http://www.sourceforge.net/projects/relfs while the homepage, where you can find the FAQs, is at http://relfs.sourceforge.net

At the moment we don't have user releases, since we are at an early stage of development. We are looking for developers and ideas on how to use the beast. You can subscribe to the mailing list that you will find on the sf.net project page. Be sure to read the FAQ at http://relfs.sourceforge.net/FAQ.html which includes motivation, goals and differences from existing projects. And, quoting from the FAQ:

# This is a very ambitious project, you won't even get to beta!

I would like to use an incremental approach, in which we first stabilize over a small goal, then improve the performance of the program, then we start working on another goal. The first goal, for example, is to obtain a working indexing filesystem with an easy to use plugin architecture. We are already close to that. See TODO.txt, section "RELEASE PLAN/MILESTONES".

Bye
Vincenzo Ciancia |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-03 21:33:13
|
I have carefully audited the source code of RelFS.cpp. It should be exception-safe now, but who can be sure - BTW I also checked the LOCAL macro, so Guido should be happy now :P (it couldn't be path+1 because of "/" being only 1 char long, but it's better now). Could someone have a look at the file?

A question: would it be wise to catch posix signals in the fuse module? And how should I handle them? By setting a global variable, and checking it at the end of each filesystem function?

I committed changes to the code yesterday; today I added the FAQs for the project to the homepage. The next thing to do, IMHO, is putting the MainIndexer singleton in a separate thread - objections, anyone? After that I plan to document the existing code using doxygen.

Bye
Vincenzo |
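The global-variable approach floated in that question might be sketched like this (a hypothetical outline, not relfs code): the handler only stores into a `volatile sig_atomic_t` flag, which each filesystem operation checks on its way out.

```cpp
#include <csignal>

// Hypothetical sketch of the "set a flag, check it later" scheme:
// a signal handler may only safely touch a volatile sig_atomic_t,
// so it records the signal and the main code polls the flag.
volatile std::sig_atomic_t g_pending_signal = 0;

extern "C" void handle_signal(int signo) {
    g_pending_signal = signo; // async-signal-safe: one atomic store
}

// Checked at the end of each filesystem function.
bool shutdown_requested() {
    return g_pending_signal == SIGINT || g_pending_signal == SIGTERM;
}

void install_handlers() {
    std::signal(SIGINT, handle_signal);
    std::signal(SIGTERM, handle_signal);
}
```

The filesystem functions stay signal-free; only the polling point decides when to unwind, which sidesteps the question of what happens if a signal lands mid-query.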
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-03 21:22:09
|
On Tuesday 03 August 2004 00:09, Guido Villa wrote:

> The straightforward way is to treat "rename a b" as "unlink b; rename a b", but are we sure it is the preferred meaning?
>
> Maybe, after choosing a meaning for rename, we could implement the other two in separate programs...

That's a very good question. I would like consistent behaviour when files outside the filesystem are renamed to files inside it, so I suppose that a good answer would be that metadata of the old file is not attached to the new file. This is the right thing when metadata depends only on the contents of files - unfortunately this condition will not always hold.

I suppose that a good solution is to consider the file path only an attribute of files, which have an identity property of their own. In this case, we can both have metadata attached to a file name (such as comments attached by a user to /etc/shadow to remember what it is for, which should stay attached to that file even when it is zeroed or squashed with another file by the user's brother/sister) and metadata attached to a file's contents, where appropriate. Usability concerns arise, but I guess that we should use this feature only when it's obvious that we need to. However, considering that the target file in a rename operation would be deleted, and that this would mean "move to trash", metadata attached to the destination file's contents would still be preserved.

> BTW: while you implement renaming, please bear in mind that if the target exists no unlink system call is made, so you have to stat the target before renaming, and then manually manage its deletion side effects. It's a thing one might easily overlook.

I will take care of it, thanks.

Vincenzo
--
First they ignore you, then they laugh at you, then they fight you, then you win. [Gandhi] |
|
From: Guido V. <gu...@vi...> - 2004-08-02 22:09:54
|
I want to share a question that came to my attention only a few days ago. This is unrelated to what you want to do with the files; it is more related to the meaning that renaming a file has in filesystems that try to manage data somehow.

I am not speaking of the trivial case of simply changing a file name, but of the trickier one where, by changing a file name, you delete another file. In this case, what has to happen to the metadata (whatever it is, entries in a database or symbolic links pointing to that file)? Maybe the user wants to change the file's content but not the metadata (as if "rename a b" were interpreted as "cat a > b; unlink a"), maybe it's the contrary (as if "rename a b" were interpreted as "unlink b; rename a b"). Maybe he would like to have the metadata from the two files merged... What do you think? The straightforward way is to treat "rename a b" as "unlink b; rename a b", but are we sure it is the preferred meaning?

Maybe, after choosing a meaning for rename, we could implement the other two in separate programs...

BTW: while you implement renaming, please bear in mind that if the target exists no unlink system call is made, so you have to stat the target before renaming, and then manually manage its deletion side effects. It's a thing one might easily overlook.

bye
Guido |
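The missing-unlink pitfall can be sketched as follows (hypothetical code, not the relfs implementation; `on_file_destroyed` is an invented metadata hook): stat the target before calling rename(2), and if it existed, handle its metadata explicitly, since no unlink call will ever reach the filesystem layer.

```cpp
#include <cstdio>
#include <string>
#include <sys/stat.h>

// Hypothetical metadata hook: whatever the filesystem decides rename
// means for the metadata of a file it implicitly destroys.
void on_file_destroyed(const std::string& path) {
    // e.g. move its db metadata to trash instead of deleting it
    std::printf("metadata: %s destroyed by rename\n", path.c_str());
}

int do_rename(const std::string& from, const std::string& to) {
    struct stat st;
    // rename(2) silently replaces an existing target; no unlink
    // system call is made, so we must detect the case ourselves.
    bool target_existed = (::stat(to.c_str(), &st) == 0);
    if (::rename(from.c_str(), to.c_str()) != 0)
        return -1;
    if (target_existed)
        on_file_destroyed(to); // manage the deletion side effects
    return 0;
}
```

Whichever of the three meanings of rename is chosen, the stat-before-rename step is the same; only the body of the hook changes.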
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-02 19:48:26
|
On Monday 02 August 2004 10:41, Guido Villa wrote:

> >> I hope we can keep your and my project linked somehow, because I like your ideas for future expansions, but I think I prefer mine on the core functionalities :)
> >
> > Have you got a site for your project? BTW I forgot to mention the third
>
> not yet. I want to clean up the source code a bit before putting it on the net.

Can you explain briefly what the fundamental difference between the two approaches is? You told me that it's about using links to files instead of storing them. I plan to have a distinction between "internal" files, stored in the filesystem, and "external" files which the filesystem has somehow seen and indexed, but whose data is not available, e.g. files on a cdrom or in an http proxy cache - the filesystem should keep a URI as a reference. If the URI has "file:" as its protocol, the filesystem could show, in directories that represent queries, links to these files. Maybe your ideas can fit better into this model.

V. |
|
From: Guido V. <gu...@vi...> - 2004-08-02 08:41:28
|
Vincenzo aka Nick Name writes:

> On Monday 02 August 2004 00:58, Guido Villa wrote:
> >> I know that a suggestion like mine is of no use at this stage of the project, but I was looking at differences in implementation between your FS and mine, and the LOCAL macro was the first thing I noticed :)
> >>
> >> I hope we can keep your and my project linked somehow, because I like your ideas for future expansions, but I think I prefer mine on the core functionalities :)
> >
> > Have you got a site for your project? BTW I forgot to mention the third

not yet. I want to clean up the source code a bit before putting it on the net.

> performance improvement: keeping files open until they are "released", using a hash table of path<->file descriptor mappings.

Absolutely. The implementation I am using now for this is rather ugly, but it works. If/when you need it, please tell me.

> V.

bye
Guido
--
Guido Villa       Always remember that you are unique...
gu...@vi...       ...just like everybody else. |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-02 07:37:52
|
On Monday 02 August 2004 00:58, Guido Villa wrote:

> I know that a suggestion like mine is of no use at this stage of the project, but I was looking at differences in implementation between your FS and mine, and the LOCAL macro was the first thing I noticed :)
>
> I hope we can keep your and my project linked somehow, because I like your ideas for future expansions, but I think I prefer mine on the core functionalities :)

Have you got a site for your project? BTW I forgot to mention the third performance improvement: keeping files open until they are "released", using a hash table of path<->file descriptor mappings.

V. |
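The path<->file descriptor cache mentioned here could look roughly like this (a hypothetical outline; neither project's actual code is assumed): open(2) once per path, hand back the cached descriptor on later calls, and close it only when fuse reports the file released.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <string>
#include <unordered_map>

// Hypothetical sketch of the path<->fd cache: descriptors stay open
// across filesystem calls and are closed only on release().
class FdCache {
    std::unordered_map<std::string, int> fds_;
public:
    // Return a cached descriptor, opening the file on first use.
    int get(const std::string& path) {
        auto it = fds_.find(path);
        if (it != fds_.end())
            return it->second;
        int fd = ::open(path.c_str(), O_RDONLY);
        if (fd >= 0)
            fds_.emplace(path, fd);
        return fd;
    }
    // Called when fuse tells us the file has been released.
    void release(const std::string& path) {
        auto it = fds_.find(path);
        if (it != fds_.end()) {
            ::close(it->second);
            fds_.erase(it);
        }
    }
    std::size_t size() const { return fds_.size(); }
};
```

The win is that a read/write sequence on the same file costs one open(2) instead of one per call; the cost is bookkeeping on release and a cap on open descriptors that a real implementation would have to enforce.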
|
From: Guido V. <gu...@vi...> - 2004-08-01 22:58:25
|
Vincenzo aka Nick Name writes:

[...]

> 2. asynchronous plugins. This is not as easy as it might seem, since we
> can't just fork a process or a thread for each indexing operation, or
> the system will almost freeze with tons of processes reading from disk.
> It would be better to have a queue of indexers to run (and to remove
> them from the queue if the file is removed in the meantime). Also, a

yes, I too was thinking of a queue.

[...]

> Oh, not only can they, but they _will_ be moved out of sync asap :) The
> most important things for now are a stable index plugin interface and
> a good data model. I will think about the data model in the coming days.
> After these two steps, we will focus on getting maximum speed and
> stability out of relfs.cpp, including a separate thread or process for
> plugins. If other people agree, of course! I am no chief here, even if
> I am the only coder right now :)
>
> I would be glad to see patches to the code, but I recommend not taking
> care of details at the moment, because it's highly probable that entire
> files will be removed in this phase. It's better to focus on how
> features can be implemented (and to see IF certain features can be
> implemented at all using the current design).

I know that a suggestion like mine is of no use at this stage of the project, but I was looking at differences in implementation between your FS and mine, and the LOCAL macro was the first thing I noticed :)

I hope we can keep your and my project linked somehow, because I like your ideas for future expansions, but I think I prefer mine on the core functionalities :)

> bye
>
> Vincenzo

bye
Guido |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-01 22:38:33
|
On Sunday 01 August 2004 22:37, Guido Villa wrote:
> I'm giving a quick look at the source code...
>
> Why not write:
>
> #define LOCAL(path) (path +1)
>
> instead of:
>
> #define LOCAL(path) (string(".")+string(path))
>
First of all, for now we are developing a prototype. C++ is not a good
language for rapid prototyping, and neither is C (I usually choose ocaml
when I am in a hurry, and I have many segfault-free programs written in
a couple of weeks :) ), but I had to choose between cross-language
interoperability and rapidity of development. Since I want a stable
interface for plugins, and the ability to write plugins in almost any
language, I chose C at first. But not having a rich standard library
convinced me to switch to C++, and here we are.

Since I am prototyping, I don't care about performance too much [and I
don't care too much about how I code - I don't usually code that
badly :P ].

My idea is that to make a good design you need to have implemented the
thing; in other words, there's something good in that "I feel like
rewriting everything" urge. I want to write a _very small_ and modular
program, and then audit, and eventually rewrite, some parts, module by
module. But the plugin interface, and the data model, must be stable,
so that people can write plugins while we correct and/or rewrite the
core, confident that their plugins will keep on working.

That said, RelFS.cpp is almost plain C, and it will stay that way or
become even more C-like, so feel free to audit and correct it. Even the
class relfs is useless and might be removed. In the new code, however,
there is a single parameter for all plugin operations (for ease of
development), of type "class File", so a bit of memory allocation is
required.

The rest of the program will remain in C++, I guess, if not translated
into a more usable language, because of the need for high level
libraries.

In fact I had defined LOCAL exactly as you suggested, but then I needed
strings in the rest of the program and so I opted for using strings
everywhere. I am not sure why I chose to add an initial "./"; feel free
to experiment with removing it to discover why :) However, the most
important speedups will come from:

1. using the threaded fuse mainloop. This is necessary, since we can't
block the whole filesystem waiting for a cdrom read. For now this would
mean implementing either a thread waiting for SQL queries and sending
them over a protected connection, or making as many SQL connections as
there are queries in the program, since libpqxx doesn't accept multiple
threads on a single opened connection. I would opt for solution 1.

2. asynchronous plugins. This is not as easy as it might seem, since we
can't just fork a process or a thread for each indexing operation, or
the system will almost freeze with tons of processes reading from disk.
It would be better to have a queue of indexers to run (and to remove
them from the queue if the file is removed in the meantime). Also, a
good idea would be to gather more information about what exactly
plugins will read from a file: if three different indexers will read a
huge file from beginning to end, it could be better to allow them to
declare it, and to pass the same data to all of them. Caching mitigates
this problem, but that doesn't mean we can't improve on it.
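The indexer queue described in point 2 might look something like this (a hypothetical sketch under the assumption of a single worker thread; all names are invented): pending files wait in a queue, duplicate requests are collapsed, and an entry can be cancelled if the file disappears before its turn.

```cpp
#include <deque>
#include <set>
#include <string>

// Hypothetical sketch of the indexing queue from point 2: one queue of
// pending paths instead of one thread per indexing operation.
class IndexQueue {
    std::deque<std::string> queue_;
    std::set<std::string> pending_; // collapse duplicate requests
public:
    // Schedule a file for (re)indexing; no-op if already queued.
    void push(const std::string& path) {
        if (pending_.insert(path).second)
            queue_.push_back(path);
    }
    // Drop a file that was removed before being indexed.
    void cancel(const std::string& path) { pending_.erase(path); }

    // Next path still worth indexing; "" when the queue is empty.
    std::string pop() {
        while (!queue_.empty()) {
            std::string path = queue_.front();
            queue_.pop_front();
            if (pending_.erase(path) == 1)
                return path; // not cancelled in the meantime
        }
        return "";
    }
};
```

A single worker draining this queue bounds the number of concurrent disk readers at one, which is exactly the "don't freeze the system with tons of indexing processes" property the message asks for.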
>
> I like the plugin proxy idea, because plugin calls (which could be
> slow) can be moved out of sync from the main process.
Oh, not only can they, but they _will_ be moved out of sync asap :) The
most important things for now are a stable index plugin interface and
a good data model. I will think about the data model in the coming days.
After these two steps, we will focus on getting maximum speed and
stability out of relfs.cpp, including a separate thread or process for
plugins. If other people agree, of course! I am no chief here, even if
I am the only coder right now :)

I would be glad to see patches to the code, but I recommend not taking
care of details at the moment, because it's highly probable that entire
files will be removed in this phase. It's better to focus on how
features can be implemented (and to see IF certain features can be
implemented at all using the current design).
bye
Vincenzo
--
I was dressed for success,
but success it never comes
[Pavement - Here]
|
|
From: Guido V. <gu...@vi...> - 2004-08-01 20:37:19
|
I'm giving a quick look at the source code...
Why not write:
#define LOCAL(path) (path +1)
instead of:
#define LOCAL(path) (string(".")+string(path))
?
You wouldn't need to use the string class everywhere, which would be better
(performance-wise), because you do not need all those memory allocations.
I think that the core of the filesystem implementation should be as fast as
possible, and then still a bit more: for example, I would suggest using C
instead of C++.
Filesystem calls are very very frequent, so they should be very small and
very optimised.
I like the plugin proxy idea, because plugin calls (which could be slow) can
be moved out of sync from the main process.
Just my two cents.
Regards,
Guido
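For what it's worth, the practical difference between the two LOCAL variants shows up on the root path (a standalone illustration, not code from either project): with `path + 1` the fuse root "/" becomes the empty string, whereas prefixing "." always yields a usable relative path.

```cpp
#include <string>

// Standalone illustration of the two LOCAL variants being compared.
// Fuse hands over paths rooted at "/", which must be mapped to paths
// relative to the backing directory.
std::string local_ptr(const char* path) { return std::string(path + 1); }
std::string local_str(const char* path) { return std::string(".") + path; }
```

local_ptr("/") yields "", which is not a valid argument for open(2) or stat(2), while local_str("/") yields "./"; that is the corner case that makes the allocation-free version less straightforward than it looks.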
|
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-08-01 17:46:45
|
When the cvs commit I just made becomes available, there will be a file named "PLUGIN_HOWTO.txt" in the top directory. There is an implementation for plugins, with a base class called "Index", defined in "IndexPlugin.h". Each operation of this class takes a "File" as an argument. "File" represents a cache of information we already know about files.

Things will change a lot. For now I would like to focus on the plugin interface. Once that is settled, an asynchronous proxy for plugins, a dlopen interface and all the rest can be written (since I realize that for now these are just source-code plugins, and cannot really be "plugged in"). I would like opinions on the interface.

The protected method "getParent" has a "watch_OPERATION" for each operation, which takes an "Index" as an argument. This allows a plugin to add other plugins to the indexing chain (e.g. when it's a proxy). The protected method "getConnection" is also provided. It can be overloaded; this has been done so that an "asynchronous proxy plugin" can be implemented which does not allow direct access to the main db connection (libpqxx does not allow concurrent access to a connection), but creates a new connection in the overloaded "getConnection" method.

Through the use of preprocessor macros (ARGH!) code size is greatly reduced, at the expense of readability if one does not read the "FOR_ALL" macros like the one at the top of "IndexPlugin.h".

The next step is to implement some useful plugins (those already present were implemented quick & dirty, and do not work properly; they must be regarded as examples).

Bye
Vincenzo |
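From the description above, the shape of the interface might be roughly as follows (a reconstruction for discussion only, reusing names from the message; the real IndexPlugin.h may differ in every detail): a base class `Index` whose operations each take a `File`, plus an overridable `getConnection` hook so an asynchronous proxy can hand out its own db connection.

```cpp
#include <memory>
#include <string>
#include <vector>

// Rough reconstruction of the interface described in the message;
// the actual IndexPlugin.h may differ in every detail.
struct File {            // cache of what we already know about a file
    std::string path;
    std::string mimetype;
};

struct Connection {      // stand-in for the libpqxx connection wrapper
    virtual ~Connection() = default;
};

class Index {
public:
    virtual ~Index() = default;
    // One operation per filesystem event, each taking a File.
    virtual void on_create(const File& f) = 0;
    virtual void on_write(const File& f) = 0;
    virtual void on_remove(const File& f) = 0;
protected:
    // Overridable so an asynchronous proxy plugin can create a fresh
    // connection instead of exposing the shared one (libpqxx does not
    // allow concurrent access to a single connection).
    virtual std::shared_ptr<Connection> getConnection() { return shared_; }
    // A proxy plugin can append other plugins to the indexing chain.
    void watch(std::shared_ptr<Index> next) { chain_.push_back(next); }
private:
    std::shared_ptr<Connection> shared_ = std::make_shared<Connection>();
    std::vector<std::shared_ptr<Index>> chain_;
};

// Example plugin (illustration only): counts the files it has indexed.
class CountingIndex : public Index {
public:
    int indexed = 0;
    void on_create(const File&) override { ++indexed; }
    void on_write(const File& f) override { on_create(f); }
    void on_remove(const File&) override { --indexed; }
};
```

The point of routing everything through `getConnection` is that plugin code never holds a raw connection, so moving plugins to another thread later only requires overriding one method.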
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-07-29 17:29:01
|
On Thursday 29 July 2004 16:13, Fabio Tranchitella wrote:

> I didn't try any database-based filesystem, but AFAIK Apple is developing a similar solution. I'm sure we need next generation filesystems, with automatic handling of file types, automatic indexing and a powerful search engine.

Like anybody willing to write something "different", I know I will have to face, sooner or later, the challenge of proving that I am not copying apple :) Well, the author of storage already had to do the same thing, see http://www.gnome.org/~seth/blog/document-indexing - basically apple's new jewel is an indexing system and not a complete filesystem.

> You answered me that mssql server is not lightweight. Yes, I agree, but PostgreSQL too. In general, a database-based filesystem will be slower (and slower) than a traditional tree-based filesystem.

Postgres is not going to eat that much memory, but in general what we should try to achieve is asynchronous indexing, in order to keep storage speed comparable to the native one.

> IMHO this type of filesystem can't replace traditional filesystems. I'd prefer to use a traditional fs for my debian system, and I think this is the best choice. A relational filesystem can be useful only for document and personal user folders. Do you agree?

Well, I think that there are many files in a unix filesystem which would live better if stored and retrieved through a database (this is not in contrast with the principle of keeping data on disk, but maybe that principle is too restrictive anyway). For example, linux distributions have huge package databases, which are often queried with grep or awk, and would fit perfectly into a database model (they _are_ databases in fact). Also, the ability to synthesize a file on the fly would allow for automatic maintenance of things like /etc/modutils/ -> /etc/modules, and so on. There are many kinds of files which would benefit from db indexing - if you have ever typed "find /usr/lib -exec nm "{}" ";" | grep something" you know what I mean.

> > Regarding storage, I have written to seth but received no reply. Either he is busy, or he does not trust an open development process. Or he just missed my message; I will try to contact him again, and on some mailing list, too.
>
> I think we have to avoid code duplication. I've not yet tried storage.. Did you do this? Is it usable? The best thing would be to extend storage, adding some features that are actually missing... Maybe relfs could be integrated in storage, as you suggested.

I still can't compile storage, but I read on seth's page that the version in gnome-cvs is not recent code. So our first task seems to be "look better at storage". I'll have a second look, if you can look at it yourself. In the meanwhile I have written a second e-mail to Seth asking about recent code.

> > (e.g. I would tend to keep real data on the hard drive, to ensure data visibility in case of the hard drive being mounted from a PC without postgresql - Maybe psql can do this by itself, I don't know yet)
>
> And in the case you used to mount a relfs filesystem "without postgresql" and removed, added or modified some documents, how could the metadata be updated? How do you imagine this case?

I am still unsure about how files should be _stored_, but regarding indexing: if plugins have the proof obligation that their effects on the db are the same whenever the data stored in a file is the same, we can reindex modified files (sigh, looking at the modification time or upon user request...). It's not an ideal solution, but I can't think of anything better right now without totally replacing the filesystem with the database (and that way destroying any hope of native-like speed, I suppose, not to mention disk usage).

Metadata can be stored, as a backup, in XFS/patched ext2/ext3 extended attributes (but will not be updated on external modifications). The main reason to store data on the disk, besides speed, is that users need to trust the system. Nobody would use a system which makes user files "disappear" from the disk.

Bye
Vincenzo |
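The modification-time fallback mentioned for reindexing could be as simple as this (a hypothetical sketch; the stored mtime would really come from the database): compare the mtime recorded when the file was last indexed against the file's current mtime.

```cpp
#include <ctime>
#include <string>
#include <sys/stat.h>

// Hypothetical sketch: decide whether a file must be reindexed by
// comparing its current mtime with the one recorded at indexing time
// (here passed in directly; relfs would read it from the database).
bool needs_reindex(const std::string& path, std::time_t indexed_mtime) {
    struct stat st;
    if (::stat(path.c_str(), &st) != 0)
        return false; // file gone; removal is handled elsewhere
    return st.st_mtime > indexed_mtime;
}
```

This only works because of the proof obligation stated above: if a plugin's effect on the db depends only on the file's contents, then "contents unchanged since last index" implies "db rows still valid", and mtime is a cheap (if imperfect) proxy for "contents unchanged".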
|
From: Fabio T. <ko...@ko...> - 2004-07-29 14:13:33
|
On Thu, Jul 29, 2004 at 03:24:25PM +0200, Vincenzo aka Nick Name wrote:

> To summarize what I think about winfs: "I am not used to believing microsoft rumours until I see the product" and "I won't buy a 1gb memory addon for my laptop just to open a desktop folder" :) I have been using ms sql server at work, and sure it is not lightweight. Moreover, I don't expect too many features to be included in the first release of winfs, or extensibility through plugins.

I didn't try any database-based filesystem, but AFAIK Apple is developing a similar solution. I'm sure we need next generation filesystems, with automatic handling of file types, automatic indexing and a powerful search engine.

You answered me that mssql server is not lightweight. Yes, I agree, but PostgreSQL too. In general, a database-based filesystem will be slower (and slower) than a traditional tree-based filesystem.

IMHO this type of filesystem can't replace traditional filesystems. I'd prefer to use a traditional fs for my debian system, and I think this is the best choice. A relational filesystem can be useful only for document and personal user folders. Do you agree?

> Regarding storage, I have written to seth but received no reply. Either he is busy, or he does not trust an open development process. Or he just missed my message; I will try to contact him again, and on some mailing list, too.

I think we have to avoid code duplication. I've not yet tried storage.. Did you do this? Is it usable? The best thing would be to extend storage, adding some features that are actually missing... Maybe relfs could be integrated in storage, as you suggested.

> RelFS and Storage are partially overlapping projects: both have a compatibility layer for other applications (that of storage is gnomevfs, but I don't think there could be any objection to a kernel interface), and both extract information from files in real time, storing it into an SQL database, so the best thing to do would be to use relfs as the lower-level layer of storage, which could then focus on its higher level goals, such as the natural language query interface. We'll see, but consider that there are differences between the two approaches (e.g. I would tend to keep real data on the hard drive, to ensure data visibility in case of the hard drive being mounted from a PC without postgresql - Maybe psql can do this by itself, I don't know yet). Moreover, storage has a very simple data model; I am going to use a richer data model.

And in the case you used to mount a relfs filesystem "without postgresql" and removed, added or modified some documents, how could the metadata be updated? How do you imagine this case?

> I encourage people to download the CVS version and try it, then decide what you would like to do next, and start a thread on this mailing list to discuss the details.

I'll do this ASAP.

Talk to you later,
Fabio.

--
Fabio Tranchitella <!> kobold.it, Turin, Italy - Free is better!
-----------------------------------------------------------------------
<http://www.kobold.it>, <ko...@ko...>, <ko...@ja...>
-----------------------------------------------------------------------
GPG Key fingerprint: 5465 6E69 E559 6466 BF3D 9F01 2BF8 EE2B 7F96 1564 |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-07-29 13:24:28
|
On Thursday 29 July 2004 11:27, Fabio Tranchitella wrote:
> Hi, I'm really interested in developing a relational filesystem.
> I've heard about WinFS and Storage, but I haven't tried either of them.
> Here are some questions about relfs:

Welcome, you are the very first :)

To summarize what I think about WinFS: "I am not inclined to believe Microsoft rumours until I see the product" and "I won't buy a 1 GB memory upgrade for my laptop just to open a desktop folder" :) I have been using MS SQL Server at work, and it is certainly not lightweight. Moreover, I don't expect too many features to be included in the first release of WinFS, or extensibility through plugins.

> 1. Why not simply contribute to Storage? And how much does relfs
> differ from Storage?

Regarding Storage, I have written to Seth but received no reply. Either he is busy, or he does not trust an open development process. Or he just missed my message; I will try to contact him again, and on some mailing list too.

RelFS and Storage are partially overlapping projects: both have a compatibility layer for other applications (Storage's is gnomevfs, but I don't think there could be any objection to a kernel interface), and both extract information from files in real time, storing it into an SQL database. So the best thing to do would be to use relfs as the lower-level layer of Storage, which could then focus on its higher-level goals, such as the natural language query interface. We'll see, but consider that there are differences between the two approaches (e.g. I would tend to keep the real data on the hard drive, to ensure data visibility in case the drive is mounted from a PC without PostgreSQL - maybe psql can do this by itself, I don't know yet). Moreover, Storage has a very simple data model; I am going to use a richer one.

> 2. Which features do you plan to add?
> I'd like to see a good implementation of trash, undo and document
> history.

The very first thing to do is a plugin architecture, so that the fuse daemon does not grow beyond 3-4k lines of code. A user should be able to attach plugins at runtime.

Then, I would like to take advantage of such a powerful system in any way we can think of: we are already spending CPU cycles and IPC latency, so at least we should get some benefit from it. The coolest features that come to mind are directories which represent queries; the ability to update fields in the SQL database, which then update files in real time using triggers; and directories which automatically do Bayesian classification of all the user's files, using as input the files a user adds to or removes from them. But there are many other ideas.

I wrote a detailed TODO.txt, but as you can see... the second thing to do is a trash implementation :)

> 3. How can I contribute? :-)

It depends on how comfortable you feel with the following subjects:
- C++ coding
- database design
- system security
- postgres programming, or ODBC programming in case you are willing to try that route (I don't have time to do it)

I accept contributions in any area, and will not refuse CVS write access to people in the future. For now we should focus on:

1. detailing what we expect from the system
2. deciding the "next" feature to add to the codebase, in order to satisfy the requirements chosen in 1
3. implementing the feature
4. testing it, possibly with unit tests too, then going back to 1/2.

In parallel, we should decide on a good data model for the database.

I encourage people to download the CVS version and try it, then decide what you would like to do next, and start a thread on this mailing list to discuss the details.

Bye
Vincenzo

[Will add a FAQ section on the website with part of this e-mail] |
|
From: Fabio T. <ko...@ko...> - 2004-07-29 09:27:59
|
Hi, I'm really interested in developing a relational filesystem.
I've heard about WinFS and Storage, but I haven't tried either of them.
Here are some questions about relfs:
1. Why not simply contribute to Storage? And how much does relfs differ
from Storage?
2. Which features do you plan to add? I'd like to see a good
implementation of trash, undo and document history.
3. How can I contribute? :-)
Thanks for working on it,
Talk to you later.
--
Fabio Tranchitella
<!> kobold.it, Turin, Italy - Free is better!
-----------------------------------------------------------------------
<http://www.kobold.it>, <ko...@ko...>, <ko...@ja...>
-----------------------------------------------------------------------
GPG Key fingerprint: 5465 6E69 E559 6466 BF3D 9F01 2BF8 EE2B 7F96 1564
|
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-07-28 11:42:13
|
[ ] Yes it is [ ] No it's not |
|
From: Vincenzo a. N. N. <vin...@ya...> - 2004-07-28 11:34:57
|
-- Microsoft is trying to patent virtual desktops. I didn't hate Microsoft before this, but now I do. The full story: http://yro.slashdot.org/yro/04/02/25/1346201.shtml