Re: [Relfs-devel] New RelFS prototype, with query ability - and binary release

On Wednesday 14 December 2005 23:13, rel...@li... wrote:
> Hi Vincenzo et al,
>
> As far as I can see the only way round this is to use the physical part of
> the disk as the id of the file, and the very long name as the human
> readable format that a user chooses.
> So it means going down to disk level to find out a unique identifier,
> like cylinders and heads and sectors, or even disk address!
> although there is the problem of transferring between disks, which would
> mean relfs depending on the human readable format maybe maybe not, as
> transferring it would take the first available numbers of an id that is not
> an address but an incremental number that relfs would recognise as such, it
> being not on the same disk as the operating system, but transferring to a
> disk would have to be different.
> Does this give any light to the obnoxious problem of:
>
> ->"get(ting) around what to do with identical filenames in one directory."
>
> Hal

Well, I am convinced that once a file has obtained a unique id in some way,
that id can be exposed as an extended attribute, so smart user programs will be
able to tell files apart, as KDE already does with the "trash:" protocol
and as any e-mail client does when it shows different messages that have the
same visible fields.

1. obtaining ids

In the long run I think the best way to obtain ids for files would be to
use an incremental hashing algorithm, i.e. one that can update a file's hash
just by rehashing the block that changed (whether such an algorithm can be
robust, I don't know). For example, one could compute an md5 for each block
and then md5 the resulting list of digests. This way hashing can be
"efficient", at least in principle.
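
A minimal sketch of the per-block idea, in Python. The block size and the
function names are assumptions made for illustration; none of this is RelFS
code. It keeps one md5 digest per block and derives the file id from the list
of digests, so a change inside one block only requires rehashing that block
plus the (much shorter) digest list.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, not taken from RelFS


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Return the md5 digest of each fixed-size block of data."""
    return [
        hashlib.md5(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def file_id(digests: list[bytes]) -> str:
    """md5 over the concatenated block digests: the id of the whole file."""
    return hashlib.md5(b"".join(digests)).hexdigest()


def rehash_block(digests: list[bytes], data: bytes, index: int,
                 block_size: int = BLOCK_SIZE) -> str:
    """Recompute one block digest in place, then return the new file id."""
    start = index * block_size
    digests[index] = hashlib.md5(data[start:start + block_size]).digest()
    return file_id(digests)


# Usage: change one byte inside block 1 and rehash only that block.
data = bytearray(b"a" * 10000)
digests = block_digests(bytes(data))
old_id = file_id(digests)
data[5000] = ord("b")
new_id = rehash_block(digests, bytes(data), 1)
```

The id computed this way differs from a plain md5 of the whole file, and
inserting or deleting bytes shifts every later block, so the scheme only pays
off for in-place writes.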

Another possibility that was mentioned is to use UUIDs, but I would prefer
hashes because they can be used to distribute the filesystem over a network
(even existing peer-to-peer networks) and they persist when files are moved.
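
To illustrate the difference, here is a small Python sketch (the helper name
content_id is hypothetical). Two copies of the same bytes get the same
content-derived id on any machine, while two freshly generated random UUIDs do
not match unless the UUID is carried along with the file.

```python
import hashlib
import uuid


def content_id(data: bytes) -> str:
    """Id derived only from the file contents (md5, as in the example above)."""
    return hashlib.md5(data).hexdigest()


original = b"same bytes on two different disks"
moved_copy = bytes(original)  # stands in for the file after a move or copy

same_hash = content_id(original) == content_id(moved_copy)  # True
same_uuid = uuid.uuid4() == uuid.uuid4()                     # False
```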

For now, ids will remain plain integers obtained from a psql sequence.

2. presenting files to "non-smart" applications

What should we show in "ls" when there are two files with the same name? I
think we already decided to adopt a quick-and-dirty strategy (which I should
have implemented three weeks ago; it will come "soon" :)) and then wait for
enhancement requests.

Does all of this sound good to you, or would you prefer another solution?

Vincenzo