From: Darren G. <da...@on...> - 2010-07-06 21:20:22
Nicely done. An interesting thought to add (although probably a
digression) is that, with the URI indirection provided by aperture://,
you can easily write an Aperture URL scheme handler that performs the
necessary dereferencing and exposes the file stream data. In this way,
one can use new URL("aperture://my-file-uri") to read from the Aperture
virtual file-system-like namespace. As such, you now have a common
virtual URI naming scheme that can span multiple distributed or
otherwise disparate file systems, allowing the physical storage of
those files to be managed transparently over time.

On Tue, 2010-07-06 at 20:22 +0200, Antoni Mylka wrote:
> Hello All,
>
> I've summarized this discussion on a wiki page:
>
> https://sourceforge.net/apps/trac/aperture/wiki/SupportForMovingDataObjects
>
> more comments inline
>
> On 2010-07-06 10:35, Leo Sauermann wrote:
> > Hi,
> >
> > It was Christiaan Fluit who said at the right time 05.07.2010 17:19 the
> > following words:
> >> Storing it in the DataSource sounds like the best way to me. Both the
> >> FileSystemDataSource and MboxDataSource could use this.
> >>
> >> Storing both the original and the current prefix is ok - just the
> >> current one would also be enough for our purposes though.
> >>
> > I would really enforce that we have "valid" URIs stored in the data
> >
> > we need the originalURI because the code will somehow be like this:
> > (I make up the method names, too lazy to look into the code now, but you
> > get the picture)
> >
> > String uri = dataobject.getURI(); // the URI as stored or as crawled
> > String uriprefixstored = datasource.getConfiguration().getUriPrefixStored();
> > if (!uri.startsWith(uriprefixstored)) throw new IllegalStateException("42! the universe ends now");
> > String strippedUri = uri.substring(uriprefixstored.length());
> > String accessibleUri =
> >     datasource.getConfiguration().getUriPrefixCurrent() + strippedUri;
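
The prefix swap in the pseudocode above can be written as a small, compilable Java sketch. The class and method names (UriPrefixRewriter, toAccessibleUri) are hypothetical and not part of the Aperture API; the stored and current prefixes are assumed to be passed in as plain strings, however the DataSource configuration ends up exposing them.

```java
/**
 * Hypothetical helper, not part of Aperture: swaps the prefix a URI was
 * stored with for the prefix under which the data is reachable right now.
 */
final class UriPrefixRewriter {

    private UriPrefixRewriter() {
        // static utility, no instances
    }

    /**
     * @param uri           the URI as stored or as crawled
     * @param storedPrefix  the prefix recorded when the URI was stored
     * @param currentPrefix the prefix under which the data is mounted now
     * @return the URI with storedPrefix replaced by currentPrefix
     * @throws IllegalArgumentException if uri does not start with storedPrefix
     */
    static String toAccessibleUri(String uri, String storedPrefix, String currentPrefix) {
        if (!uri.startsWith(storedPrefix)) {
            throw new IllegalArgumentException(
                    "URI '" + uri + "' does not start with stored prefix '" + storedPrefix + "'");
        }
        // Keep everything after the stored prefix and re-root it.
        String strippedUri = uri.substring(storedPrefix.length());
        return currentPrefix + strippedUri;
    }
}
```

For instance, toAccessibleUri("file:///G:/myfolder/myfile.txt", "file:///G:/", "file:///Z:/") re-roots the URI at file:///Z:/. The same call works unchanged if the stored prefix is an aperture:// one, which is why the two proposals in this thread need the same two strings.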
> >
> > ==> hence, you need both
>
> My proposal was about using an 'aperture://' uri scheme and substituting
> it with 'currentPrefix'. It's the same though: we store two strings in
> the data source and substitute the occurrence of one string in the uri
> with another string. Whether we want "valid" uris in RDF is a matter of
> taste. IMHO if we say that we have a file uri
>
> 'file:///G:/myfolder/myfile.txt'
>
> but in fact G is a usb stick, which has been remounted now and is in
> fact Z:, then this URI is not a URL of the file. Moreover, there may be
> another usb stick with completely unrelated content which happens to be
> mounted under G: and happens to have those files.
>
> My proposal is similar to Sebastian's: use string ids. Sebastian gets
> the IDs from the USB driver; we could let the user invent them. Since
> they are stored with the DataSource configuration, they are always
> applied to that particular data source and therefore don't even have to
> be unique between data sources (if a user needs to be sure that files
> from different sources get different uris, he/she should take care
> of the uniqueness of the ids by him/herself).
>
> >> The only thing I am still thinking about is the fact that in our system
> >> all DataSources will have the same prefix value, i.e. the same value is
> >> duplicated a couple of times. Duplication is usually not a good thing,
> >> but perhaps it gets too complex if we create a shared storage for this?
> >>
> > It is duplicated, but on the other hand it is much more stable (=
> > self-contained) and readable.
>
> I had this idea too. We could create a global map of id<->urisubstrings,
> place it in ApertureRuntime, propagate it to all registries via
> constructors, they would propagate it to all factories via constructors,
> which would propagate it to all crawlers, accessors and openers. This
> would be quite a lot of work and would somehow feel less "clean" and
> "modular".
> It's a matter of debate though: do we want ApertureRuntime to
> be a one-stop-shop for all Aperture users and store aperture-wide state
> there, or do we want to keep everything separate as it is now - food for
> thought for Aperture 2.
>
> I'd go for storing the two strings in DataSource and providing a static
> utility class that would perform the conversion appropriately.
>
> > I am conservative and boring here:
> > just keep the system "as is" working and keep the URIs "valid", instead
> > of relying only on the suffix relative to the configured one.
> >
> > But PLEASE - hit this proposal and invalidate it and show us that my
> > thinking is wrong and harmful,
> > because I don't want to cause technical trouble with this proposal
>
> Once again, it's a matter of taste.
>
> file:///G:/myfolder/myfile.txt
>
> aperture://thumbdrive/myfolder/myfile.txt
>
> Which is better, if myfile.txt is actually on a Z:/ disk at the moment?
>
> Both approaches would work exactly the same with the ideas I outlined on
> the wiki page.
>
> All kinds of comments welcome.
>
> Antoni Mylka
> ant...@gm...
>
> _______________________________________________
> Aperture-devel mailing list
> Ape...@li...
> https://lists.sourceforge.net/lists/listinfo/aperture-devel
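
To make the aperture:// handler idea from the top of this message a bit more concrete, here is a rough Java sketch built on the standard java.net.URLStreamHandler. Everything Aperture-specific in it is an assumption: the class name ApertureUrlHandler is made up, the host part of the URL is taken to be the user-chosen id (as in aperture://thumbdrive/...), and the id-to-current-prefix map stands in for whatever the DataSource configuration would actually store.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.util.Map;

/**
 * Hypothetical handler, not part of Aperture: resolves
 * aperture://<id>/<path> against the prefix currently configured for
 * <id> and delegates the actual I/O to the resolved URL.
 */
final class ApertureUrlHandler extends URLStreamHandler {

    // Assumed shape of the configuration: id -> current prefix,
    // e.g. "thumbdrive" -> "file:///Z:/".
    private final Map<String, String> currentPrefixById;

    ApertureUrlHandler(Map<String, String> currentPrefixById) {
        this.currentPrefixById = Map.copyOf(currentPrefixById);
    }

    /** Maps an aperture:// URL to the URL where the data lives right now. */
    URL resolve(URL apertureUrl) throws IOException {
        String id = apertureUrl.getHost();
        String prefix = currentPrefixById.get(id);
        if (prefix == null) {
            throw new IOException("No current prefix configured for id '" + id + "'");
        }
        String path = apertureUrl.getPath();
        String relative = path.startsWith("/") ? path.substring(1) : path;
        String base = prefix.endsWith("/") ? prefix : prefix + "/";
        // URI.create rejects illegal characters such as spaces, so the
        // aperture:// URL is expected to be percent-encoded already.
        return URI.create(base + relative).toURL();
    }

    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        // Dereference first, then let the built-in handler (file:, http:, ...)
        // for the resolved URL do the real work.
        return resolve(url).openConnection();
    }
}
```

With a handler instance at hand, new URL(null, "aperture://thumbdrive/myfolder/myfile.txt", handler).openStream() reads the file from wherever the id currently points (that three-argument constructor is deprecated in recent JDKs but still available). Getting the plain new URL("aperture://...") form to work would additionally require registering the handler for the "aperture" protocol, for example through URL.setURLStreamHandlerFactory; without that, the JDK rejects the unknown protocol.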