I think this is a good direction to go. It's becoming less practical to leave files or data exactly where they are crawled.

I'd offer the notion of an abstraction that handles the location and URI assignment for files, allowing their
physical location to be re-assigned, perhaps during the crawling process.

In that vein, perhaps something like a custom scheme has advantages over relative URIs (is there such a thing?).
e.g. aperture://some-identifier-location

A good use case for this could be: I am crawling files on removable media and in the process, copying them
to a cloud drive. This movement of the original data requires assigning a new URI/URL. To hide the actual
physical transport behind the URI, standard or custom scheme handlers can thus be used.
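
To make the idea a bit more concrete, here is a minimal sketch of such an
abstraction. Everything in it is an assumption for illustration only: the
LocationRegistry class, its method names and the "aperture" scheme are
hypothetical and not part of Aperture. It only shows a stable identifier being
kept while the physical location behind it is re-assigned.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry mapping stable aperture:// URIs to current physical locations. */
class LocationRegistry {
    // Assumed scheme name, taken from the example in this mail.
    static final String SCHEME = "aperture";

    private final Map<URI, URI> physicalByStable = new HashMap<>();

    /** Assigns a stable URI to an identifier and records where the data lives right now. */
    URI register(String identifier, URI physicalLocation) {
        URI stable = URI.create(SCHEME + "://" + identifier);
        physicalByStable.put(stable, physicalLocation);
        return stable;
    }

    /** Re-points an already registered stable URI after the data has been moved. */
    void relocate(URI stable, URI newPhysicalLocation) {
        if (!physicalByStable.containsKey(stable)) {
            throw new IllegalArgumentException("Unknown URI: " + stable);
        }
        physicalByStable.put(stable, newPhysicalLocation);
    }

    /** Returns the current physical location behind a stable URI. */
    URI resolve(URI stable) {
        URI physical = physicalByStable.get(stable);
        if (physical == null) {
            throw new IllegalArgumentException("Unknown URI: " + stable);
        }
        return physical;
    }
}
```

In the removable-media case, the crawler would call register() when it first
sees a file and relocate() once the copy to the cloud drive has finished.
Anything that stored the aperture:// URI keeps working, because only resolve()
needs to know the physical location.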

On Tue, 2010-06-29 at 17:59 +0200, Antoni Mylka wrote:
Hello Aperturians,

I'd like to know what you think about making a new release.

1. It's been six months since the last one.
2. We've simplified the codebase and slashed the module count.
3. The tracker shows 6 bugs and 7 feature requests fixed.

But most importantly I'd like to start working on a new, broad topic - 
relative URIs. Right now, when a crawler crawls a file - it's assigned 
an absolute URI, with the drive letter and a full path. If you want to 
use a DataAccessor on that file later, or annotate it - it needs to 
reside in the same physical location.

Sebastian Trüg from the Nepomuk KDE community noticed this problem some 
time ago:


Now this is becoming an important problem for Aduna too. We'd like to 
crawl a folder on a usb-stick, remove the stick and reattach it under a 
different drive letter without losing the ability to use DataAccessors 
on it.

Doing it properly will require substantial changes to all URI-generating 
and URI-consuming components - crawlers, subcrawlers and accessors. 
Basically everything should keep working as it does now, but there 
should be a magic switch that will turn everything into a "relative" 
mode: all new URIs will be relative to a given base, while all 
accessed URIs will be converted from relative to absolute with a 
given base.
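
A rough sketch of the conversion described above, using only the standard 
java.net.URI methods relativize() and resolve(). The RelativeUris class and 
its method names are assumptions made for illustration; this is not how 
Aperture does it today, and the real change would have to touch crawlers, 
subcrawlers and accessors.

```java
import java.net.URI;

/** Hypothetical helper illustrating the "relative mode" conversions. */
class RelativeUris {

    /** Crawl side: turns an absolute URI into one relative to the given base. */
    static URI toRelative(URI base, URI absolute) {
        // relativize() returns the argument unchanged if it is not under the base.
        return base.relativize(absolute);
    }

    /** Access side: turns a stored relative URI back into an absolute one. */
    static URI toAbsolute(URI base, URI relative) {
        return base.resolve(relative);
    }
}
```

With the usb-stick example: a file crawled under the base file:///E:/ would be 
stored as docs/report.pdf, and after the stick is reattached as F: the same 
stored URI could be resolved against file:///F:/ before a DataAccessor uses it.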

I have some ideas how this might be done in a backward-compatible 
manner, but before that, I'd like to brush up on the issues left undone in 
March, namely:

  - making sure that sources and javadocs are neatly wrapped up in 
  - creating a separate assembly for the big osgi bundle

... and release what we have as 1.5 (some new classes, e.g. 
PoiXmlExtractor mean that 1.4.1 would be against the rules).

What do you think?

Antoni Myłka
