I think this is a good direction to go. It's becoming less practical to leave files or data exactly where they were crawled.
I'd offer the notion of an abstraction that handles the location and URI assignment for files, allowing their
physical location to be re-assigned, perhaps during the crawling process.
In that vein, perhaps something like a custom scheme has advantages over relative URIs (is there such a thing?).
A good use case for this could be: I am crawling files on removable media and in the process, copying them
to a cloud drive. This movement of the original data requires assigning a new URI/URL. To hide the actual
physical transport behind the URI, standard or custom scheme handlers can thus be used.
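To make the idea a bit more concrete, here is a minimal sketch of such an abstraction in plain Java. It is not part of Aperture: the class name LocationRegistry, the "crawl" scheme, and the example URIs are all made up for illustration. The point is only that annotations would refer to a stable logical URI, while the physical location behind it can be re-assigned when the data moves.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not an Aperture class: maps a stable logical URI
// (here using an invented "crawl" scheme) to the current physical location.
class LocationRegistry {
    private final Map<URI, URI> physicalByLogical = new HashMap<>();

    // Assign or re-assign the physical location behind a logical URI.
    void assign(URI logical, URI physical) {
        physicalByLogical.put(logical, physical);
    }

    // Look up where the data currently lives; empty if the URI is unknown.
    Optional<URI> resolve(URI logical) {
        return Optional.ofNullable(physicalByLogical.get(logical));
    }
}
```

In the removable-media case, the crawler would first call assign(URI.create("crawl://usb-stick-1/photos/a.jpg"), URI.create("file:///E:/photos/a.jpg")) and, after the copy to the cloud drive, call assign again with the new physical URL. Anything stored against the logical URI stays valid across the move.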
On Tue, 2010-06-29 at 17:59 +0200, Antoni Mylka wrote:
I'd like to know what you think about making a new release.
1. It's been six months since the last one.
2. We've simplified the codebase and slashed the module count.
3. The tracker shows 6 bugs and 7 feature requests fixed.
But most importantly I'd like to start working on a new, broad topic -
relative URIs. Right now, when a crawler crawls a file - it's assigned
an absolute URI, with the drive letter and a full path. If you want to
use a DataAccessor on that file later, or annotate it - it needs to
reside in the same physical location.
Sebastian Trüg from the Nepomuk KDE community noticed this problem some
Now this is becoming an important problem for Aduna too. We'd like to
crawl a folder on a usb-stick, remove the stick and reattach it under a
different drive letter without losing the ability to use DataAccessors
Doing it properly will require substantial changes to all URI-generating
and URI-consuming components - crawlers, subcrawlers and accessors.
Basically everything should keep working as it does now, but there
should be a magic switch that will turn everything into a "relative"
mode and all new URIs will be relative to a given base, while all
accessed URIs will be converted from relative to absolute - with a
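As a rough illustration of the two conversions mentioned here, the sketch below uses only the standard java.net.URI class. It is not existing Aperture code: the class name RelativeUris and the example base folders are assumptions made for the example.

```java
import java.net.URI;

// Hypothetical helper, not an Aperture class: converts between absolute
// URIs and URIs relative to a configurable base.
class RelativeUris {
    // Store side: strip the base from an absolute URI produced by a crawler.
    // URI.relativize returns the input unchanged if it is not under the base.
    static URI toRelative(URI base, URI absolute) {
        return base.relativize(absolute);
    }

    // Access side: rebuild an absolute URI from a stored relative one,
    // using whatever base is valid at access time.
    static URI toAbsolute(URI base, URI relative) {
        return base.resolve(relative);
    }
}
```

For example, with the base file:///E:/docs/ at crawl time, the file file:///E:/docs/reports/a.pdf would be stored as reports/a.pdf. If the stick later shows up under F:, resolving that stored value against file:///F:/docs/ gives a URI whose path is /F:/docs/reports/a.pdf. The base needs a trailing slash for this to work as expected.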
I have some ideas how this might be done in a backward-compatible
manner, but before, I'd like to brush up on the issues left undone in
- making sure that sources and javadocs are neatly wrapped up in
- creating a separate assembly for the big OSGi bundle
... and release what we have as 1.5 (some new classes, e.g.
PoiXmlExtractor mean that 1.4.1 would be against the rules).
What do you think?
Aperture-devel mailing list