From: Diego Menéndez <dfmenendez@ps...> - 2012-05-10 20:54:26
I have recently installed and been trying to gather some experience with
DSpace. After a couple of weeks, some questions have come up for which I
have not been able to find answers, even after searching the list archives.
Is DSpace able to retrieve possibly large data files directly from tape
archives? I wonder how the response to the user would be handled if the
file is not readily available and the user may have to wait until the
tape is placed in the tape drive. Alternatively, is there any extension to
achieve that? If yes, has anyone experience with that that would like to
I have found some questions regarding tape usage for backup purposes;
however, I am interested in data retrieval directly from tapes, a
different usage model.
Thank you in advance,
On Thu, May 10, 2012 at 04:54:17PM -0400, Diego Menéndez wrote:
> Is DSpace able to retrieve possibly large data files directly from tape
> archives? I wonder how the response to the user would be handled if the
> file is not readily available and the user may have to wait until the
> tape is placed in the tape drive. Alternatively, is there any extension to
> achieve that? If yes, has anyone experience with that that would like to
I don't believe there is anything in stock DSpace to do this.
There are several layers to this problem:
o You noted the issue of long access delays. I don't recall anything
in the user interface design which would permit the insertion of a
"please wait for retrieval of offline resources" page in the flow.
One might return a status page saying that the request has failed
temporarily because the file is being fetched, and invite the user
to repeat the request after a short wait. One could even send an
email to a registered user when the bitstream is available. (Heh,
extend the EPerson model a little and one could offer to send a TXT
o I would approach the actual storage and retrieval as a new type of
assetstore. It would have to deal with the linearity of tape
storage and possibly the great length of bitstreams' internal
identifiers. You'd probably want a catalog of tape contents stored
in a new database table. It sounds like it would be fun to write.
for the place to begin exploring.
(I just realized that you may be talking about something like a tape
containing a single 'tar' archive or the like, or even several
tapes containing segments of such an archive. I had been thinking
of individual files on ANSI-labelled tapes (which may give you a
clue to my age :-) . Archive container files could be even more fun.)
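As a rough illustration of the "catalog of tape contents stored in a new
database table" idea, here is a minimal sketch in Python using SQLite. It is
not DSpace code (DSpace itself is Java), and everything in it is an
assumption made for illustration: the table name tape_catalog, its columns,
and the helpers open_catalog, register, locate and plan_reads are all
hypothetical. Identifiers are stored as text because of their possible
length, and pending reads are sorted by tape and position so that each tape
can be read in a single forward pass.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open the (hypothetical) tape catalog, creating the table if needed."""
    db = sqlite3.connect(path)
    # bitstream_id: the assetstore's internal identifier, kept as text
    # tape_label:   which tape volume holds the file
    # file_seq:     position of the file on that tape (1 = first file)
    db.execute(
        """CREATE TABLE IF NOT EXISTS tape_catalog (
               bitstream_id TEXT PRIMARY KEY,
               tape_label   TEXT NOT NULL,
               file_seq     INTEGER NOT NULL,
               size_bytes   INTEGER NOT NULL
           )"""
    )
    return db


def register(db, bitstream_id, tape_label, file_seq, size_bytes):
    """Record where one bitstream lives on tape."""
    db.execute(
        "INSERT INTO tape_catalog VALUES (?, ?, ?, ?)",
        (bitstream_id, tape_label, file_seq, size_bytes),
    )
    db.commit()


def locate(db, bitstream_id):
    """Return (tape_label, file_seq) for a bitstream, or None if unknown."""
    return db.execute(
        "SELECT tape_label, file_seq FROM tape_catalog WHERE bitstream_id = ?",
        (bitstream_id,),
    ).fetchone()


def plan_reads(db, bitstream_ids):
    """Order requested bitstreams by tape, then by position on the tape."""
    ids = list(bitstream_ids)
    if not ids:
        return []
    marks = ",".join("?" * len(ids))  # one placeholder per requested id
    return db.execute(
        "SELECT bitstream_id, tape_label, file_seq FROM tape_catalog "
        f"WHERE bitstream_id IN ({marks}) ORDER BY tape_label, file_seq",
        ids,
    ).fetchall()
```

A request queue could call plan_reads on everything currently waiting, so
that two files on the same tape are fetched with one mount and no rewinding
between them.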
I'd probably want to cache a few recently-retrieved files on disk.
More fun stuff to write.
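The disk cache, together with the "try again after a short wait" response
suggested earlier, might look something like the sketch below. Again this is
Python for illustration only, not DSpace code, and the class StagingCache
with its request and staged methods is hypothetical. It only does the
bookkeeping: a real version would also copy files from tape to disk and
delete evicted files. Eviction here is least-recently-used, which is one
reasonable policy among several.

```python
from collections import OrderedDict


class StagingCache:
    """Track which bitstreams are staged on disk, within a byte budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self._entries = OrderedDict()  # bitstream_id -> size, oldest first
        self.pending = set()           # requested but not yet read from tape

    def request(self, bitstream_id):
        """Return "ready" on a cache hit; otherwise queue a tape fetch
        and return "retry-later" so the caller can show a status page."""
        if bitstream_id in self._entries:
            self._entries.move_to_end(bitstream_id)  # mark as recently used
            return "ready"
        self.pending.add(bitstream_id)
        return "retry-later"

    def staged(self, bitstream_id, size_bytes):
        """Record that a tape read finished. Returns False if the file is
        too large to cache; otherwise evicts old entries to make room."""
        self.pending.discard(bitstream_id)
        if size_bytes > self.max_bytes:
            return False
        if bitstream_id in self._entries:
            self.used -= self._entries.pop(bitstream_id)
        while self.used + size_bytes > self.max_bytes:
            # Drop the least recently used entry (a real cache would also
            # delete the corresponding file from disk here).
            _, evicted_size = self._entries.popitem(last=False)
            self.used -= evicted_size
        self._entries[bitstream_id] = size_bytes
        self.used += size_bytes
        return True
```

The "retry-later" result is where a user interface could show the "being
fetched, please try again shortly" page, or queue an email to a registered
user for when staged is eventually called.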
I wonder whether the cost of developing all that would be less than
the cost of enough disk drives to replace the tapes. Or are these
existing tapes to be registered in the assetstore as-is?
I have heard rumors that someone at IU Bloomington has an assetstore
implemented on IBM HPSS nearline (tape staged to disk) storage. I
haven't heard any details. The tape robots and staging process are
speedy -- for tape -- but would still require considerable patience
from the user if the required bitstream is not currently cached.
Mark H. Wood, Lead System Programmer mwood@...
Asking whether markets are efficient is like asking whether people are smart.