On Sun, Aug 16, 2009 at 5:42 AM, Matteo
> First of all, thanks for the reply and the answer.
> On Fri, Aug 14, 2009 at 8:56 PM, Chris Wilper <cwilper@...> wrote:
>> Hi Matteo,
>> When ingesting an object with managed content, you have a couple
>> options. One is to provide it in base64 format inside the FOXML
>> itself. (export an object with some managed content in the "archive"
>> context to see an example). The other option is to provide it by
>> reference. With this option, you give it an HTTP URL (you have to
>> have a webserver fronting the content) and Fedora copies the content
>> into the repository at ingest time.
>> More detail:
>> For managed content, the way content is included (or referenced)
>> within FOXML depends on a couple factors.
>> If the FOXML is about to be ingested, managed datastreams can be
>> included as base64-encoded content right inside the XML. It can also
>> be referenced via URL (http). In the latter case, Fedora retrieves
>> the content automatically as part of the ingest operation.
> I fear I cannot use such an option, since I have something like 1M objects,
> each with a datastream of a few MB associated.
Yes, the inline base64 option is best for few, small files.
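To put a rough number on that: base64 turns every 3 bytes into 4 characters, so inline content grows by about a third before it even reaches the XML parser. A minimal sketch, assuming a hypothetical 3 MB datastream (your actual sizes will differ):

```python
import base64


def base64_size(raw_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


# Hypothetical 3 MB datastream, used only to illustrate the overhead.
payload = b"\x00" * 3_000_000
encoded = base64.b64encode(payload)

# The formula agrees with what the standard library actually produces.
assert len(encoded) == base64_size(len(payload))
print(len(payload), "raw bytes ->", len(encoded), "base64 characters")
```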
I think the better option for you is to temporarily put a webserver in
front of the content wherever it's sitting, and refer to it that way
on the way in (via a URL like http://localhost/path/to/datastream).
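If you don't already have a webserver handy, a throwaway one is enough for the duration of the ingest. The following is only a sketch using Python's standard library, assuming Fedora runs on the same machine as the content; `serve_directory` and `url_for` are hypothetical helper names, not part of Fedora:

```python
import functools
import http.server
import threading
from urllib.parse import quote


def serve_directory(directory, port=0):
    """Serve `directory` over HTTP on localhost from a background thread.

    port=0 lets the OS pick a free port. The standard handler only
    answers GET and HEAD, so the content is exposed read-only.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def url_for(server, relative_path):
    """Build the URL to use as the datastream reference for one file."""
    host, port = server.server_address[:2]
    return f"http://{host}:{port}/{quote(relative_path)}"
```

You would call `serve_directory("/path/to/content")` before starting the ingest, use `url_for(...)` for each datastream reference, and call `server.shutdown()` once everything is in.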
There's an even better option under development, which is to have the
ability to use a file:/// URL to load the content in if it's local.
But we're not sure when that capability will be ready. To follow its
progress, see https://fedora-commons.org/jira/browse/FCREPO-453
The fastest possible option (which I personally haven't tried, but
maybe others have?) is to pre-stage the content and FOXML in the
$FEDORA_HOME/data/ directory, then run the rebuilder. See below for
more on that.
>> If the object is inside the repository already, managed content is
>> referenced within the FOXML using a special kind of reference (e.g.
>> "changeme:6+DS1+DS1.0"). This kind of reference is only used inside
>> the repository to get the content from the appropriate place (the low
>> level file storage) when appropriate.
> So, in the FOXML I should have a line that reads "<foxml:contentLocation
> Is this correct ?
That is the form that Fedora changes it to once it's ingested, but you
can't ingest with such a reference because Fedora doesn't yet know
which path on disk it is associated with.
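For ingest, the reference you author should point at a URL instead. Here is a rough sketch of generating that element, assuming the FOXML 1.1 namespace and the `TYPE`/`REF` attribute names (please verify both against an exported object); `content_location` is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET

# Assumed FOXML namespace URI; compare with the root element of an export.
FOXML_NS = "info:fedora/fedora-system:def/foxml#"
ET.register_namespace("foxml", FOXML_NS)


def content_location(url: str) -> str:
    """Return a contentLocation element that references content by URL."""
    element = ET.Element(
        f"{{{FOXML_NS}}}contentLocation",
        {"TYPE": "URL", "REF": url},
    )
    return ET.tostring(element, encoding="unicode")


print(content_location("http://localhost/path/to/datastream"))
```

The element goes inside the datastream version of a managed datastream; Fedora then rewrites the reference to its internal form after copying the content in.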
>> You typically don't use or
>> create these kinds of references.
> Why? In my naivety, I thought this might be useful, at least for my
> purposes: I have all datastreams already on the server, I create a batch XML
> and all the object XML, and then ingest them, with the datastreams already on
> the server.
Actually, it's probably possible (haven't tried) to author your
objects like this, and manually put the FOXML and datastreams in
low level storage, then run a rebuild instead of trying to send each
one through via ingest. As mentioned above, when you ingest through
the API with such references, Fedora doesn't know how to resolve them.
But running a rebuild would basically reconstruct the mapping of
those IDs to locations on disk in the Fedora tables.
This would certainly be the fastest approach to getting your content
bulk-loaded into the repository if it's already on the local disk.
But I can't guarantee it will work. Just make sure if you try it:
- Put the already-constructed FOXML somewhere under the "objects" directory,
and the managed content somewhere under the "datastreams" directory.
- Make sure the filenames of the above files are what Fedora expects;
for FOXML, the filename should be the PID with "_" (underscore) in
place of ":" (colon). For the managed datastreams, the filename
should be the same as the ID used in the contentLocation (e.g.
pid+dsId+dsVersionId)...also with the ":" in the PID part replaced
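The naming rules above could be scripted roughly as follows. This is an untested-against-Fedora sketch: the helper names are hypothetical, it assumes the ":" in the PID becomes "_" in both kinds of filename, and the destination directory is left to you since any location under "objects" or "datastreams" should do:

```python
import shutil
from pathlib import Path


def foxml_filename(pid: str) -> str:
    """FOXML filename: the PID with ':' replaced by '_'."""
    return pid.replace(":", "_")


def datastream_filename(pid: str, ds_id: str, ds_version_id: str) -> str:
    """Managed datastream filename: pid+dsId+dsVersionId, PID part escaped."""
    return f"{foxml_filename(pid)}+{ds_id}+{ds_version_id}"


def stage(src, dest_dir, filename) -> Path:
    """Copy one source file into dest_dir under the expected filename."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir / filename))


print(foxml_filename("changeme:6"))                       # changeme_6
print(datastream_filename("changeme:6", "DS1", "DS1.0"))  # changeme_6+DS1+DS1.0
```

Calling `stage(...)` once per FOXML file and once per datastream, with the matching directory, would pre-stage everything before running the rebuilder.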
After running fedora-rebuild and starting Fedora again, make sure you
can access the managed content via http (e.g.
If you try this option, I (and I'm sure others) would be interested in
seeing how it works out. Good luck :)