I developed a set of four related command-line apps for DSpace:
1) a lister / report generator
2) a policy tool that adds / removes policies
3) a metadata tool that adds / removes specific metadata values
4) a bitstream replacer tool
All except the fourth work on a set of DSpace objects specified by two command-line arguments:
--root ROOT - where ROOT is a handle or an object type followed by an ID
--type TYPE - where TYPE is one of collection, item, bundle, or bitstream
--root COLLECTION.10 --type BITSTREAM
means work on all bitstreams in the collection with ID 10
--root handle/12345 --type ITEM
means work on all items contained in the object designated by the handle
There is an additional argument, --doWorkFlowItems, that restricts sets to items in workflows and by extension to bundles or bitstreams in items in workflows.
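To make the two ROOT forms concrete, here is a minimal Python sketch of how such a value could be told apart. It is not taken from the tools themselves: the function name parse_root and the rule "anything containing a slash is a handle, everything else is TYPE.ID" are assumptions made for illustration.

```python
def parse_root(spec):
    """Split a --root value into (kind, value).

    Assumption (not from the tools): a value containing '/' is a handle,
    anything else must look like TYPE.ID with a numeric ID.
    """
    if "/" in spec:
        # Handles are kept as opaque strings.
        return ("HANDLE", spec)
    obj_type, sep, obj_id = spec.partition(".")
    if not sep or not obj_type or not obj_id.isdigit():
        raise ValueError(f"expected a handle or TYPE.ID, got {spec!r}")
    return (obj_type.upper(), int(obj_id))


print(parse_root("COLLECTION.10"))
print(parse_root("handle/12345"))
```

Checking for the slash first matters because real handle prefixes may themselves contain dots.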
The lister generates tsv- or txt-formatted output, printing properties of the selected set of DSpace objects. Its --include option determines which properties are printed.
You can choose to print IDs and handles, as well as policy information, or specify select item metadata fields. You can include an item's 'withdrawn' status or a bundle's embargo state.
Bitstream reports may print mimeType, checksum, ... When printing DSpace objects, you can also choose to print properties of enclosing DSpace objects. For example, when printing bitstreams in a collection, you can include bundle names, item handles, even
item metadata values by using options like these:
The lister works nicely with the other commands, since all four commands use the same mechanism to select the objects they work on. For example, you might use the lister to review which DSpace objects need policy or metadata changes.
After applying changes, it comes in handy for making sure the changes performed are in fact the ones that were intended.
The policy tool decides which action to apply to each DSpaceObject selected by the
--root and --type parameters based on three options:
--action [ADD | DEL ] - whether to add or delete policies
--dspace_action [READ | WRITE | REMOVE | ... ]
--who [group | eperson]
dspace bulk-pols -r handle/712657 -t BITSTREAM --action ADD --dspace_action WRITE --who EPERSON.monikam
gives the eperson monikam WRITE privileges on all bitstreams contained in the object behind the given handle, which may be a community, collection, or item.
dspace bulk-pols -r handle/712657 -t BITSTREAM -a DEL -d READ -w GROUP.Anonymous
removes the READ permission from the Anonymous group
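If you want to script several policy changes, the command line can be assembled programmatically. The following Python sketch only builds and prints the argument list; it does not run anything. The helper name bulk_pols_command is hypothetical, and the sketch uses only the long options named above, assuming the dspace launcher is on your PATH when you eventually run the result.

```python
import shlex


def bulk_pols_command(root, obj_type, action, dspace_action, who):
    """Build the argv list for one bulk-pols call (hypothetical helper)."""
    if action not in ("ADD", "DEL"):
        raise ValueError("action must be ADD or DEL")
    return [
        "dspace", "bulk-pols",
        "--root", root,
        "--type", obj_type,
        "--action", action,
        "--dspace_action", dspace_action,
        "--who", who,
    ]


cmd = bulk_pols_command("handle/712657", "BITSTREAM", "DEL", "READ", "GROUP.Anonymous")
# Print a shell-quoted form for review before running it by hand.
print(shlex.join(cmd))
```

Keeping the arguments as a list rather than one string avoids quoting problems if the list is later passed to subprocess.run.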
The metadata tool works similarly to the policy tool. Of course, it only makes sense to apply it to sets of items.
The bitstream replacer works on single bitstreams. It is related to the other tools in that it selects the bitstream to work on in the same fashion, that is, with the --root and --type arguments.
I developed these commands in connection with a project here at Princeton, where I needed to add a cover page to all bitstreams in original bundles in a community. The lister gave me the list of bitstreams. Printing the list in txt
format allowed me to grep for name=ORIGINAL. I included the mimeType in the listing, so I would only work on PDF documents. Including the internalId allowed me to use the file right from the assetstore and feed it into my 'add the cover page' script. I
replaced the old bitstream using the IDs printed earlier to define the --root parameter to the bitstream replacer. Finally, I used the lister to check the access policies of the bitstreams. Right now I run the lister command in a cron job to watch the
submission progress in one of our communities.
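The grep step can also be sketched in Python. This is an illustration only: it assumes the txt output is one object per line made of whitespace-separated key=value tokens, which is a guess based on grepping for name=ORIGINAL. The sample lines, the id field, and the internalId values are invented for the example.

```python
def parse_line(line):
    """Turn 'key=value key=value ...' into a dict (assumed layout)."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


def original_pdfs(lines):
    """Yield rows for PDF bitstreams that sit in an ORIGINAL bundle."""
    for line in lines:
        row = parse_line(line)
        if row.get("name") == "ORIGINAL" and row.get("mimeType") == "application/pdf":
            yield row


# Hypothetical sample data, not real lister output.
sample = [
    "id=101 name=ORIGINAL mimeType=application/pdf internalId=hypothetical-1",
    "id=102 name=LICENSE mimeType=text/plain internalId=hypothetical-2",
    "id=103 name=ORIGINAL mimeType=image/png internalId=hypothetical-3",
]

for row in original_pdfs(sample):
    print(row["id"], row["internalId"])
```

If the real output is laid out differently, only parse_line would need to change.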
I wrote more detailed documentation, which is part of the pull request that I created for this code. Here at Princeton we are still running 1.8. The bulk-do code mostly lives in its own package and should play well with version 3 (I
have not tried it). The PR is based on master. In other words, unless you run a pre-1.8 version, merging this into your version should be relatively painless, and it goes without saying that I'd help sort out conflicts.
The PR is HERE
and the documentation is THERE
I believe this code would be useful for many DSpace administrators. It would be straightforward to add a JSON/XML output format to offer this functionality in the REST API. So please have a look, send feedback, and possibly step
up as a volunteer tester / reviewer.