Hi Simon,

On 23 February 2010 17:47, Simon Brown <stb28@cam.ac.uk> wrote:
Sorry to stick my oar in here, but...

Oars are always welcome.
I don't think this is the case. I'm sure it was the intention, but
from what we've been able to determine, each DescribeStep for in-
progress items calls InProgressSubmission.update() which for both
workflow and non-workflow items calls Item.update(), which will fire a
MODIFY_METADATA event for that item. The BrowseConsumer will process
that event whether the item is installed or not. We determined this
after an increasing number of user complaints about the submission
process "slowing down" and added an isArchived() check to our
BrowseConsumer, which made the submissions process noticeably snappier.

Yes, that probably is happening at the BrowseConsumer level (the event mechanism / BrowseConsumer was added after this browse code was committed, and I'm not 100% sure of the circumstances of its use).

However, the BrowseConsumer calls indexItem(), which has the explicit check in it:

        if (item.isArchived() || item.isWithdrawn())
            indexItem(new ItemMetadataProxy(item));

            // Ensure that we remove any invalid entries

So, the indexing / pruneIndexing won't happen if the item is not in either 'archive' or 'withdrawn' state - and it shouldn't be in either whilst it is still in the workspace / workflow. Whilst it passes through the browse indexer, it shouldn't be doing anything that is expensive (or gets more expensive with repository size), before installItem() is called.

AFAIK, the BrowseConsumer shouldn't have just an isArchived() check, as that would prevent indexes being updated correctly when an item is withdrawn. But it could replicate the if (isArchived() || isWithdrawn()) check, and doing it in the BrowseConsumer would avoid some overhead that is incurred when IndexBrowse is created.
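
To make that last suggestion concrete, here is a rough sketch of what such a guard could look like. It is not the actual DSpace code: ItemState, shouldIndex() and consume() are hypothetical stand-ins I've made up for illustration, and the only things taken from the real code are the isArchived() / isWithdrawn() checks. The idea is simply to return early for workspace / workflow items, before anything like IndexBrowse is constructed.

```java
// Sketch only: ItemState, shouldIndex() and consume() are hypothetical
// stand-ins, not DSpace classes or methods.
final class BrowseGuardSketch {

    /** Minimal stand-in for the two item flags discussed above. */
    static final class ItemState {
        private final boolean archived;
        private final boolean withdrawn;

        ItemState(boolean archived, boolean withdrawn) {
            this.archived = archived;
            this.withdrawn = withdrawn;
        }

        boolean isArchived() { return archived; }
        boolean isWithdrawn() { return withdrawn; }
    }

    /** Counts how often the (assumed expensive) indexer setup is reached. */
    static int indexerCreations = 0;

    /**
     * Mirrors the indexItem() condition: only archived or withdrawn items
     * need their browse entries touched.
     */
    static boolean shouldIndex(ItemState item) {
        return item.isArchived() || item.isWithdrawn();
    }

    /** Hypothetical consumer hook: bail out early for in-progress items. */
    static void consume(ItemState item) {
        if (!shouldIndex(item)) {
            return; // workspace / workflow item: skip indexer setup entirely
        }
        indexerCreations++; // stands in for creating the indexer and indexing
    }

    public static void main(String[] args) {
        consume(new ItemState(false, false)); // in-progress submission
        consume(new ItemState(true, false));  // archived item
        consume(new ItemState(false, true));  // withdrawn item
        System.out.println("indexer created " + indexerCreations + " times");
    }
}
```

With a plain isArchived() check the withdrawn case in main() would be skipped as well, which is the index-update problem mentioned above.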