The GBIF Portal already includes a multi-threaded
component for indexing the contents of DiGIR-Darwin
Core and BioCASe-ABCD data providers. This component
needs to be automated, so that it performs the
following tasks:
1. Detects which providers are due for (re-)indexing,
based on a regular schedule
2. Selects an appropriate indexing mode (full index
of all records, incremental index of records modified
since the previous index, or continuation of a previous
failed index attempt)
3. Manages the number of concurrent indexer threads
4. Schedules indexing of providers at times of low
activity in provider countries (night-time, weekends)
5. Allows an administrator to override the schedule and
request immediate indexing of a provider
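The decision logic behind tasks 1, 2, 4 and 5 can be sketched as
follows. This is an illustrative outline only, written in Python
rather than the portal's own code. Every name in it (ProviderState,
REINDEX_INTERVAL, the night-time hours, the per-provider UTC offset)
is an assumption made for the example and is not taken from the
existing indexer or the design document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class IndexMode(Enum):
    FULL = "full"                # index all records
    INCREMENTAL = "incremental"  # only records modified since the previous index
    RESUME = "resume"            # continue a previous failed attempt


@dataclass
class ProviderState:
    # Hypothetical bookkeeping record, one per data provider.
    name: str
    utc_offset_hours: int                    # assumed: provider country's offset from UTC
    last_success: Optional[datetime] = None  # end of last successful index (UTC)
    last_attempt_failed: bool = False
    force_now: bool = False                  # set by the administrator override


REINDEX_INTERVAL = timedelta(days=7)  # assumed regular schedule
NIGHT_START, NIGHT_END = 22, 6        # assumed local night-time hours


def is_due(p: ProviderState, now: datetime) -> bool:
    """Task 1: a provider is due if forced, failed, never indexed, or stale."""
    if p.force_now or p.last_attempt_failed or p.last_success is None:
        return True
    return now - p.last_success >= REINDEX_INTERVAL


def select_mode(p: ProviderState) -> IndexMode:
    """Task 2: pick the cheapest mode that still yields a complete index."""
    if p.last_attempt_failed:
        return IndexMode.RESUME
    if p.last_success is None:
        return IndexMode.FULL
    return IndexMode.INCREMENTAL


def in_quiet_window(p: ProviderState, now: datetime) -> bool:
    """Task 4: night-time or weekend in the provider's local time."""
    local = now + timedelta(hours=p.utc_offset_hours)
    is_weekend = local.weekday() >= 5
    is_night = local.hour >= NIGHT_START or local.hour < NIGHT_END
    return is_weekend or is_night


def should_start(p: ProviderState, now: datetime) -> bool:
    """Task 5: the override skips the quiet-window check."""
    return is_due(p, now) and (p.force_now or in_quiet_window(p, now))
```

A scheduler loop would call should_start for each provider at a
fixed interval and pass select_mode's result to the indexer. A fixed
UTC offset is a simplification; a real deployment would more likely
store a time zone per provider.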
The solution should include the following components:
1. Automated version of the specimen/observation
indexer component (capable of running within the main
Tomcat portal instance or on a separate server)
2. Administrator console to override scheduling
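One possible way for the two components to interact is a shared
priority queue: the automated indexer enqueues scheduled work, the
administrator console enqueues override requests at a higher
priority, and a bounded pool of worker threads drains the queue. The
sketch below is a hedged illustration in Python. IndexQueue, the
priority constants and index_fn are hypothetical names, not part of
the existing indexer.

```python
import itertools
import queue
from concurrent.futures import ThreadPoolExecutor

ADMIN, SCHEDULED = 0, 1  # lower value is served first


class IndexQueue:
    """Hypothetical queue shared by the scheduler and the admin console."""

    def __init__(self):
        self._q = queue.PriorityQueue()
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority

    def submit(self, provider, priority=SCHEDULED):
        self._q.put((priority, next(self._seq), provider))

    def drain(self, index_fn, max_threads=4):
        """Run index_fn on every queued provider, at most max_threads at once."""

        def worker():
            while True:
                try:
                    _, _, provider = self._q.get_nowait()
                except queue.Empty:
                    return  # nothing left to index
                index_fn(provider)

        # The pool size is the cap on concurrent indexer threads.
        with ThreadPoolExecutor(max_workers=max_threads) as pool:
            futures = [pool.submit(worker) for _ in range(max_threads)]
        for f in futures:
            f.result()  # re-raise any error from an indexing run
```

With this arrangement an override does not interrupt runs that are
already in progress; it only moves the requested provider to the
front of the waiting list. Whether that is acceptable, or whether a
dedicated thread should be reserved for overrides, is a design
decision left open here.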
See:
http://circa.gbif.net/Members/irc/gbif/dadidev/library?l=/work_items/item00002automateindexer/
Use the following location for the design document:
http://circa.gbif.net/Public/irc/gbif/dadi/library?l=/software_design/designindexing_doc/