Often there is a need to produce time-series based on the different collections. To do this we need a single date for the datasource. I propose
add 'date_collection_started', 'date_collection_finished' (and 'collection_midpoint' for convenience?) as DATETIME columns
Then something like this (pseudo perl):
foreach $currSource (@datasources) {
# find the earliest date_collected across all the tables
# is it always in a particular place (what do you get first?)
# maybe the entry in the datasource table is always the first one?
$earliest
# find latest somehow (MAX(date_collected) WHERE datasource_id= $currSource across all tables?)
$latest
UPDATE datasources SET 'date_collection_started' TO $earliest AND
'date_collection_ended' TO $latest WHERE datasource_id = $currSource
}
Then we can select datasources in order to produce time-series.
Anonymous