
#33 fresco: uniqueness of reading

open
5
2007-01-04
2007-01-04
No

there is a problem ingesting files which may describe the same tiles. with normal level2 data, one reads ozone and cloud data together with the tileinfo. it's not clear what we're actually doing with fresco. in general, we expect the tileinfo to be already present in the database; generating the tileinfo when it is missing is treated as a fallback for an unexpected situation.

this raises a whole list of questions about what to do in case we ingest a new ozone/cloud level2 file after corresponding fresco(2P) data has been ingested.

what is a tileinfo record? what does it really describe? as far as I understood, it describes an entity existing 'an sich' rather than its description in each of the files ingested.

if it is the former, we should create it once and update it (or discard the older data) each time we meet its description in a file being ingested.

I'm not sure this is what happens when reading ozone/cloud files and I'm sure this is not what happens when ingesting fresco files.

also I'm not sure how we should decide when the data being read should be considered older or newer than the one already in the database. possibly we must match each tileinfo being described with the highest softVersion (or receiveDate? or procStage?) of the files describing it previously and update it only if the one being ingested is newer...

Discussion

  • Mario Frasca

    Mario Frasca - 2007-01-04

    Logged In: YES
    user_id=512199
    Originator: YES

    about matching a 2P file (fresco or otherwise) to its corresponding 1P.

    when ingesting a fresco file, we look for the corresponding standard level 1 file.
    shouldn't we be doing the same for a standard level 2 file?

    when ingesting a tile which is already in the tileinfo table, which file must be looked for, in order to retrieve the "version of the tile"? and how do we make this information available quickly and at low space cost?

     
  • Mario Frasca

    Mario Frasca - 2007-01-04

    Logged In: YES
    user_id=512199
    Originator: YES

    "this raises a whole list of questions *about* what to do in case we ingest a new ozone/cloud level2 file after corresponding fresco(2P) data has been ingested."
    should read
    "this raises a whole list of questions *like, for example*, what to do in case we ingest a new ozone/cloud level2 file after corresponding fresco(2P) data has been ingested."

     
  • Mario Frasca

    Mario Frasca - 2007-01-05

    Logged In: YES
    user_id=512199
    Originator: YES

    tileinfo records are generated on fresco 2P or standard level 2 data.
    with fresco 2P we reject a file if no 1P file is found in the database.
    if we did the same for standard level 2 data, we would be able to guarantee that a tileinfo record is associated with a standard level 1 file.

    does this help at all, in deciding if a tile should be overwritten or not?
    or do we need to keep some version information in the tileinfo record itself?
    or is it possible to assume that the 'softVersion' field in the corresponding stateinfo record holds this information?

     
  • Mario Frasca

    Mario Frasca - 2007-01-08

    Logged In: YES
    user_id=512199
    Originator: YES

    this comment is written with the scia database in mind. mutatis mutandis, it also holds for gome.

    the four tables tileinfo, ozone__2P, cld__2P, fresco__2P together describe an object which is built using data from various files. each time such a file is ingested, the data it contains is checked against what is already present in the database and possibly written to the database. each of the four distinct parts is written atomically (in the sense that we reject, insert or update a whole record for each of the four tables), but in general you might want to take distinct actions for the distinct parts of the object.

    in each of the four 'atoms' we will keep a distinct softVersion field.

    in ozone__2P and cld__2P it is the version of the 2P file containing the info.

    in tileinfo it is taken from the corresponding stateinfo record and it is the highest softVersion among the level 1 files describing the stateinfo.

    in fresco__2P it is the version of the fresco data.

    when ingesting a file, ozone__2P, cld__2P and fresco__2P info get
    - inserted if there is no corresponding record,
    - discarded if the softVersion in the corresponding record is higher or equal,
    - updated if the softVersion in the corresponding record is lower.

    for tileinfo, the same logic is followed, but the softVersion is taken from stateinfo and it is compared either to the softVersion of the level 2 file being ingested or to the softVersion of the level 1 file corresponding to the fresco file being ingested.
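
A minimal sketch of the insert/discard/update rule listed above, in Python. The names `Action` and `decide` are hypothetical and not taken from the ingestion code, and the softVersion is assumed to be already parsed into a comparable tuple of integers.

```python
from enum import Enum
from typing import Optional, Tuple


class Action(Enum):
    INSERT = "insert"
    DISCARD = "discard"
    UPDATE = "update"


def decide(stored: Optional[Tuple[int, ...]], incoming: Tuple[int, ...]) -> Action:
    """Choose what to do with one record of one of the four tables.

    stored   -- softVersion of the record already in the database,
                or None if there is no corresponding record
    incoming -- softVersion of the data being ingested
    """
    if stored is None:
        return Action.INSERT   # no corresponding record
    if stored >= incoming:
        return Action.DISCARD  # stored version is higher or equal
    return Action.UPDATE       # stored version is lower
```

The same function would be called once per table, so that one file can lead to, say, an update in ozone__2P and a discard in tileinfo.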

     
  • Mario Frasca

    Mario Frasca - 2007-01-08

    Logged In: YES
    user_id=512199
    Originator: YES

    short addition to previous comment:

    the softVersion of level 2 data cannot be compared with the softVersion of level 1 data: they belong to two different pieces of software.

    we should do the same with standard level 2 data as what we do with fresco__2P data, that is: permit ingestion of a file only if the matching level1 file can be found. this way we will decide about updating tileinfo by comparing level 1 softVersion (stored in stateinfo) with level 1 softVersion (from the meta__1P record corresponding to the file being ingested).

     
  • Mario Frasca

    Mario Frasca - 2007-01-10

    Logged In: YES
    user_id=512199
    Originator: YES

    the work to be done on this seems to be quite a bit more than I can possibly complete in a week.

    what I can do in this time is write a technical description of what I have done and what remains to be done. I don't think it is a good idea to let me start implementing new things before I'm sure I have described all that has to be done.

    I've added 'softVersion' fields in the *__2P tables attached to tileinfo.
    these will hold the version of the software computing the readings in the record.
    notice that these versions are NOT comparable with each other: ozone and cld are standard products while fresco is not, so they are produced by different software.

    the already existing 'softVersion' field in stateinfo currently holds the version of a level 1 or 2 product describing the stateinfo record. in order to make a comparison possible, this field is redefined to hold the highest software version of the standard *LEVEL 1* product describing the state. the ingestion software for standard products needs only a slight modification.

    (actually, when reading a level 1 or 2 standard file, also orbitPhase and the geographic information about the stateinfo get updated in the database, based on the value of non-comparable software versions. the test on version is possibly ok, as long as we stick to level 1 files)

    a file at level 1 must store its softVersion in all corresponding stateinfo records, if meta__1P.softVersion > stateinfo.softVersion. also, you must be careful about the separator between the name and the version of the software (it should be '/') and possibly about things like 5.04, 5.1, 5.10...
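
A possible way to handle the separator and the 5.04 / 5.1 / 5.10 question, sketched in Python. Everything here is an assumption: the function names are hypothetical, the field is assumed to look like `name/version`, and the version is assumed to be compared component by component as integers (so 5.04 is read as 5, 4). Whether that is the right reading of the real version strings still has to be checked.

```python
from typing import Tuple


def parse_soft_version(field: str) -> Tuple[str, Tuple[int, ...]]:
    """Split 'name/version' and turn the version into a tuple of integers."""
    name, sep, version = field.partition("/")
    if not sep:
        raise ValueError("missing '/' separator in softVersion: %r" % field)
    parts = tuple(int(part) for part in version.strip().split("."))
    return name.strip(), parts


def is_newer(meta_field: str, stateinfo_field: str) -> bool:
    """True if meta__1P.softVersion > stateinfo.softVersion.

    Versions of different software are not comparable, so a name
    mismatch is reported as an error instead of being compared.
    """
    meta_name, meta_version = parse_soft_version(meta_field)
    state_name, state_version = parse_soft_version(stateinfo_field)
    if meta_name != state_name:
        raise ValueError("cannot compare versions of different software")
    return meta_version > state_version
```

Under this assumption 5.1 < 5.04 < 5.10, whereas a comparison as floating point numbers would give 5.04 < 5.1 = 5.10. The two readings disagree, which is why the format has to be settled before relying on the comparison.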

    a file at level 2 must be rejected if no corresponding file at level 1 can be found (this is already implemented for fresco; it must still be done for the standard products). the rejection must be reported with a log message at warning level.
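
The rejection could look roughly like this in Python. This is only a sketch: `accept_level2`, the logger name and the dictionary standing in for the database lookup are all hypothetical, and the real code would query the level 1 metadata table instead.

```python
import logging
from typing import Dict

logger = logging.getLogger("ingest")


def accept_level2(level2_name: str, level1_by_level2: Dict[str, str]) -> bool:
    """Return True if the level 2 file may be ingested.

    level1_by_level2 stands in for the database lookup that maps a
    level 2 file to its corresponding level 1 file.
    """
    level1_name = level1_by_level2.get(level2_name)
    if level1_name is None:
        # no level 1 file in the database: reject and log a warning
        logger.warning(
            "rejecting %s: no corresponding level 1 file found", level2_name
        )
        return False
    return True
```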

     