Currently, when generating checksum fields for items with certain checksum types: 0, Link, or Remove, the checksum value is a random integer, to avoid all the load going to a single node when load splitting is in effect (such as post_exchange_split). It would be better if the load splitting were predictable as well as distributed, so that files with the same name would go to the same node. First idea: a good way of accomplishing that would be to change those cases to use the 'n' algorithm... i.e. take an MD5 checksum of the file name as the value.
That way the computation is reproducible. For 'R' (Remove), that should work well. For Link, it would be even better to base the checksum on the link content, so that the checksum changes whenever the link does.
Even the 'n' algorithm, as is, has a problem: if a file is partitioned, all of the parts will go to the same node (as they have the same checksum). It is probably important to include the partition information in the checksum calculation. Doing so would also make the value suitable for use by sr_winnow.
So, perhaps ideally, we create an 'N' algorithm, which concatenates the name and the 'parts' header, and uses the checksum of that string.
Implemented on the C side.
Used SHA512, which is way over the top, but it avoids discussions...
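A Python sketch of the 'N' idea, pending the real Python support (the function name, the separator-free concatenation, and the example 'parts' header strings are assumptions; only the name+parts+SHA512 recipe comes from the notes above):

```python
import hashlib


def n_checksum(name: str, parts: str) -> str:
    # 'N' algorithm sketch: hash the concatenation of the file name
    # and the 'parts' header, so different partitions of the same
    # file get different checksums (and hence can land on different
    # nodes). SHA512 matches the C-side choice.
    return hashlib.sha512((name + parts).encode('utf-8')).hexdigest()


# Two blocks of the same partitioned file now hash differently,
# while recomputing for the same block stays reproducible:
block0 = n_checksum("bigfile.dat", "p,1048576,4,0,0")
block1 = n_checksum("bigfile.dat", "p,1048576,4,0,1")
```

This also gives sr_winnow a stable key per (file, partition) pair rather than a random integer.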
Python support is now issue 27 on github.com/MetPX/Sarracenia.