But in fact, s3cmd wasn't issuing the Content-MD5 header, even when we knew exactly what to put into it. So S3 couldn't explicitly tell us when an object was corrupted on upload (which should be a rarity). Who needs end-to-end checking anyhow?
So, a few patches later, and we can. I'd appreciate some more eyes on this patch series before pulling it into master, but it feels about right:

https://github.com/mdomsch/s3cmd/commits/feature/content-md5
Matt Domsch (5):
add Content-MD5 header to PUT objects (not multipart)
add Content-MD5 header for each multipart chunk
Don't double-calculate MD5s on multipart chunks
add Content-MD5 on put, not just sync
handle errors during multipart uploads
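For reference, the Content-MD5 header value is defined (RFC 1864) as the base64-encoded 128-bit MD5 digest of the request body; the hex digest s3cmd already computes for ETag comparison is not the right encoding. A minimal sketch of the computation (this is illustrative, not the exact code from the patch series):

```python
import base64
import hashlib

def content_md5(data: bytes) -> str:
    """Return the Content-MD5 header value for a request body:
    the base64-encoded (not hex) MD5 digest, per RFC 1864."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

# When this header is present, S3 recomputes the MD5 of the body it
# received and rejects the upload with 400 BadDigest on a mismatch,
# giving us end-to-end corruption detection on the upload path.
```

For multipart uploads the same header is sent per chunk, computed over that chunk's bytes only.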
If I got this right, when S3 returns a 400 BadDigest, we retry, just as we would for any other retryable error.
I'll also note that we aren't explicitly capturing a 503 SlowDown anywhere else (multipart now does, in this series) to inform future operations; it's only caught and used during retries of this one operation on this one object, not thereafter. Maybe that's OK. But I'm tempted to catch SlowDown at the send_request() level rather than higher up, and retry there (with exponential backoff rather than the linear backoff currently used). Otherwise we have to scatter retry and backoff logic all over the place.
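The centralized-retry idea could look something like the sketch below. The names `retry_request` and `RetryableError` are hypothetical stand-ins, not s3cmd's actual API; the point is that one wrapper at the send_request() level, with exponential backoff, replaces per-operation linear-backoff loops:

```python
import time

class RetryableError(Exception):
    """Stand-in for s3cmd's retryable S3 responses
    (e.g. 503 SlowDown, 400 BadDigest)."""

def retry_request(send, max_retries=5, base_delay=1.0):
    """Call send(); on a retryable error, sleep base_delay * 2**attempt
    (exponential backoff) and try again, up to max_retries attempts."""
    for attempt in range(max_retries):
        try:
            return send()
        except RetryableError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)
```

Every code path that issues a request would then go through this one choke point, so backoff policy lives in exactly one place.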
I'm wondering whether this is the source of the various "failed retry" errors that are routinely posted to the bug list.