I think there is still a bug here.  When doing a bucket delete with 10,000 files in the bucket, it deletes some of the files, but not all of them.  If you run s3cmd del --recursive repeatedly, it eventually gets them all.  Needs more debugging...


On Thu, Apr 10, 2014 at 10:29 PM, Matt Domsch <matt@domsch.com> wrote:
I've worked with Mike offline on this some today, and believe I have a fix now on the github.com/s3tools/s3cmd master branch.

The failure here was a timeout on a batch delete call to S3 that included roughly 40,000 files in a single batch delete request, rather than the maximum of 1000 per request that the API allows.  To resolve it, I added a getslice operator to the SortedDict objects (our list of files) and now iterate over that list in batches of 1000, calling the batch delete API once per batch.  That also cleans up the handling of last keys and markers in the key list.
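
For anyone following along, here is a rough sketch of the batching idea. It is not the actual s3cmd change: the names batches, delete_all and delete_batch are hypothetical, and a plain sorted list of keys stands in for the SortedDict. It only illustrates how the key list is split into requests of at most 1000 keys each.

```python
from itertools import islice

# S3 Multi-Object Delete accepts at most 1000 keys per request.
BATCH_SIZE = 1000


def batches(keys, size=BATCH_SIZE):
    """Yield successive lists of at most `size` keys, in sorted order."""
    it = iter(sorted(keys))
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def delete_all(remote_files, delete_batch):
    """Call `delete_batch` once per chunk and return the number of calls.

    `delete_batch` is a hypothetical callable standing in for whatever
    issues the real batch delete request; it receives a list of keys.
    """
    requests = 0
    for chunk in batches(remote_files):
        delete_batch(chunk)
        requests += 1
    return requests
```

With the 49694 remote files from Mike's log, this would issue 50 requests: 49 full batches of 1000 and a final batch of 694.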

This wasn't a big change, but it does affect all remote delete operations that involve multiple files, so I'd appreciate some extra test coverage.

Thanks,
Matt


On Thu, Apr 10, 2014 at 4:48 PM, WagnerOne <wagner@wagnerone.com> wrote:
Running this same sync in debug, I see additional detail following the "INFO: Summary: ..." line.

I'm not sure what I should anonymize in that output, so I'd prefer to share it with a dev only. I can produce that on request.

Mike

On Apr 10, 2014, at 3:21 PM, WagnerOne <wagner@wagnerone.com> wrote:

> Hi,
>
> Encountered what appears to be a bug today.
>
> I am syncing a local directory and an s3 prefix that I have not been in control of (unlike the many other s3cmd syncs I have done successfully).
>
> When trying to sync existing local directories with prefixes in this bucket, I am encountering 2 things I've not seen before using s3cmd.
>
> One is during sync preparation "WARNING: Empty object name on S3 found, ignoring." and is obviously intended and seemingly innocuous.
>
> The other is "ERROR: timed out" when s3cmd gets to the point of beginning the deletes and/or transfers as needed. I've not encountered a timeout before.
>
> If I delete the prefix entirely on the bucket side and issue the same s3cmd command again, it syncs fine.
>
> Makes me wonder if the deletion step is choking on the "empty object" that was found (which, as I understand from research, comes from objects deleted manually via the AWS web GUI).
>
>
> INFO: Retrieving list of remote files for s3://mybucket/myprefix/ ...
> WARNING: Empty object name on S3 found, ignoring.
> INFO: Found 788889 local files, 75128 remote files
> INFO: Verifying attributes...
> INFO: Summary: 763456 local files to upload, 0 files to remote copy, 49694 remote files to delete
> ERROR: timed out
>
>
> I'm running my s3cmd via a git clone of the master branch from github (fresh pull today).
>
> Mike
>
> --
> wagner@wagnerone.com
> "Good music is good no matter what kind of music it is."-Miles Davis
>
>
>
> ------------------------------------------------------------------------------
> Put Bad Developers to Shame
> Dominate Development with Jenkins Continuous Integration
> Continuously Automate Build, Test & Deployment
> Start a new project now. Try Jenkins in the cloud.
> http://p.sf.net/sfu/13600_Cloudbees
> _______________________________________________
> S3tools-general mailing list
> S3tools-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/s3tools-general

--
wagner@wagnerone.com
"Every generation laughs at the old fashions, but follows religiously the new."-Thoreau


