#120 s3cmd put/get with utf-8 characters in uri

Milestone: Malfunction
Status: closed-fixed
Owner: nobody
Labels: s3cmd (118)
Priority: 5
Updated: 2014-06-17
Created: 2012-07-23
Creator: Anonymous
Private: No

Trying to "put" a file onto an S3 bucket crashes s3cmd when the file's path (here, a sub-directory name) contains non-ASCII characters.

# s3cmd put --encoding UTF-8 --no-progress --rr userLogs/Xxxxxxxxx/xxxaña/JXK4W4EWST7AMVAST2UUXNX85ATHQ6/app_2011_05_20.log s3://XXXXXXXX/userLogs/Xxxxxxxxx/xxxaña/JXK4W4EWST7AMVAST2UUXNX85ATHQ6/app_2011_05_20.log

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
An unexpected error has occurred.
Please report the following lines to:
s3tools-bugs@lists.sourceforge.net
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Problem: UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 45: ordinal not in range(128)
S3cmd: 1.1.0-beta2

Traceback (most recent call last):
  File "/usr/bin/s3cmd", line 1805, in <module>
    main()
  File "/usr/bin/s3cmd", line 1746, in main
    cmd_func(args)
  File "/usr/bin/s3cmd", line 255, in cmd_object_put
    destination_base = str(destination_base_uri)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 45: ordinal not in range(128)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
An unexpected error has occurred.
Please report the above lines to:
s3tools-bugs@lists.sourceforge.net
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
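
For readers unfamiliar with this error: on Python 2, calling str() on a unicode object implicitly encodes it with the default ASCII codec, which cannot represent "ñ" (u'\xf1'). The sketch below triggers the same class of error outside s3cmd by encoding to ASCII explicitly, so it also runs on Python 3. The URI and the helper name try_ascii are made up for illustration and are not part of s3cmd.

```python
# Hypothetical stand-in for the S3 URI in the report (not the real path).
uri_text = u"s3://bucket/userLogs/xxxa\xf1a/app.log"


def try_ascii(text):
    """Encode text as ASCII; return None on success or the error message."""
    try:
        text.encode("ascii")
        return None
    except UnicodeEncodeError as exc:
        # Same exception type that s3cmd reported in the traceback.
        return str(exc)


message = try_ascii(uri_text)
print(message)
```

A pure-ASCII path passes through try_ascii without error, which is why the crash only shows up for names such as "xxxaña".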

For now I have "patched" my s3cmd as follows to get it working.
In s3cmd, in cmd_object_put:
- destination_base = str(destination_base_uri)
+ destination_base = str(destination_base_uri.uri().encode(Config().encoding))

I had a similar error when trying to "get" the file, so here is the patch for that one.

in /usr/lib/python2.6/site-packages/S3/FileLists.py
compare_filelists
- uri_str = str(uri)
+ uri_str = str(uri.uri().encode(cfg.encoding))
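
Both patches above follow the same pattern: encode the URI text explicitly with a configured encoding instead of letting str() fall back to ASCII. A minimal sketch of that pattern, assuming UTF-8 as the configured encoding; uri_to_bytes and the sample URI are hypothetical and only mimic what Config().encoding and uri() provide in s3cmd.

```python
def uri_to_bytes(uri_text, encoding="utf-8"):
    """Encode a text URI explicitly rather than relying on the default codec.

    `encoding` plays the role of the configured encoding in the patch;
    UTF-8 is an assumption made for this example.
    """
    return uri_text.encode(encoding)


# Hypothetical URI containing the same character (u'\xf1') as the report.
sample = u"s3://bucket/userLogs/xxxa\xf1a/app.log"
encoded = uri_to_bytes(sample)

# Decoding with the same encoding recovers the original text.
assert encoded.decode("utf-8") == sample
```

The result is a byte string, which matches what str() yields on Python 2; on Python 3 the same call returns bytes, so this is a sketch of the idea rather than a drop-in fix.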

Discussion


  • Anonymous
    2012-07-23

    Here is my unified diff for version 1.1.0-beta2

    --- s3cmd 2012-07-23 14:19:58.726618474 +0000
    +++ s3cmd 2012-07-23 14:18:05.000000000 +0000
    @@ -252,7 +252,7 @@
         destination_base_uri = S3Uri(args.pop())
         if destination_base_uri.type != 's3':
             raise ParameterError("Destination must be S3Uri. Got: %s" % destination_base_uri)
    -    destination_base = str(destination_base_uri)
    +    destination_base = str(destination_base_uri.uri().encode(Config().encoding))

         if len(args) == 0:
             raise ParameterError("Nothing to upload. Expecting a local file or directory.")

    --- S3/FileLists.py 2012-07-23 14:22:03.594622900 +0000
    +++ S3/FileLists.py 2012-07-23 14:14:20.000000000 +0000
    @@ -229,7 +229,7 @@
                     remote_list[key] = objectlist[key]
         else:
             for uri in remote_uris:
    -            uri_str = str(uri)
    +            uri_str = str(uri.uri().encode(cfg.encoding))
                 ## Wildcards used in remote URI?
                 ## If yes we'll need a bucket listing...
                 if uri_str.find('*') > -1 or uri_str.find('?') > -1:

     
  • Matt Domsch
    2014-06-17

    Please try the master branch of upstream github.com/s3tools/s3cmd. Many Unicode bugs like this have been fixed in recent months.

    commit ea5451d8e42e79ee5bffb81c6e8cdd9bff4e5f99
    Author: Matt Domsch <matt@domsch.com>
    Date:   Wed Mar 26 12:10:53 2014 -0500
    
        unicode fixes for put <unicodename> s3://<bucket>/<unicodename>
    

    Thanks,
    Matt

     
  • Matt Domsch
    2014-06-17

    • status: open --> closed-fixed