#20 Support for unicode errors

MySQLdb (53)
Ken Kinder

MySQLdb takes a keyword argument in connect() that selects
the encoding, such as utf8, used for unicode strings. If,
however, MySQL returns data which cannot be decoded with
that encoding, MySQLdb raises errors like this one:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  line 95, in execute
    return self._execute(query, args)
  line 114, in _execute
    self.errorhandler(self, exc, value)
  line 33, in defaulterrorhandler
    raise errorclass, errorvalue
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: unexpected code byte

Python's unicode constructor and the encode() method
take an extra argument called "errors", which specifies
how such encoding errors are handled, as documented
here:

unicode(string [, encoding[, errors]]) -> object

Create a new Unicode object from the given encoded string.
encoding defaults to the current default string encoding.
errors can be 'strict', 'replace' or 'ignore' and
defaults to 'strict'.
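
As a rough illustration of the three modes, here is a minimal sketch. It uses the decode() method on a byte string, which accepts the same encoding and errors arguments as the unicode constructor quoted above. The sample bytes are made up for the example; they only mimic the 0x80 byte at position 2 from the traceback.

```python
# Made-up sample data: 0x80 at position 2 is not valid UTF-8 here.
raw = b"ab\x80cd"

def decode_with(errors):
    """Decode the sample bytes as UTF-8 with the given error handler."""
    return raw.decode("utf8", errors)

# 'strict' (the default) raises UnicodeDecodeError on the bad byte.
try:
    decode_with("strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD (the replacement character) for the bad byte.
replaced = decode_with("replace")

# 'ignore' silently drops the bad byte.
ignored = decode_with("ignore")
```

Under Python 2, unicode(raw, "utf8", "replace") should behave the same way as the decode() call shown here.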

Because MySQLdb does not pass errors, it falls back to
the default of 'strict', which may not be what the
developer needs. The attached patch lets the developer
pass MySQLdb.connect an additional keyword argument,
unicode_errors, which is passed on to Python's unicode
constructor, allowing the developer to specify how
unicode encoding problems should be handled.
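
The idea can be sketched roughly as follows. This is not the attached patch and not MySQLdb's actual code: FakeConnection, make_string_decoder and the encoding keyword are hypothetical names used only for illustration. Only unicode_errors is taken from the description above.

```python
def make_string_decoder(encoding, errors="strict"):
    """Return a converter that decodes byte strings from the server.

    Hypothetical helper for illustration; not part of MySQLdb.
    """
    def decode(value):
        return value.decode(encoding, errors)
    return decode


class FakeConnection(object):
    """Stand-in for a connection; models only the keyword handling."""

    def __init__(self, encoding=None, unicode_errors="strict"):
        # 'encoding' is an assumed name for the existing keyword;
        # 'unicode_errors' is the keyword proposed in this ticket.
        self.string_decoder = None
        if encoding:
            self.string_decoder = make_string_decoder(encoding, unicode_errors)

    def convert(self, value):
        """Decode a value fetched from the server, if decoding is enabled."""
        if self.string_decoder is None:
            return value
        return self.string_decoder(value)


# With 'replace', the undecodable byte no longer aborts the query.
conn = FakeConnection(encoding="utf8", unicode_errors="replace")
result = conn.convert(b"ab\x80cd")
```

Defaulting unicode_errors to 'strict' in such a design would keep the existing behaviour for callers who do not pass the new argument.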


  • Ken Kinder

    Ken Kinder - 2004-09-23

    Patch to allow developers to set unicode encoding type

  • Andy Dustman

    Andy Dustman - 2004-09-23

    I'll probably put this in both 1.1.6 and 1.0.1; it seems
    quite reasonable.

  • Andy Dustman

    Andy Dustman - 2004-10-31

    Your patch, or a variation, has been applied to the current CVS tree.

