#20 Support for unicode errors

Milestone: MySQLdb

Status: closed

Owner: Andy Dustman

Labels: MySQLdb (53)

Priority: 5

Updated: 2012-09-19

Created: 2004-09-23

Creator: Ken Kinder

Private: No

MySQLdb takes a keyword argument in connect() to convert
strings to and from unicode using a given encoding, such as
utf8. If, however, MySQL returns data that cannot be
decoded with the specified encoding, MySQLdb raises
an error like this one:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/site-packages/MySQLdb/cursors.py", line 95, in execute
    return self._execute(query, args)
  File "/usr/lib/python2.3/site-packages/MySQLdb/cursors.py", line 114, in _execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.3/site-packages/MySQLdb/connections.py", line 33, in defaulterrorhandler
    raise errorclass, errorvalue
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: unexpected code byte

To handle such errors, Python's unicode constructor and
the encode() and decode() methods take an optional argument
called "errors", which specifies how encoding and decoding
errors are handled, as documented here:

unicode(string [, encoding[, errors]]) -> object

Create a new Unicode object from the given encoded string.
encoding defaults to the current default string encoding.
errors can be 'strict', 'replace' or 'ignore' and defaults to 'strict'.
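
As a rough illustration of the three error modes quoted above, here is a small sketch. It assumes Python 3, where bytes.decode(encoding, errors) plays the role that unicode(string, encoding, errors) plays in Python 2; the sample bytes and the decode_with helper are made up for the example and are not part of MySQLdb.

```python
# Python 3 sketch: bytes.decode() stands in for Python 2's unicode().
# The sample data mimics the traceback: byte 0x80 at position 2.
raw = b"ab\x80cd"


def decode_with(errors):
    """Decode the sample bytes as UTF-8 using the given error handler."""
    return raw.decode("utf-8", errors)


# 'strict' (the default) raises on the invalid byte.
try:
    decode_with("strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for the invalid byte.
replaced = decode_with("replace")

# 'ignore' silently drops the invalid byte.
ignored = decode_with("ignore")
```

Note that 'replace' and 'ignore' both lose information, so which one is acceptable depends on the application.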

Because MySQLdb does not pass errors, it always gets the
default, 'strict', which may not be what the developer
needs. The attached patch lets the developer pass
MySQLdb.connect an additional keyword argument,
unicode_errors, which is forwarded to Python's unicode
constructor, so the developer can specify how unicode
decoding problems should be handled.
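
The idea of forwarding unicode_errors to the decoding step could be sketched roughly as follows. This is not the attached patch and not MySQLdb's actual code: make_string_decoder is a hypothetical helper, and the sketch is written for Python 3, using bytes.decode in place of the unicode constructor.

```python
def make_string_decoder(charset, unicode_errors="strict"):
    """Build a converter that decodes raw column bytes.

    Hypothetical helper: `unicode_errors` is passed straight through as the
    `errors` argument, so the caller chooses 'strict', 'replace' or 'ignore'.
    """
    def string_decoder(raw):
        return raw.decode(charset, unicode_errors)
    return string_decoder


# A connection configured with unicode_errors='replace' would decode
# malformed data instead of raising UnicodeDecodeError.
lenient = make_string_decoder("utf-8", unicode_errors="replace")
value = lenient(b"ab\x80cd")
```

Keeping 'strict' as the default preserves the existing behaviour for callers who do not pass the new argument.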

Discussion

Ken Kinder - 2004-09-23

Patch to allow developers to set how unicode encoding errors are handled

unicode-errors.patch

Andy Dustman - 2004-09-23

I'll probably put this in both 1.1.6 and 1.0.1; it seems
quite reasonable.

Andy Dustman - 2004-10-31

Your patch, or a variation, has been applied to the current CVS tree.
