It should be possible to pass instances of custom unicode subclasses (like MyUnicode in the following example) as arguments to cur.execute:
import MySQLdb
db = MySQLdb.connect(host=DATABASE_HOST, user=DATABASE_USER, passwd=DATABASE_PASSWORD, db=DATABASE_NAME, charset='utf8')
cur = db.cursor()
class MyUnicode(unicode):
    pass
print "this works"
cur.execute("SELECT * from demo_test WHERE text=%s", [u'\xc6'])
print "this doesn't"
cur.execute("SELECT * from demo_test WHERE text=%s", [MyUnicode(u'\xc6')])
Result (Tested with MySQLdb 1.2.2):
$ python test.py
this works
this doesn't
Traceback (most recent call last):
  File "test.py", line 15, in ?
    cur.execute("SELECT * from demo_test WHERE text=%s", [MyUnicode(u'\xc6')])
  File "/usr/lib/python2.4/site-packages/MySQLdb/cursors.py", line 148, in execute
    query = query % db.literal(args)
  File "/usr/lib/python2.4/site-packages/MySQLdb/connections.py", line 232, in literal
    return self.escape(o, self.encoders)
  File "/usr/lib/python2.4/site-packages/MySQLdb/connections.py", line 174, in string_literal
    return db.string_literal(obj)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc6' in position 0: ordinal not in range(128)
$
This breaks Django, which uses a class derived from unicode. The Django bug is here: http://code.djangoproject.com/ticket/6052
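As a possible stopgap on the calling side, the parameters could be coerced back to the exact built-in text type before they reach cur.execute. The following is only a sketch: plain_params is a hypothetical helper (not part of MySQLdb or Django), and text_type is an alias introduced here so the snippet reads the same on Python 2 (unicode) and Python 3 (str).

```python
# -*- coding: utf-8 -*-
# Hypothetical helper, not part of MySQLdb or Django.

text_type = type(u'')  # unicode on Python 2, str on Python 3


class MyUnicode(text_type):
    pass


def plain_params(params):
    """Return a copy of params with text subclasses coerced to the exact base type."""
    return [text_type(p) if isinstance(p, text_type) else p for p in params]


args = plain_params([MyUnicode(u'\xc6'), 42])
# args[0] is now an exact text object rather than a MyUnicode instance,
# so code that checks for the exact type should recognise it.
# cur.execute("SELECT * from demo_test WHERE text=%s", args)
```

This only hides the problem from the caller; it does not change how MySQLdb picks an encoder.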
I had a look when I posted to the Django bug, and I think it's just a matter of making the type detection in MySQLdb a bit smarter, but I couldn't quite figure out how it works.
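To illustrate what "smarter" type detection might mean, here is a rough sketch that is not taken from MySQLdb's source. It assumes encoders are stored in a dict keyed by exact type, which would explain why a subclass is missed. The names encoders, encode_text and find_encoder are made up for this example; the idea is to walk the value's class hierarchy until a registered type is found.

```python
# -*- coding: utf-8 -*-
# Illustrative only: a hypothetical encoder registry, not MySQLdb's real one.

text_type = type(u'')  # unicode on Python 2, str on Python 3


def encode_text(value):
    # Stand-in encoder: turn text into UTF-8 bytes.
    return value.encode('utf-8')


# Registry keyed by exact type (assumed layout).
encoders = {text_type: encode_text}


def find_encoder(value, registry):
    """Return the encoder for the nearest registered base class, or None."""
    for cls in type(value).__mro__:
        if cls in registry:
            return registry[cls]
    return None


class MyUnicode(text_type):
    pass


value = MyUnicode(u'\xc6')
exact = encoders.get(type(value))        # exact-type lookup misses the subclass
via_mro = find_encoder(value, encoders)  # base-class lookup finds encode_text
```

With a lookup like this, a subclass instance would get the same encoder as a plain unicode object instead of falling through to a default conversion.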