A string column with a utf8_bin collation is not converted to a Python
Unicode string; it is returned as a UTF-8 encoded byte string instead.
However, the MySQL documentation clearly states: "A nonbinary string has a
character set and is converted to another character set in many cases, even
when the string has a _bin collation".
As I understand it, a string with a utf8_bin collation is still a string and
thus should not be treated differently. The utf8_bin collation is essential
when working with Unicode data without wanting the Unicode collation
algorithm to kick in.
How to reproduce:
CREATE TABLE t1 (
  a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin
);
INSERT INTO t1 VALUES ('ü');
import MySQLdb

db = MySQLdb.connect(db='pymysqltest', charset='utf8', use_unicode=True)
cur = db.cursor()
cur.execute("SELECT a FROM t1;")
row = cur.fetchone()  # the value comes back as a UTF-8 byte string, not Unicode
Choosing utf8_general_ci instead of utf8_bin properly yields Unicode:
cur.execute("SELECT a COLLATE utf8_general_ci FROM t1;")
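As a possible stopgap on the client side, the returned value could be decoded by hand. The following is only a minimal sketch, assuming the driver hands back the column as UTF-8 encoded bytes as described above. The helper name to_text is hypothetical and not part of MySQLdb, and the sample row is hard-coded rather than fetched from the database.

```python
def to_text(value, encoding="utf-8"):
    """Decode byte strings to Unicode; pass every other value through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hard-coded stand-in for a fetched row: 'ü' encoded as UTF-8 bytes.
row = (b"\xc3\xbc",)
decoded = tuple(to_text(col) for col in row)
```

In practice the same helper would be applied to each column of cur.fetchone(). Values that are already Unicode, as well as NULL (None), pass through untouched.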