Problem: with Zope (2.7.4) and ZMySQLDA (2.0.9b3), I set the character set for the connection to utf8. (Funny thing is I have to do this in db.DB.__init__(); neither unicode=1|True|utf8 in the connection string nor setdefaultencoding in sitecustomize.py works.) Go to the Test tab of the ZMySQLDA connection and enter the following:
show variables like "%char%"
Everything is set to utf8, as it should be. Now restart your MySQL server and enter the same query in the Test tab. The client character set has been reset to latin1.
Is this a bug? Is there a fix other than adding the following line to db.DB._begin():
self.db.query("set character set utf8")
First of all, you can't set the character set with the connect method. Versions of MySQLdb older than 1.2.0 used unicode=charset to set the character set that results would be decoded with. 1.2.0 simply has unicode=bool, and if True, results are decoded using the character set that MySQL says it is using.
However, neither of these has a damn thing to do with ZMySQLDA, since it does not use MySQLdb.connect but _mysql.connect, which knows nothing of those options. You cannot put them in the connection string.
I put a request on the Zope list when I announced 2.0.9b3, asking what people wanted to do about unicode. So far there's no clear consensus, so everything that is a character type is returned as string, which is what it has always done.
The reason you still see latin1 is because that's your server's default. You don't say what version of MySQL you are using for either the client or server. I'll assume 4.1 or newer, since older versions don't have "set character set xxx". If you administer the server, you can configure the default character set in the system my.cnf file.
You could also run the "set character set" query when the connection is created in ZMySQLDA (in db.DB.__init__()). This alone is not enough to get you unicode results. You'll need to do something like this:
u = lambda s, c=self.db.get_character_set(): s.decode(c)
self.db.conv[FIELD_TYPE.CHAR] = u
self.db.conv[FIELD_TYPE.VARCHAR] = u
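To make the converter idea concrete, here is a minimal sketch that runs without a database. It assumes the connection's character set name is also a valid Python codec name (true for utf8 and latin1, not necessarily for every MySQL character set). The names make_decoder and conv, and the string keys "CHAR" and "VARCHAR", are hypothetical stand-ins for the lambda above, for self.db.conv, and for the FIELD_TYPE constants.

```python
# Sketch only: a plain dict stands in for self.db.conv, and the string
# keys are placeholders for FIELD_TYPE.CHAR and FIELD_TYPE.VARCHAR.

def make_decoder(charset):
    """Return a converter that decodes raw column bytes using charset."""
    def decode(raw):
        return raw.decode(charset)
    return decode

conv = {}
u = make_decoder("utf8")
conv["CHAR"] = u
conv["VARCHAR"] = u

# UTF-8 encoded bytes, as a utf8 connection would deliver them.
raw_value = b"Z\xc3\xbcrich"
decoded = conv["VARCHAR"](raw_value)

# Decoding the same bytes with the wrong character set gives mojibake,
# which is why the converter must follow the connection's character set.
wrongly_decoded = make_decoder("latin1")(raw_value)
```

The decoder is built once from the character set in effect when the connection is set up, so if the character set changes later the converters would have to be rebuilt.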
Beware of TEXT columns: they are reported with the BLOB field type and are told apart from real BLOBs by the BINARY flag. Since 2.0.9b3 removes the conversion for BLOB (so they are always returned as string), you will have to undo this: delete the line that removes the conversion, and also add a converter for the TEXT columns.
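One way a flag-aware converter could look, sketched without a live connection. It assumes MySQLdb's convention that a converter entry may be a list of (flag mask, function) pairs, where the first matching mask wins and None acts as a catch-all. pick_converter is a hypothetical helper that only imitates that lookup, and 128 is assumed to match FLAG.BINARY. Which of your columns actually carry the BINARY flag is something to check against your own schema.

```python
BINARY_FLAG = 128  # assumed to match MySQLdb.constants.FLAG.BINARY

def pick_converter(entry, flags):
    """Imitate the (mask, func) lookup: the first pair whose mask matches wins."""
    for mask, func in entry:
        if mask is None or (mask & flags):
            return func
    return None

def keep_raw(raw):
    """Leave binary data untouched."""
    return raw

def decode_utf8(raw):
    """Decode character data; utf8 is assumed to be the connection charset."""
    return raw.decode("utf8")

# Stand-in for what self.db.conv[FIELD_TYPE.BLOB] could hold.
blob_entry = [
    (BINARY_FLAG, keep_raw),   # columns flagged BINARY stay as raw strings
    (None, decode_utf8),       # everything else is decoded
]

text_value = pick_converter(blob_entry, 0)(b"caf\xc3\xa9")
blob_value = pick_converter(blob_entry, BINARY_FLAG)(b"\x89PNG")
```

Keeping both cases in one list means binary data is never pushed through a decoder, which would raise an error on bytes that are not valid in the character set.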
I'd consider doing this by default, but it's hard to know what will break.