#229 FLAG_BINARY for utf8 fields breaks unicode conversion

MySQLdb-1.3
open
Andy Dustman
MySQLdb (285)
5
2012-09-19
2007-04-03
Domas Mituzas
No

If any data in a result set is stored on the server with utf8_bin, latin1_bin, or any other _bin collation, the BINARY flag is set for that field.

Because the BINARY flag is also used to differentiate TEXT from BLOB, MySQLdb does not apply unicode() to utf8 data that has the flag set.
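The ambiguity can be illustrated with a minimal sketch. This is not MySQLdb's actual converter code; `decode_by_flag` is a hypothetical helper, and the only value taken from MySQL is the BINARY flag bit (128). It assumes the converter sees nothing but the raw bytes and the field flags.

```python
# Hypothetical illustration only, not MySQLdb's real converter logic.
BINARY_FLAG = 128  # bit value of BINARY_FLAG in the MySQL C API


def decode_by_flag(raw, flags, encoding="utf-8"):
    """Decode bytes to text unless the BINARY flag is set on the field."""
    if flags & BINARY_FLAG:
        # A BLOB and a utf8_bin TEXT column both end up here,
        # so the text column is returned undecoded.
        return raw
    return raw.decode(encoding)


payload = "na\u00efve".encode("utf-8")

# Column with a non-binary collation: decoded to text.
print(repr(decode_by_flag(payload, 0)))
# Column with a _bin collation: BINARY flag set, bytes come back as-is.
print(repr(decode_by_flag(payload, BINARY_FLAG)))
```

With a flag-only rule like this, the second call cannot tell a utf8_bin text column from a real BLOB.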

Though there may be plans to separate TEXT from BLOB in the protocol, for now checking whether a character set is present for the field (and what its value is) in the field information sent on the wire could work better than the current behavior.

That, of course, might require changing the converters interface.

Discussion

  • Andy Dustman
    2007-04-03

    Logged In: YES
    user_id=71372
    Originator: NO

    There's already a related bug for this, but I think I closed it with a Later resolution, because I'll need API changes to fix it.

    Also see the bottom of this page:

    http://dev.mysql.com/doc/refman/5.0/en/c-api-datatypes.html

    "To distinguish between binary and non-binary data for string data types, check whether the charsetnr value is 63. If so, the character set is binary, which indicates binary rather than non-binary data. This is how to distinguish between BINARY and CHAR, VARBINARY and VARCHAR, and BLOB and TEXT."

    Right now there's no good way to do this.
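The rule quoted from the MySQL manual could be sketched as follows. This assumes the converter is handed the field's charsetnr, which, per the comment above, MySQLdb has no good way to do at present; `is_binary_field` and `convert_string` are hypothetical names, not part of MySQLdb. The collation IDs used in the usage lines (33 for utf8_general_ci, 83 for utf8_bin) are MySQL's standard values.

```python
# Hypothetical sketch; assumes charsetnr is available to the converter.
BINARY_CHARSETNR = 63  # charsetnr of the "binary" character set in MySQL


def is_binary_field(charsetnr):
    """Return True when the field holds binary rather than text data."""
    return charsetnr == BINARY_CHARSETNR


def convert_string(raw, charsetnr, encoding="utf-8"):
    """Decode text fields, leave BINARY/VARBINARY/BLOB data as bytes."""
    if is_binary_field(charsetnr):
        return raw
    return raw.decode(encoding)


payload = "na\u00efve".encode("utf-8")

print(repr(convert_string(payload, 33)))  # utf8_general_ci: decoded
print(repr(convert_string(payload, 83)))  # utf8_bin: still decoded
print(repr(convert_string(payload, 63)))  # binary: left as bytes
```

In a real converter the encoding would have to come from the connection or from the field's character set rather than a fixed default; it is hard-coded here only to keep the sketch short.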