

#229 FLAG_BINARY for utf8 fields breaks unicode conversion

Andy Dustman
MySQLdb (285)
Domas Mituzas

If any data in a result set is stored on the server with utf8_bin, latin1_bin, or any other _bin collation, the BINARY flag is set for that field.

Because the BINARY flag is also used to differentiate TEXT from BLOB, MySQLdb does not apply unicode() to utf8 data that has the flag set.
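To make the problem concrete, here is a simplified model of a flag-based decision. It is only a sketch, not MySQLdb's actual converter code: `convert_by_flag` and its parameters are hypothetical names, and Python 3's `bytes.decode` stands in for `unicode()`. The value 128 is the BINARY field flag from the MySQL protocol.

```python
# Simplified model of a flag-based conversion decision.
# NOT MySQLdb's real code: convert_by_flag is a hypothetical helper.
BINARY_FLAG = 128  # BINARY field flag value in the MySQL protocol


def convert_by_flag(raw: bytes, flags: int, encoding: str = "utf-8"):
    """Decode the value unless the field carries the BINARY flag."""
    if flags & BINARY_FLAG:
        # Treated like a BLOB: returned undecoded, even if the column
        # is really utf8 text with a _bin collation.
        return raw
    return raw.decode(encoding)


stored = "caf\u00e9".encode("utf-8")

# Column with an ordinary collation: decoded to text.
plain = convert_by_flag(stored, flags=0)

# Same bytes from a utf8_bin column: the flag suppresses decoding.
bin_collated = convert_by_flag(stored, flags=BINARY_FLAG)
```

In this model the same stored text comes back as a string or as raw bytes depending only on the column's collation, which is the behavior reported here.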

Though there may be plans to separate TEXT from BLOB in the protocol, for now checking the character set reported for the field in the on-the-wire metadata could work better than the current behavior.

That, of course, might require changing the converters interface.
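A minimal sketch of the charset-based check follows. It assumes the converter is handed the field's character set number from the result metadata, which the current converters interface does not provide; `convert_by_charset` and its parameters are hypothetical. It relies on the MySQL convention that character set number 63 means binary; 83 is used as an example non-binary value (utf8_bin in MySQL's collation list).

```python
# Sketch of a charset-based decision, under the assumption that the
# converter receives the field's character set number (hypothetical API).
BINARY_CHARSETNR = 63  # MySQL's "binary" character set number


def convert_by_charset(raw: bytes, charsetnr: int, encoding: str = "utf-8"):
    """Decode the value unless the field's character set is binary."""
    if charsetnr == BINARY_CHARSETNR:
        return raw  # BINARY / VARBINARY / BLOB: leave as bytes
    return raw.decode(encoding)  # CHAR / VARCHAR / TEXT, any collation


stored = "caf\u00e9".encode("utf-8")

# utf8_bin column (example charsetnr 83): still decoded to text.
text_value = convert_by_charset(stored, charsetnr=83)

# Genuinely binary column: left untouched.
blob_value = convert_by_charset(stored, charsetnr=BINARY_CHARSETNR)
```

Here the `encoding` argument stands in for the connection character set; a real implementation would have to decide how to obtain it.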


  • Andy Dustman

    Logged In: YES
    Originator: NO

    There's already a related bug for this, but I think I closed it with a Later resolution, because I'll need API changes to fix it.

    Also see the bottom of this page:


    "To distinguish between binary and non-binary data for string data types, check whether the charsetnr value is 63. If so, the character set is binary, which indicates binary rather than non-binary data. This is how to distinguish between BINARY and CHAR, VARBINARY and VARCHAR, and BLOB and TEXT."

    Right now there's no good way to do this.