#30 Don't convert binary (var)chars to unicode objects

Milestone: MySQLdb
Status: open
Owner: Andy Dustman
Labels: MySQLdb (53)
Priority: 5
Updated: 2012-09-19
Created: 2005-12-21
Creator: M.H.
Private: No

Hello,

the attached patch stops MySQLdb from converting binary
char and binary varchar fields to unicode objects when
one has connected with use_unicode=True. The patch is
against 1.2.1c3 and also applies against 1.2.0.

Regards,
Milan

Discussion

  • Andy Dustman
    2006-02-25

    Logged In: YES
    user_id=71372

    Your patch, or a variation, has been applied to the current CVS tree.

     
  • Andy Dustman
    2006-03-04

    Logged In: YES
    user_id=71372

    Your patch causes char and varchar columns with a binary
    collation to be returned as array('c',...). I find that, for
    example, on the 5.0 (and probably 4.1) privilege tables, a
    lot of the fields have utf8_bin collation, which causes
    these fields to be returned as arrays, which means they are
    not properly decoded (left in utf8 encoding).

    Unless you can provide some use cases where this is
    required, I'll need to remove this.
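
To illustrate what "left in utf8 encoding" means for such a column, here is a minimal sketch. It uses Python 3 bytes as a stand-in for array('c'), and the sample value is made up for illustration; it does not come from the privilege tables.

```python
# Bytes as they might arrive from a utf8_bin column (sample value is an assumption).
raw = "M\u00fcller".encode("utf-8")

# Left undecoded, the value is a sequence of 7 bytes, not 6 characters.
undecoded_length = len(raw)

# Decoded with the connection's character set, it becomes proper text.
text = raw.decode("utf-8")
decoded_length = len(text)

print(undecoded_length, decoded_length)
```

An application comparing or slicing the undecoded value would be working on bytes rather than characters, which is the problem described for the utf8_bin fields.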

     
  • M.H.
    2006-03-05

    Logged In: YES
    user_id=1261581

    I actually wanted only to not convert CHAR / VARCHAR
    columns with collation 'BINARY' to unicode objects; these
    columns are then "converted" to the type BINARY or
    VARBINARY (like TEXT with collation 'BINARY' is converted
    to BLOB).
    I think the patch accomplished this, but I didn't test for
    '*_bin' collations, where I would want the conversion to
    unicode objects to happen.
    Is there a way to do this?

    Thanks & Regards,
    Milan

     
  • Andy Dustman
    2006-03-05

    Logged In: YES
    user_id=71372

    I don't think there is a way to distinguish between the two
    cases with the C API as both will set BINARY_FLAG/FLAG.BINARY.

    Also see:

    http://dev.mysql.com/doc/refman/5.0/en/charset-binary-op.html
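
A small sketch of why the flag alone is ambiguous. The field descriptions and the `looks_binary` helper are hypothetical; only the flag value (128, BINARY_FLAG in the C API headers) is taken as given.

```python
BINARY_FLAG = 128  # BINARY_FLAG in the MySQL C API; FLAG.BINARY in MySQLdb

# Hypothetical columns mapped to the flags the server would report for them,
# following the statement above that both binary cases set the flag.
field_flags = {
    "VARBINARY(32)": BINARY_FLAG,
    "VARCHAR(32) COLLATE utf8_bin": BINARY_FLAG,
    "VARCHAR(32) COLLATE utf8_general_ci": 0,
}


def looks_binary(flags):
    """Return True when the binary flag bit is set (hypothetical helper)."""
    return bool(flags & BINARY_FLAG)


for column, flags in field_flags.items():
    print(column, looks_binary(flags))
```

A test on the flag puts VARBINARY and VARCHAR with utf8_bin in the same bucket, so it cannot tell raw binary data from UTF-8 text that merely has a binary collation.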

     
  • Andy Dustman
    2006-08-11

    Logged In: YES
    user_id=71372

    I'm rethinking this a bit. Binary columns really can't be
    returned as unicode strings, since they may contain invalid
    unicode data, so they need to be returned either as
    array('c') or string, and I'm inclined lately to make that
    choice user- or site-configurable, since a lot of
    people hate array('c') (I think it's annoying too).

     
  • M.H.
    2006-09-08

    Logged In: YES
    user_id=1261581

    Hello,

    I just tested 1.2.2b1 and saw that my VARBINARY fields are
    now returned as byte (binary) string :) (which I like
    better than array('c') too btw)

    I also tested VARCHAR and text with utf8_bin, and saw that
    they are also returned as byte string. I think we agreed
    that utf8_bin should be decoded, and I verified that the
    MySQL manual also says it's UTF-8 data. And I also found
    out that the manual tells us how to distinguish VARBINARY
    from VARCHAR etc :)

    First, this page clears something up: <URL:http://dev.mysql.com/doc/refman/4.1/en/charset-binary-op.html>
    It says that instead of the BINARY /attribute/ we must now
    (in 4.1) use the BINARY /character set/ for binary data. In
    4.1, the binary attribute causes the character set's _bin
    collation to be used (e.g. utf8_bin). This is why VARCHAR
    with _bin has that attribute set.

    <URL:http://dev.mysql.com/doc/refman/4.1/en/c-api-datatypes.html> only knows one constant each for the VARCHAR/
    VARBINARY, CHAR/BINARY and TEXT/BLOB types, but tells
    how to distinguish them anyway:

    To distinguish between binary and non-binary data for
    string data types, check whether the charsetnr value is
    63. If so, the character set is binary, which indicates
    binary rather than non-binary data. This is how to
    distinguish between BINARY and CHAR, VARBINARY and
    VARCHAR, and BLOB and TEXT.

    I have no intention of writing a patch myself, as I don't
    need it for now, but I think you might be interested ;)

    Regards,
    Milan
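
The charsetnr check quoted from the manual could be sketched roughly as follows. This is not MySQLdb's converter: `convert_string_value` is a hypothetical helper, and the utf8_bin collation id used in the example is an assumption (any value other than 63 takes the same path).

```python
BINARY_CHARSET_NR = 63  # charsetnr of the binary character set, per the manual text quoted above


def convert_string_value(raw, charsetnr, encoding="utf-8"):
    """Keep binary columns as bytes; decode everything else.

    Hypothetical helper for illustration only.
    """
    if charsetnr == BINARY_CHARSET_NR:
        # BINARY, VARBINARY, BLOB: may hold arbitrary bytes, so do not decode.
        return raw
    # CHAR, VARCHAR, TEXT, including *_bin collations: real character data.
    return raw.decode(encoding)


UTF8_BIN_NR = 83  # assumed collation id for utf8_bin

binary_value = convert_string_value(b"\xff\x00", BINARY_CHARSET_NR)
text_value = convert_string_value(b"M\xc3\xbcller", UTF8_BIN_NR)
print(repr(binary_value), repr(text_value))
```

With this test, a VARCHAR column using utf8_bin is decoded even though it carries the binary flag, while a true VARBINARY column is left untouched.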

     
  • Andy Dustman
    2006-09-08

    Logged In: YES
    user_id=71372

    Wow, that really sucks.

     
  • Andy Dustman
    2007-02-10

    Logged In: YES
    user_id=71372
    Originator: NO

    I don't really know what to do with this at this point.