Cursor returns array instead of string for blob column type
MySQL database connector for Python programming
Brought to you by:
adustman
The standard cursor class ('Cursor') used to return
MySQL blob row types as Python string values. In
version 1.0 it returns arrays containing a single item
with the string 'c' as key and the blob string as the
value.
So, where you used to get:
'A value'
you now get:
array('c', 'Another first'))
when selecting values from blobs.
Is there any reason for this, or should this be
considered a bug?
Logged In: YES
user_id=1066725
According to the DB-API specs, Python array objects are the
preferred representation for Blobs. This appears to have
changed a couple of times for MySQLdb and the latest change
seems to be missing from the CHANGELOG.
But it's probably not a bug.
Logged In: YES
user_id=71372
It was pointed out that returning BLOBs as strings didn't
match the spec; originally BLOBs were returned as strings.
It took a whlie to figure out how to fix, too, because
columns like TEXT are also internally a BLOB type, whereas
real BLOBs have the BINARY flag set, and previously there
was no conversion mechanism that could pick up on the BINARY
flag.
There are a couple options for you:
If your BLOB is really containing text, change it to a
TEXT column.
Change the default conversion dictionary, removing the
FIELD.BLOB entry (default is to return as string)
I'm leaving this as an Open bug for awhile so it'll be visible.
Logged In: YES
user_id=1066725
Another option is converting in your python code using the
'tostring()' method of the array object. To make older
applications work with both the old and the newer versions
of the db api, you could do something like:
try:
field = field.tostring()
except AttributeError:
pass
Logged In: NO
I've been having a problem with the following code:
CONCAT(
IFNULL(
claim_begin,
'No Date Entered'
),
' - ',
co_name
) AS list_string''',
The type of claim_begin is date, co_name is a VARCHAR(255).
Which used to return a string value in earlier versions of
MySQLdb, I read this thread and it convinced me that the
return type of CONCAT might be a BLOB, in this specific
case, however I checked the MySQL docs, and they claim that
the return type should be a String. If I take the
claim_begin column out the return type that results is a
string an works properly.
I'm posting to here because I have no idea where else to
post, or how to look into the matter further to find out if
MySQL is indeed converting this to a blob, or if it's
MySQLdb's fault, any ideas as to what I could do?
Logged In: NO
Woops, I should have included the MySQL version in my last
post: 4.0.20.
Logged In: YES
user_id=537938
The DB-API PEP:
http://www.python.org/peps/pep-0249.html
implies that the preferred way of storing BLOB columns is
buffer' and not
array.array'. To quote:"""The preferred object type for Binary objects are the
buffer types available in standard Python starting with
version 1.5.2. Please see the Python documentation for
details. For information about the the C interface have a
look at Include/bufferobject.h and Objects/bufferobject.c in
the Python source distribution."""
It tells that the preferred way is any buffer object (including
array.array'), but then refers to the "bare" buffer object
buffer'.The advantages of using buffer over array are:
contents. str(array) does not give you this.
The main difference between arrays and buffers seems to be
that arrays can contain other elements that single bytes.
This functionality is not required however for BLOB columns.
Arrays are difficult to work with. If you want to do any I/O on
them, you basically have to use .tostring() go go back to a
string again.
Considering the above, I think MySQLdb should be changed to
return buffer objects for BLOB columns.
Logged In: YES
user_id=71372
You are correct on the contents of PEP-249. Note that since
PEP-249 was written (07 Apr 1999), there was talk (in 2002)
of silently deprecating buffer.
http://mail.python.org/pipermail/python-dev/2002-June/026018.html
http://mail.python.org/pipermail/python-dev/2000-October/009974.html
Despite this, buffer is still in Python 2.4. However, the
only documenation for it is this:
http://docs.python.org/lib/non-essential-built-in-funcs.html#l2h-86
buffer( object[, offset[, size]])
The object argument must be an object that supports the
buffer call interface (such as strings, arrays, and
buffers). A new buffer object will be created which
references the object argument. The buffer object will be a
slice from the beginning of object (or from the specified
offset). The slice will extend to the end of object (or will
have a length given by the size argument).
This is probably why I used array instead.
Fortunately, it's not something I need to fix. You can
change the object types returned by MySQLdb by doing simple
assignment:
Alternately:
(Note that there are BLOB, LONG_BLOB, MEDIUM_BLOB, and
TINY_BLOB FIELD_TYPE)
MySQL (at least prior to the prepared statements API)
returns all values as strings, or rather in C they are *
char with a separate length (not NULL terminated). _mysql
converts these first to Python string objects. It then uses
the conversions dictionary (or rather the connection
object's copy of it) to find a conversion function; if it
does not find one, it leaves it as a string.
buffer retains a reference to the original object. Once the
original Python string is used to create the buffer, it
would not be deallocated since the buffer still references
it. buffer has less capabilities than string in general,
although you can write to it. This would end up modifying
the string, which is supposed to be immutable, but since
nothing else references it, this is probably not a big problem.
Try some of the above code and let me know how it works. At
this point, I may be more inclined to revert to strings
instead of arrays and avoid buffers. I'll also see what the
consensus is in the DB-SIG.
Logged In: YES
user_id=71372
User feedback indicates that array works well for BLOB, so
I'm not changing it for 1.2. I may revisit this in 1.3-2.0,
since the MySQL-4.1 API has a mechanism for getting and
putting BLOBs piecemeal.
Logged In: YES
user_id=537938
Andy,
apologies for not replying sooner. I've used the conversions
dictionary similar as you describe and it works as expected.
How are the users that indicate array is working well using it?
I think they are primarily converting it to a string with
the .tostring() method. Also, did you get any consensus from
the DB-SIG?
Thanks for doing a great job with MySQLdb.
Logged In: YES
user_id=71372
I didn't get any specific details how how array was being
used, but I expect if .tostring() was being used, they woudl
have suggested using strings.
I surveyed a couple of other DB API modules: mx.ODBC uses
strings, and psycopg (PostgreSQL) uses buffer. Ultimately
there may be a special BLOB type, but not in 1.2