I have an INSERT statement which I can use without problem at the MySQL monitor but cannot get to work through MySQLdb (v0.9.2). The statement is made by:
sql = u"INSERT INTO customers (customerID, contactName) VALUES ('%s','%s')"
and then using string interpolation to put in the values. The unicode string is so that the Python String interploation worlks when the contact name has accented charaters in. Without that I get the unicode error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' (e with acute accent)
There are several statements to execute andthe code for each one is like this:
Everything goes fine till the insert with \xe9 comes up. Then I get this traceback:
Traceback (most recent call last):
File "C:\home\teach\SY23_2005\site\lectures\lecture11\copydata.py", line 39, i
n ?
cursor.execute(query)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 95, in execute
return self._execute(query, args)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 123, in _execute
r = self._query(query)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 261, in _query
rowcount = self._BaseCursordo_query(q)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 182, in do_que
ry
db.query(q)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 67:
ordinal not in range(128)
I should mention that I have commented out some error trapping code in cursors.py to get this traceback. Specifically the methof _execute of cursors.py has no error trapping code. If I restore the error-trapping then the traceback is:
Traceback (most recent call last):
File "C:\home\teach\SY23_2005\site\lectures\lecture11\copydata.py", line 39, i
n ?
cursor.execute(query)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 95, in execute
return self._execute(query, args)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 114, in _execute
self.errorhandler(self, exc, value)
File "C:\Python23\lib\site-packages\MySQLdb\connections.py", line 33, in defau
lterrorhandler
raise errorclass, errorvalue
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 67:
ordinal not in range(128)
I'm using Windows XP Pro SP2 and Python 2.3.4
Anyone know anything about this issue?
Peter
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
First, why are you using 0.9.2? It is old, old, old. Use 1.2.0.
Second: How many times does it need to be said: Just because MySQLdb uses the string formatting code (%s) for parameters doesn't mean you should use % to put your parameters in the query. DON'T DO THIS. READ PEP-249. READ THE USER'S GUIDE. DO THIS INSTEAD:
id = rs.Fields("CustomerID").value
cn = rs.Fields("ContactName").value
cursor.execute(query, (id, cn))
If you fail to do this, your parameters will probably not be quoted correctly.
And now, unicode.
You should keep in mind that with MySQL-4.0 and earlier, the character set used by your connection is determined by the server. By default it uses latin1, and you can't change this on the client side.
In contrast, MySQL-4.1 has very fine-grained control over the character set encoding, and you can affect it from the client (with SQL).
To actually get unicode strings out of MySQLdb as results, pass use_unicode=True to connect(). This will return all text-like column as unicode, decoding the results with the character set of the connection.
Even if you don't set this, you can still write unicode strings out, i.e. pass them to execute(). However, I am wondering how well this has been tested with characters outside of the normal ASCII range. Plus I never really thought about the idea of somebody passing unicode SQL statements. The MySQL C API requires a regular C string, and so this must be encoded correctly. I'm not so sure MySQLdb will do this. If it doesn't it's a bug and I'll fix it. The workaround would be to call .encode('utf8') (for example) on your unicode strings before passing to .execute().
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
sql = "INSERT INTO customers (customerID, contactName) VALUES ('%s','%s')"
and then try to interpolate an 8-bit character like \xe9 I get a unicode error I assume because the string is plain ascii. This error goes away if sql is a unicode string. However if I use the SQL sql = "INSERT INTO customers (customerID, contactName) VALUES (%s,%s)" and the parameterised form of execute (as advised :-) ) then there is no problem with the contactName data extracted from MSAccess. I guess that solves the original issue. It's better than that actually because I can encode a contactName as utf-8 and execute still works fine. Then I can decode the utf-8 and in this case encode it as latin-1 and get the result back. So can handle any charcater set.
Unfortunately I need to use Python 2.3 on Windows and FreeBSD along with MySQL v3 and v4.0 . Before I can upgrade I also need to persuade my ISP to upgrade too. Presently I seem to be stuck with 0.9.2 - is that so?
Thanks
Peter
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The only Windows build for 1.2.0 I have has the MySQL-4.1 client libraries. However, this should be backwards-compatible with a 4.0 or 3.23 server, I think. Unfortunately for you, it's for Python-2.4. I depend on people to donate these, since I don't do Windows.
If someone out there can put together a Python-2.3 version, maybe with MySQL-4.0. I think there are a lot of Zope users who would want it, since Zope still uses Python-2.3 for the foreseeable future (I do not think 2.8 will recommend using Python-2.4).
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I have an INSERT statement which I can use without problem at the MySQL monitor but cannot get to work through MySQLdb (v0.9.2). The statement is made by:
sql = u"INSERT INTO customers (customerID, contactName) VALUES ('%s','%s')"
and then using string interpolation to put in the values. The unicode string is so that the Python String interploation worlks when the contact name has accented charaters in. Without that I get the unicode error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' (e with acute accent)
There are several statements to execute andthe code for each one is like this:
Everything goes fine till the insert with \xe9 comes up. Then I get this traceback:
Traceback (most recent call last):
File "C:\home\teach\SY23_2005\site\lectures\lecture11\copydata.py", line 39, i
n ?
cursor.execute(query)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 95, in execute
return self._execute(query, args)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 123, in _execute
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 261, in _query
rowcount = self._BaseCursordo_query(q)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 182, in do_que
ry
db.query(q)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 67:
ordinal not in range(128)
I should mention that I have commented out some error trapping code in cursors.py to get this traceback. Specifically the methof _execute of cursors.py has no error trapping code. If I restore the error-trapping then the traceback is:
Traceback (most recent call last):
File "C:\home\teach\SY23_2005\site\lectures\lecture11\copydata.py", line 39, i
n ?
cursor.execute(query)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 95, in execute
return self._execute(query, args)
File "C:\Python23\lib\site-packages\MySQLdb\cursors.py", line 114, in _execute
File "C:\Python23\lib\site-packages\MySQLdb\connections.py", line 33, in defau
lterrorhandler
raise errorclass, errorvalue
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 67:
ordinal not in range(128)
I'm using Windows XP Pro SP2 and Python 2.3.4
Anyone know anything about this issue?
Peter
... I can add that if I add sitecustomize.py to the site-packages directory with:
import sys
sys.setdefaultencoding("utf-8") #or Latin-1
the error no longer occurs. But in the 'production' environment I don't have access to that.
Peter
First, why are you using 0.9.2? It is old, old, old. Use 1.2.0.
Second: How many times does it need to be said: Just because MySQLdb uses the string formatting code (%s) for parameters doesn't mean you should use % to put your parameters in the query. DON'T DO THIS. READ PEP-249. READ THE USER'S GUIDE. DO THIS INSTEAD:
id = rs.Fields("CustomerID").value
cn = rs.Fields("ContactName").value
cursor.execute(query, (id, cn))
If you fail to do this, your parameters will probably not be quoted correctly.
And now, unicode.
You should keep in mind that with MySQL-4.0 and earlier, the character set used by your connection is determined by the server. By default it uses latin1, and you can't change this on the client side.
In contrast, MySQL-4.1 has very fine-grained control over the character set encoding, and you can affect it from the client (with SQL).
To actually get unicode strings out of MySQLdb as results, pass use_unicode=True to connect(). This will return all text-like column as unicode, decoding the results with the character set of the connection.
Even if you don't set this, you can still write unicode strings out, i.e. pass them to execute(). However, I am wondering how well this has been tested with characters outside of the normal ASCII range. Plus I never really thought about the idea of somebody passing unicode SQL statements. The MySQL C API requires a regular C string, and so this must be encoded correctly. I'm not so sure MySQLdb will do this. If it doesn't it's a bug and I'll fix it. The workaround would be to call .encode('utf8') (for example) on your unicode strings before passing to .execute().
If I use
sql = "INSERT INTO customers (customerID, contactName) VALUES ('%s','%s')"
and then try to interpolate an 8-bit character like \xe9 I get a unicode error I assume because the string is plain ascii. This error goes away if sql is a unicode string. However if I use the SQL sql = "INSERT INTO customers (customerID, contactName) VALUES (%s,%s)" and the parameterised form of execute (as advised :-) ) then there is no problem with the contactName data extracted from MSAccess. I guess that solves the original issue. It's better than that actually because I can encode a contactName as utf-8 and execute still works fine. Then I can decode the utf-8 and in this case encode it as latin-1 and get the result back. So can handle any charcater set.
Unfortunately I need to use Python 2.3 on Windows and FreeBSD along with MySQL v3 and v4.0 . Before I can upgrade I also need to persuade my ISP to upgrade too. Presently I seem to be stuck with 0.9.2 - is that so?
Thanks
Peter
The only Windows build for 1.2.0 I have has the MySQL-4.1 client libraries. However, this should be backwards-compatible with a 4.0 or 3.23 server, I think. Unfortunately for you, it's for Python-2.4. I depend on people to donate these, since I don't do Windows.
If someone out there can put together a Python-2.3 version, maybe with MySQL-4.0. I think there are a lot of Zope users who would want it, since Zope still uses Python-2.3 for the foreseeable future (I do not think 2.8 will recommend using Python-2.4).