Thread: [SQLObject] Patch for mysqlconnection.py to handle unicode queries
SQLObject is a Python ORM.
From: Markus G. <m.g...@gm...> - 2007-04-18 18:57:10
|
Hi,

I have been using the patch below in my local version of SQLObject for several months now. I think I already proposed it some time ago. It lets me run an application I wrote for the SQLite backend unmodified on the MySQL backend as well.

The SQLite backend of SQLObject has no problem handling unicode queries. Without the patch, unicode queries do not work with the MySQL backend. IMO it makes no sense to prevent unicode queries from working, since MySQLdb supports them. What is more, the current mysqlconnection.py code explicitly converts to unicode when self.need_unicode is set. The problem with the current code is that it also attempts this conversion when the query is already of type unicode. The patch below fixes this.

--- sqlobject_orig/mysql/mysqlconnection.py
+++ sqlobject/mysql/mysqlconnection.py
@@ -92,7 +92,7 @@
         # done by calling ping(True) on the connection.
         for count in range(3):
             try:
-                if self.need_unicode:
+                if self.need_unicode and not isinstance(query, unicode):
                     # For MysqlDB 1.2.1 and later, we go
                     # encoding->unicode->charset (in the mysql db)
                     myquery = unicode(query, self.encoding)

Kind regards,
Markus
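The guard in this patch can be illustrated outside SQLObject. What follows is a minimal sketch in Python 3 terms, where `bytes` and `str` stand in for Python 2's `str` and `unicode`. The helper names `to_text` and `decode_unconditionally` are hypothetical and are not part of SQLObject or MySQLdb. The sketch decodes only when the query is still a byte string, and shows the TypeError raised when text that is already decoded gets decoded a second time.

```python
def to_text(query, encoding="utf-8"):
    """Return query as text, decoding only if it is still a byte string."""
    if isinstance(query, bytes):
        return query.decode(encoding)
    return query


def decode_unconditionally(query, encoding="utf-8"):
    # Mirrors the unpatched branch: the conversion is always attempted,
    # even when the query is already text.
    return str(query, encoding)


euro_bytes = "\u20ac".encode("utf-8")

# Byte strings are decoded, text passes through untouched.
assert to_text(euro_bytes) == "\u20ac"
assert to_text("\u20ac") == "\u20ac"

# Decoding text a second time is rejected by the interpreter.
try:
    decode_unconditionally("\u20ac")
    double_decode_failed = False
except TypeError:
    double_decode_failed = True
```

The `isinstance` check keeps the existing behaviour for byte-string queries and only adds a pass-through for queries that are already text, which is why the change is a one-line edit.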
From: Oleg B. <ph...@ph...> - 2007-04-18 19:00:29
|
On Wed, Apr 18, 2007 at 08:57:09PM +0200, Markus Gritsch wrote:
> +                if self.need_unicode and not isinstance(query, unicode):

Can you add a test for this? Some test that produces a unicode query...

Oleg.
-- 
Oleg Broytmann http://phd.pp.ru/ ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
From: Markus G. <m.g...@gm...> - 2007-04-18 19:21:18
|
On 4/18/07, Oleg Broytmann <ph...@ph...> wrote:
> On Wed, Apr 18, 2007 at 08:57:09PM +0200, Markus Gritsch wrote:
> > +                if self.need_unicode and not isinstance(query, unicode):
>
> Can you add a test for this? Some test that produces unicode query...

from sqlobject import *

# Works out of the box.
##sqlhub.threadConnection = connectionForURI('sqlite:/:memory:')

# Works only after applying the patch.
sqlhub.threadConnection = connectionForURI('mysql://markus@localhost/test?use_unicode=1&charset=utf8&sqlobject_encoding=utf-8')

class Person(SQLObject):
    name = UnicodeCol()

Person.dropTable(ifExists=True)
Person.createTable(ifNotExists=True)

p = Person(name=u'\u20ac') # \u20ac is the 'Euro symbol'.
print Person.select(LIKE(Person.q.name, u'\u20ac'))[0].name.encode('utf-8')
From: Oleg B. <ph...@ph...> - 2007-04-18 19:24:20
|
On Wed, Apr 18, 2007 at 09:21:15PM +0200, Markus Gritsch wrote:
> p = Person(name=u'\u20ac') # \u20ac is the 'Euro symbol'.
> print Person.select(LIKE(Person.q.name, u'\u20ac'))[0].name.encode('utf-8')

Thank you.

Oleg.
From: Oleg B. <ph...@ph...> - 2007-04-19 15:42:28
|
On Wed, Apr 18, 2007 at 09:21:15PM +0200, Markus Gritsch wrote:
> p = Person(name=u'\u20ac') # \u20ac is the 'Euro symbol'.
> print Person.select(LIKE(Person.q.name, u'\u20ac'))[0].name.encode('utf-8')

Doesn't work with Postgres and SQLite. To demonstrate the exact place of the problem I changed the last line to

persons = Person.select(LIKE(Person.q.name, u'\u20ac'))
person = persons[0]
print person.name.encode('utf-8')

Both Pg and SQLite fail on

person = persons[0]

The tracebacks are as follows.

SQLite:

Traceback (most recent call last):
  File "./test1.py", line 19, in ?
    person = persons[0]
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/sresults.py", line 154, in __getitem__
    return list(self.clone(start=start, end=start+1))[0]
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/sresults.py", line 160, in __iter__
    return iter(list(self.lazyIter()))
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/sresults.py", line 168, in lazyIter
    return conn.iterSelect(self)
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 374, in iterSelect
    select, keepConnection=False)
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 759, in __init__
    dbconn.printDebug(rawconn, self.query, 'Select')
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 303, in printDebug
    print '%(n)2i%(threadName)s/%(name)s%(spaces)s%(sep)s %(s)s' % locals()
  File "/usr/local/lib/python2.4/encodings/koi8_r.py", line 18, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u20ac' in position 82: character maps to <undefined>

PostgreSQL:

Traceback (most recent call last):
  File "./test1.py", line 19, in ?
    person = persons[0]
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/sresults.py", line 154, in __getitem__
    return list(self.clone(start=start, end=start+1))[0]
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/sresults.py", line 160, in __iter__
    return iter(list(self.lazyIter()))
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/sresults.py", line 168, in lazyIter
    return conn.iterSelect(self)
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 374, in iterSelect
    select, keepConnection=False)
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 759, in __init__
    dbconn.printDebug(rawconn, self.query, 'Select')
  File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 303, in printDebug
    print '%(n)2i%(threadName)s/%(name)s%(spaces)s%(sep)s %(s)s' % locals()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 82: ordinal not in range(128)

Summary: SQLObject doesn't work with unicode expressions yet.

Oleg.
From: Markus G. <m.g...@gm...> - 2007-04-23 08:26:42
Attachments:
EmacsShot.png
|
On 4/19/07, Oleg Broytmann <ph...@ph...> wrote:
> On Wed, Apr 18, 2007 at 09:21:15PM +0200, Markus Gritsch wrote:
> > p = Person(name=u'\u20ac') # \u20ac is the 'Euro symbol'.
> > print Person.select(LIKE(Person.q.name, u'\u20ac'))[0].name.encode('utf-8')
>
> Doesn't work with Postgres and SQLite.

I don't get it. I do not use Postgres, but when I run the example using SQLite, it works without any problem. I use Python 2.5 on Windows XP and SQLObject 0.9.0b1. The output "terminal" is a buffer in Emacs, which can display unicode characters very well. As can be seen from the attached screenshot, the program prints the Euro symbol.

Any thoughts?

Markus
From: Markus G. <m.g...@gm...> - 2007-04-24 11:21:13
|
Hi all,

can some people on this list please be so kind as to run the following short program and confirm that it works, or send the traceback if there is one? It is self-contained and does not require any further setup.

Thanks in advance,
Markus

from sqlobject import *

sqlhub.threadConnection = connectionForURI('sqlite:/:memory:')

class Person(SQLObject):
    name = UnicodeCol()

Person.dropTable(ifExists=True)
Person.createTable(ifNotExists=True)

p = Person(name=u'\u20ac') # \u20ac is the 'Euro symbol'.
persons = Person.select(LIKE(Person.q.name, u'\u20ac'))
person = persons[0]
print person.name.encode('utf-8')
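For readers without SQLObject installed, the same round trip can be checked against SQLite directly. This is a sketch that swaps in Python's standard `sqlite3` module with Python 3 string literals, so it does not exercise SQLObject's query building or its UnicodeCol at all. The table and column names are chosen to resemble the test program and are otherwise assumptions.

```python
import sqlite3

# In-memory database, comparable to 'sqlite:/:memory:' in the test program.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")

# Store the Euro symbol and look it up again with a LIKE expression.
conn.execute("INSERT INTO person (name) VALUES (?)", ("\u20ac",))
row = conn.execute(
    "SELECT name FROM person WHERE name LIKE ?", ("\u20ac",)
).fetchone()

# The stored text comes back unchanged; encode it only at the output boundary.
name = row[0]
name_utf8 = name.encode("utf-8")
conn.close()
```

If this works while the SQLObject program fails on the same machine, the problem is more likely in the layers above the database driver than in SQLite itself.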
From: Robert F. <xro...@go...> - 2007-04-24 11:31:16
|
robert@astroman:~$ python
Python 2.4.4c1 (#2, Oct 11 2006, 21:51:02)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from sqlobject import *
>>>
>>> sqlhub.threadConnection = connectionForURI('sqlite:/:memory:')
>>>
>>> class Person(SQLObject):
...     name = UnicodeCol()
...
>>> Person.dropTable(ifExists=True)
>>> Person.createTable(ifNotExists=True)
>>>
>>> p = Person(name=u'\u20ac') # \u20ac is the 'Euro symbol'.
>>> persons = Person.select(LIKE(Person.q.name, u'\u20ac'))
>>> person = persons[0]
>>> print person.name.encode('utf-8')
€

On 4/24/07, Markus Gritsch <m.g...@gm...> wrote:
> Hi all,
>
> can some people on this list please be so kind and run the following
> short program and confirm that it works or send the traceback in case
> of any? It is self-contained and does not require any further setup.
>
> Thanks in advance,
> Markus
>
>
> from sqlobject import *
>
> sqlhub.threadConnection = connectionForURI('sqlite:/:memory:')
>
> class Person(SQLObject):
>     name = UnicodeCol()
>
> Person.dropTable(ifExists=True)
> Person.createTable(ifNotExists=True)
>
> p = Person(name=u'\u20ac') # \u20ac is the 'Euro symbol'.
> persons = Person.select(LIKE(Person.q.name, u'\u20ac'))
> person = persons[0]
> print person.name.encode('utf-8')
From: Oleg B. <ph...@ph...> - 2007-04-24 13:00:17
|
Found the problem:

On Thu, Apr 19, 2007 at 07:42:20PM +0400, Oleg Broytmann wrote:
>   File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 759, in __init__
>     dbconn.printDebug(rawconn, self.query, 'Select')
>   File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 303, in printDebug
>     print '%(n)2i%(threadName)s/%(name)s%(spaces)s%(sep)s %(s)s' % locals()
>   File "/usr/local/lib/python2.4/encodings/koi8_r.py", line 18, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\u20ac' in position 82: character maps to <undefined>

It's debugging print. Without it ("?debug=") both SQLite and Postgres work; Postgres requires "createdb -E utf-8".

Oleg.
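The failure in the quoted traceback can be reproduced without SQLObject or a database, since it comes from encoding the debug line for the terminal. Below is a minimal sketch in Python 3; the helper `can_print` and the sample query text are hypothetical. It checks whether a string containing the Euro symbol can be encoded with the two terminal encodings from the tracebacks (koi8_r and ascii) and with utf-8.

```python
def can_print(text, encoding):
    """Report whether text can be encoded for a terminal using encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Stand-in for the SQL text that the debug print tried to write.
query = "SELECT ... WHERE name LIKE '\u20ac'"

results = {enc: can_print(query, enc) for enc in ("koi8_r", "ascii", "utf-8")}
```

Only the utf-8 entry is True, which matches the report that the query itself works once the debugging print is switched off.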
From: Oleg B. <ph...@ph...> - 2007-04-24 13:03:53
|
On Tue, Apr 24, 2007 at 05:00:09PM +0400, Oleg Broytmann wrote:
> It's debugging print.

In SQLObject 0.9 one can set up logging instead of using stdout:

__connection__ = "sqlite:/:memory:?debug=1&logger=TEST&loglevel=debug"

Logging correctly does debugging prints in utf-8.

Oleg.
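The receiving side of such a setup might look like the sketch below, written for Python 3 with the standard `logging` module. The thread does not show how SQLObject wires the `logger=TEST` parameter internally, so this only assumes that debug output arrives on a logger named "TEST". The in-memory buffer is a stand-in for a log file and is there to make the result easy to inspect.

```python
import io
import logging

# Byte buffer wrapped in a utf-8 text stream, standing in for a log file.
buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="utf-8", newline="\n")

handler = logging.StreamHandler(stream)
logger = logging.getLogger("TEST")  # name taken from logger=TEST in the URI
logger.setLevel(logging.DEBUG)      # corresponds to loglevel=debug
logger.addHandler(handler)
logger.propagate = False            # keep the message away from root handlers

# A debug line containing the Euro symbol, as in the failing print.
logger.debug("SELECT ... WHERE name LIKE '%s'", "\u20ac")

handler.flush()
logged = buffer.getvalue()
```

Because the encoding is fixed on the stream rather than inherited from the terminal, the message is written as utf-8 regardless of the locale the program runs in.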
From: Markus G. <m.g...@gm...> - 2007-04-24 13:03:26
|
On 4/24/07, Oleg Broytmann <ph...@ph...> wrote:
> Found the problem:
>
> On Thu, Apr 19, 2007 at 07:42:20PM +0400, Oleg Broytmann wrote:
> >   File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 759, in __init__
> >     dbconn.printDebug(rawconn, self.query, 'Select')
> >   File "/usr/local/lib/python2.4/site-packages/SQLObject-0.8.2dev_r2514-py2.4.egg/sqlobject/dbconnection.py", line 303, in printDebug
> >     print '%(n)2i%(threadName)s/%(name)s%(spaces)s%(sep)s %(s)s' % locals()
> >   File "/usr/local/lib/python2.4/encodings/koi8_r.py", line 18, in encode
> >     return codecs.charmap_encode(input,errors,encoding_map)
> > UnicodeEncodeError: 'charmap' codec can't encode character u'\u20ac' in position 82: character maps to <undefined>
>
> It's debugging print. Without it ("?debug=") both SQLite and Postgres
> work; Postgres requires "createdb -E utf-8".

Great! So, is there a chance that the patch for the MySQLdb backend will be incorporated?

Markus
From: Oleg B. <ph...@ph...> - 2007-04-24 13:09:14
|
On Tue, Apr 24, 2007 at 03:03:23PM +0200, Markus Gritsch wrote:
> Great! So, is there a chance that the patch for the MySQLdb backend
> will be incorporated?

Not so fast. Now it's your turn to review patches. ;) What do you think of this one?
http://sourceforge.net/tracker/index.php?func=detail&aid=1653898&group_id=74338&atid=540674

Can you test it?

Oleg.
From: Markus G. <m.g...@gm...> - 2007-04-24 13:58:47
|
On 4/24/07, Oleg Broytmann <ph...@ph...> wrote:
> What do you think of this one?
> http://sourceforge.net/tracker/index.php?func=detail&aid=1653898&group_id=74338&atid=540674
>
> Can you test it?

I applied it and looked at the changes it makes. The last hunk

     def _setAutoCommit(self, conn, auto):
@@ -71,13 +84,11 @@
     def _executeRetry(self, conn, cursor, query):
         while 1:
             try:
-                if self.need_unicode:
-                    # For MysqlDB 1.2.1 and later, we go
-                    # encoding->unicode->charset (in the mysql db)
-                    myquery = unicode(query, self.encoding)
-                    return cursor.execute(myquery)
-                else:
-                    return cursor.execute(query)
+                # For MySQLdb 1.2.1 and later, we go
+                # encoding->unicode->charset (in the mysql db)
+                if self.need_unicode and not isinstance(query, unicode):
+                    query = unicode(query, self.dbEncoding)
+                return cursor.execute(query)
             except MySQLdb.OperationalError, e:
                 if e.args[0] == 2013: # SERVER_LOST error
                     if self.debug:

is exactly the one I proposed plus some code cleanup. This is ok and suffices to make my test code work.

As for the other hunks, I cannot say much. The patch seems to include a MySQLdb version check which uses SET NAMES to set the encoding for versions before 1.2.1. I cannot test this, since I never used a connector version that old. I used 1.2.1 beta for over a year, and since 1.2.2 came out recently, I switched to that one.

Next, the patch contains some code to eliminate the possibility that the connection parameters 'charset' and 'sqlobject_encoding' can be different, but it does not pop both of them when both are present. This leads to an error message when both are specified:

TypeError: __init__() got an unexpected keyword argument 'sqlobject_encoding'

Further, I thought that 'charset' is the MySQL encoding and 'sqlobject_encoding' is the one used by SQLObject, and that both have a right to exist. The patch unifies them into one parameter, and I have no idea whether this is a good or a bad thing.

And last, the patch adds a new UnicodeStringLikeConverter which encodes to a hardcoded 'utf8'. I doubt that this is the correct thing to do under all circumstances.

Just my 2 cents,
Markus
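The TypeError described in this review is typical of forwarding leftover keyword arguments to a constructor that does not know them. The sketch below is a hypothetical reconstruction in Python 3, not the code from the tracker patch: `driver_connect`, `connect_leaky` and `connect_fixed` are made-up names, with `driver_connect` standing in for a database driver's connect call that accepts only the keywords it knows.

```python
def driver_connect(host, charset=None):
    # Hypothetical stand-in for a driver's connect(); unknown keywords fail.
    return {"host": host, "charset": charset}


def connect_leaky(**params):
    # Takes the first encoding option it finds and forwards everything else.
    # If both options are given, 'sqlobject_encoding' is left in params.
    encoding = params.pop("charset", None) or params.pop("sqlobject_encoding", None)
    return driver_connect(charset=encoding, **params)


def connect_fixed(**params):
    # Pops both options before forwarding, and keeps them as separate settings.
    charset = params.pop("charset", None)
    sqlobject_encoding = params.pop("sqlobject_encoding", None)
    conn = driver_connect(charset=charset, **params)
    conn["sqlobject_encoding"] = sqlobject_encoding or charset
    return conn


both = {"host": "localhost", "charset": "utf8", "sqlobject_encoding": "utf-8"}

try:
    connect_leaky(**both)
    leaky_error = None
except TypeError as exc:
    leaky_error = str(exc)

fixed = connect_fixed(**both)
```

Popping each option unconditionally also leaves room for the two settings to differ, which is the distinction between the database charset and the SQLObject-side encoding raised in the review.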
From: Oleg B. <ph...@ph...> - 2007-04-24 14:47:53
|
On Tue, Apr 24, 2007 at 03:58:44PM +0200, Markus Gritsch wrote:
> On 4/24/07, Oleg Broytmann <ph...@ph...> wrote:
> >What do you think of this one?
> >http://sourceforge.net/tracker/index.php?func=detail&aid=1653898&group_id=74338&atid=540674
>
> I applied it and looked at the changes it makes. The last hunk
> is exactly the one I proposed

That's why I asked you about it.

> Next, the patch contains some code to eliminate the possibility that
> the connection parameters 'charset' and 'sqlobject_encoding' can be
> different

I will unapply that part.

> And last, the patch adds a new UnicodeStringLikeConverter which
> encodes to 'utf8' hardcoded. I doubt that this is the correct thing
> to do under all circumstances.

That's probably good for connections that support neither unicode nor dbEncoding. At least there is a chance a unicode expression could be passed to the DB.

Oleg.
From: Oleg B. <ph...@ph...> - 2007-04-25 14:59:17
|
On Tue, Apr 24, 2007 at 03:58:44PM +0200, Markus Gritsch wrote:
> On 4/24/07, Oleg Broytmann <ph...@ph...> wrote:
> >http://sourceforge.net/tracker/index.php?func=detail&aid=1653898&group_id=74338&atid=540674

I applied a part of the patch, but I didn't apply the UnicodeStringLikeConvertor (I have to think about it a bit more to find problems with the part) and dbEncoding/sqlobject_encoding unification. Committed in the revisions 2593-2597 (0.7, 0.8, 0.9, trunk and docs).

Oleg.
From: Markus G. <m.g...@gm...> - 2007-04-26 07:03:47
|
On 4/25/07, Oleg Broytmann <ph...@ph...> wrote:
> On Tue, Apr 24, 2007 at 03:58:44PM +0200, Markus Gritsch wrote:
>http://sourceforge.net/tracker/index.php?func=detail&aid=1653898&group_id=74338&atid=540674
>
> I applied a part of the patch

This solves the unicode queries problem and works fine for me :)

Markus