Thread: [SQLObject] Unicode characters
SQLObject is a Python ORM.
From: Glenn M. <gle...@gm...> - 2008-01-04 21:08:22
Hi All,

I am plugging along with Python and SQLObject, very cool stuff. I have a string which contains utf8 (unicode) characters. When I try to add an instance of the table class with that string I get an error:

    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 2: ordinal not in range(128)

The string I am inserting is Ch\xe9rie, where \xe9 is e with an acute accent (é). I am not sure if this is the database sending this message or SQLObject.

I am using MySQL 5 and I have created the tables with a default charset of utf8.

Thanks

Glenn
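For readers on current Python, the error above can be reproduced without a database. This is a minimal sketch in Python 3, where `str` plays the role of Python 2's `unicode`; the helper name `try_encode` is made up for illustration and is not part of SQLObject.

```python
# Python 3 sketch: str here stands in for Python 2's unicode type.
label = "Ch\xe9rie"  # U+00E9 is LATIN SMALL LETTER E WITH ACUTE


def try_encode(text, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


# ASCII only covers code points 0..127, so this one fails.
ascii_result = try_encode(label, "ascii")
# UTF-8 and latin-1 can both represent the character, with different bytes.
utf8_result = try_encode(label, "utf-8")
latin1_result = try_encode(label, "latin-1")

print(ascii_result)
print(utf8_result)
print(latin1_result)
```

The ASCII attempt reports the same kind of failure as in the message above, while the other two codecs succeed and differ only in the bytes they produce.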
From: Glenn M. <gle...@gm...> - 2008-01-04 21:15:13
Looking into this issue a bit more revealed that in the StringValidator class, line 505 on col.py, the from_python function tries to encode all unicode type strings to ascii. Is this correct behavior? If so, how can I get around it?

Thanks

Glenn
From: Oleg B. <ph...@ph...> - 2008-01-04 21:21:16
Please, do not top-post.

On Fri, Jan 04, 2008 at 04:15:09PM -0500, Glenn MacGregor wrote:
> Looking into this issue a bit more revealed that in the StringValidator
> class, line 505 on col.py, the from_python function tries to encode all
> unicode type strings to ascii. Is this correct behavior? If so, how can I
> get around it?

What version of SQLObject? In the latest version the encoding is not 'ascii' - the code is

    dbEncoding = getattr(connection, "dbEncoding", None) or "ascii"

so to change the encoding (on MySQL) you have to set it in the DB URI:

    mysql://host:port/database?charset=utf-8
                               ^^^^^^^^^^^^^

Oleg.
-- 
Oleg Broytmann http://phd.pp.ru/ ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
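The fallback expression quoted above can be exercised on its own. A small sketch, assuming a stand-in object (`FakeConnection`, hypothetical) in place of a real SQLObject connection; only the `getattr(...) or "ascii"` expression comes from the thread.

```python
class FakeConnection:
    """Hypothetical stand-in for a SQLObject connection object."""

    def __init__(self, dbEncoding=None):
        # Only set the attribute when a value is given, so that the
        # "attribute missing" case can be demonstrated too.
        if dbEncoding is not None:
            self.dbEncoding = dbEncoding


def pick_encoding(connection):
    # Same expression as quoted from col.py: a missing attribute, None,
    # or an empty string all fall back to "ascii".
    return getattr(connection, "dbEncoding", None) or "ascii"


no_setting = pick_encoding(FakeConnection())
utf8_setting = pick_encoding(FakeConnection("utf-8"))
empty_setting = pick_encoding(FakeConnection(""))

print(no_setting, utf8_setting, empty_setting)
```

The point of the sketch is that "ascii" is only a default: whatever encoding the connection carries wins when it is set.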
From: Glenn M. <gle...@gm...> - 2008-01-04 22:00:37
Oleg,

Sorry about the top-post. I am using SQLObject 0.10dev-r3137 installed from an egg. Even with the charset set to utf8 I still get the same error. I must be on an older version. I will update and let you know.

Thanks
From: Glenn M. <gle...@gm...> - 2008-01-07 17:56:08
Maybe I need to back up a bit. I am somewhat confused at this point. I need to insert a string which contains a non-ascii character into my database table; that string is Ch0xE9rie. E9 is the hex representation of the acute-e. Do I need unicode to do this? Do I need to change the charset? What are my options?

Thanks

Glenn
From: Oleg B. <ph...@ph...> - 2008-01-07 18:05:01
On Mon, Jan 07, 2008 at 12:56:04PM -0500, Glenn MacGregor wrote:
> Maybe I need to back up a bit. I am somewhat confused at this point. I need
> to insert a string which contains a non-ascii character into my database
> table; that string is Ch0xE9rie. E9 is the hex representation of the
> acute-e. Do I need unicode to do this? Do I need to change the charset? What
> are my options?

In general, you can use unicode, but you don't have to. You are not obliged to use unicode.

In this particular case you have a str string from a file in an unknown charset, certainly not in utf-8, so you cannot use utf-8 as the charset. Probably for this column you don't need unicode at all.

Without unicode your options are:

-- use plain strings with StringCol;
-- use binary data with BLOBCol;
-- use complex data with PickleCol.

Oleg.
From: Dan P. <da...@ag...> - 2008-01-07 18:50:22
On Monday 07 January 2008, Oleg Broytmann wrote:
> On Mon, Jan 07, 2008 at 12:56:04PM -0500, Glenn MacGregor wrote:
> > Maybe I need to back up a bit. I am somewhat confused at this point.
> > I need to insert a string which contains a non-ascii character into
> > my database table; that string is Ch0xE9rie. E9 is the hex
> > representation of the acute-e. Do I need unicode to do this? Do I
> > need to change the charset? What are my options?
>
> In general, you can use unicode, but you don't have to. You are not
> obliged to use unicode.
> In this particular case you have a str string from a file in an
> unknown charset, certainly not in utf-8, so you cannot use utf-8 as the
> charset.

But he can use latin1 because that encoding is a 1-to-1 mapping of any 8-bit char, so what you put in the db is what you get out of it, even if you do not know the encoding the input data uses.

> Probably for this column you don't need unicode at all.
> Without unicode your options are:
>
> -- use plain strings with StringCol;
> -- use binary data with BLOBCol;
> -- use complex data with PickleCol.
>
> Oleg.

-- 
Dan
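The 1-to-1 property of latin1 mentioned above can be checked directly. A minimal Python 3 sketch (the names `all_bytes` and `decodes_as` are illustrative): every one of the 256 byte values decodes under latin-1 and encodes back to the same byte, whereas utf-8 rejects some byte sequences.

```python
all_bytes = bytes(range(256))

# latin-1 maps byte value N to code point N, so decoding never fails...
as_text = all_bytes.decode("latin-1")
# ...and encoding the result gives back the original bytes.
round_trip = as_text.encode("latin-1")


def decodes_as(raw, encoding):
    """Return True if raw is a valid byte sequence in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


round_trip_ok = round_trip == all_bytes
valid_utf8 = decodes_as(b"Ch\xe9rie", "utf-8")      # lone 0xE9 is not valid UTF-8
valid_latin1 = decodes_as(b"Ch\xe9rie", "latin-1")

print(round_trip_ok, valid_utf8, valid_latin1)
```

The round trip preserves bytes, not meaning: if the input was really UTF-8, text decoded as latin1 will show two characters where one was intended, even though the stored bytes come back unchanged.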
From: Glenn M. <gle...@gm...> - 2008-01-04 22:08:52
Oleg,

I am looking in svn at col.py; it seems that from_python in the StringValidator class still uses ascii encoding exclusively:

    class StringValidator(validators.Validator):

        def to_python(self, value, state):
            if value is None:
                return None
            if isinstance(value, unicode):
                connection = state.soObject._connection
                dbEncoding = getattr(connection, "dbEncoding", None) or "ascii"
                return value.encode(dbEncoding)
            return value

        def from_python(self, value, state):
            if value is None:
                return None
            if isinstance(value, str):
                return value
            if isinstance(value, unicode):
    -->         return value.encode("ascii")
            return value

Maybe I should not be getting there but I am. I am sure mysqld is set to utf8 and the connection is defaulted to utf8 as well.

Thanks

Glenn
From: Oleg B. <ph...@ph...> - 2008-01-04 22:32:13
On Fri, Jan 04, 2008 at 05:08:49PM -0500, Glenn MacGregor wrote:
>         if isinstance(value, unicode):
> -->         return value.encode("ascii")

I guess you are trying to pass unicode to a StringCol. If this is the case - for unicode use UnicodeCol.

Oleg.
From: Glenn M. <gle...@gm...> - 2008-01-07 19:00:55
I really hate to be a pain, but I cannot get this to work. I have removed the charset=utf8&use_unicode=1 from the connection string and get the schema using fromDatabase. When I try to insert the string:

    Chèrie

it fails at a different location, line 146 of cursors.py of the MySQLdb package.

    query = query.encode(charset)

charset is latin1 and it fails with the error:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 36: ordinal not in range(128)

This is where I started looking into unicode, but if there is another way to get past this, please let me know.

Thanks

Glenn
From: Oleg B. <ph...@ph...> - 2008-01-07 19:29:35
On Mon, Jan 07, 2008 at 02:00:51PM -0500, Glenn MacGregor wrote:
> it fails at a different location, line 146 of cursors.py of the MySQLdb
> package.
>
>     query = query.encode(charset)
>
> charset is latin1 and it fails with the error:
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 36:
> ordinal not in range(128)

Does "charset=latin1" without "use_unicode" help?

Oleg.
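A likely reason an encode call reports a *decode* error: in Python 2, calling .encode() on a byte str first decodes it implicitly with the default 'ascii' codec, and only then encodes. The sketch below imitates that two-step behaviour in Python 3. The helper `py2_style_encode` and the sample query text are assumptions made for illustration; they are not taken from MySQLdb.

```python
def py2_style_encode(raw, charset):
    """Mimic Python 2's str.encode(): implicit ASCII decode, then encode."""
    return raw.decode("ascii").encode(charset)


# Assumed shape of the generated query, with the label as UTF-8 bytes.
query = b"INSERT INTO temp (label) VALUES ('Ch\xc3\xa9rie')"

try:
    py2_style_encode(query, "latin-1")
    outcome = "ok"
except UnicodeDecodeError as exc:
    # The implicit decode step fails before latin-1 is ever used.
    outcome = str(exc)

# Pure-ASCII queries pass through the same path untouched.
plain = py2_style_encode(b"SELECT 1", "latin-1")

print(outcome)
print(plain)
```

Under these assumptions the failure is independent of the target charset, which would explain why switching between latin1 and utf8 in the URI does not change the error.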
From: Glenn M. <gle...@gm...> - 2008-01-07 15:09:56
Oleg,

I have been fooling around with this all weekend and I am still unsure of what is going on. In the connection string I now use ?charset=utf8&use_unicode=1; this seems to get past the original error, but now I am getting caught elsewhere. When I try to add the offending string to the db I get:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 20: ordinal not in range(128)

I may be totally off here, so I want to let you know what I am trying to do. I am reading data from a file; one piece of data contains a char outside of the ascii range (0xc3). I need to insert that into the db; the column type is a varchar(64).

I am not sure where to start. I believe the db (MySQL) is set up correctly using unicode. I am new to Python and SQLObject, so I am unsure of the unicode support in both. Is there a tutorial or something that can help me work this out?

Thanks

Glenn
From: Oleg B. <ph...@ph...> - 2008-01-07 15:38:09
On Mon, Jan 07, 2008 at 10:09:51AM -0500, Glenn MacGregor wrote:
> ?charset=utf8&use_unicode=1

Do you draw the database schema from the DB using fromDatabase?

> I am reading data from a file; one piece of data contains a char outside of
> the ascii range (0xc3). I need to insert that into the db; the column type
> is a varchar(64)

First question to decide is: what is the type of the data? Is it string, unicode or binary data? For strings, use StringCol, but do not put unicode to the column - convert it to a string yourself. For unicode use UnicodeCol, and put unicode to the column. For binary data use BLOBCol or StringCol; then again, use strings as the data input.

Oleg.
From: Oleg B. <ph...@ph...> - 2008-01-07 16:59:17
On Mon, Jan 07, 2008 at 11:43:24AM -0500, Glenn MacGregor wrote:
> It looks like somewhere in SQLObject we are calling decode on the variable
> which contains Chèrie to decode it to ascii. Is this the intended result?

No, "decoding" in Python means converting from str to unicode. Somewhere (in UnicodeStringValidator in col.py) the string is decoded from str using utf-8 charset, but the decoding fails.

> I have tried to manually define the column in the class definition,
>
> label = UnicodeCol()
>
> This did not change anything

Of course. Setting "use_unicode" in DB URI changes every StringCol to UnicodeCol. You can declare it StringCol explicitly, and then use only str, not unicode.

> I have tracked it down to the return statement of the StringLikeConverter,
> line 96 on converters.py. return "'%s'" % value seems to be doing a decode
> (or encode) which is failing.

I doubt it. At the time the converter works unicode should be converted to str already.

> Note in the debugger I see the type of the
> param value in that function is str, I would think it would be unicode.

UnicodeStringValidator.from_python() has converted it to str.

Oleg.
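The direction of "decode" and "encode" is easy to mix up, so here is a short sketch in Python 3 terms, where bytes corresponds to Python 2's str and str corresponds to Python 2's unicode. The sample value assumes the file content was UTF-8, which the thread does not confirm.

```python
raw = b"Ch\xc3\xa9rie"          # bytes, e.g. as read from a UTF-8 encoded file
text = raw.decode("utf-8")      # decoding: bytes -> text (Python 2: str -> unicode)
encoded = text.encode("utf-8")  # encoding: text -> bytes (Python 2: unicode -> str)

# The byte string is one element longer, because é takes two bytes in UTF-8.
print(type(raw).__name__, len(raw))
print(type(text).__name__, len(text))
print(encoded == raw)
```

Decoding always needs to know which charset the bytes are in; choosing the wrong one either raises an error or silently yields different characters.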
From: Oleg B. <ph...@ph...> - 2008-01-07 17:44:58
On Mon, Jan 07, 2008 at 12:35:40PM -0500, Glenn MacGregor wrote:
> #!/usr/bin/python
> # -*- coding: utf-8 -*-
>
> import os, sys, glob, MySQLdb
> from sqlobject import *
>
> sqlhub.processConnection = connectionForURI(
>     'mysql://test:test@localhost/test?charset=utf8&use_unicode=1')
>
> class Temp(SQLObject):
>     class sqlmeta:
>         table = "temp"
>         fromDatabase = True
>
>     def add(label):
>         return Temp(label=label)
>
>     add = staticmethod(add)
>
>
> a = 'Ch?rie'
>
> t = Temp.add(a)
>
> print 'Done'

Declare the column as non-unicode:

    class Temp(SQLObject):
        class sqlmeta:
            table = "temp"
            fromDatabase = True

        label = StringCol()

Or stop using unicode at all:

    sqlhub.processConnection = connectionForURI('mysql://test:test@localhost/test')

Oleg.
From: Oleg B. <ph...@ph...> - 2008-01-07 20:26:35
On Mon, Jan 07, 2008 at 03:13:18PM -0500, Glenn MacGregor wrote:
> My mistake, not sure what happened, but when I set the charset=latin1 it
> fails in the same place as it fails without the charset set. That is
> cursors.py line 146 which encodes the query using latin1 in both cases,
> means that mysql default charset for the connection is latin1.

Very strange. The offending character can be decoded from latin1 without any problem:

    >>> unicode("\xc3", "utf-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 0: unexpected end of data

    >>> unicode("\xc3", "latin1")
    u'\xc3'

Oleg.
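The same experiment in Python 3 syntax, extended with the two-byte sequence 0xC3 0xA9 for comparison. This is only a sketch: the helper `try_decode` is illustrative, and the assumption that the file held UTF-8 (where 0xC3 is the first half of a two-byte character) is not confirmed in the thread.

```python
lone = b"\xc3"        # the single byte named in the error message
pair = b"\xc3\xa9"    # UTF-8 encoding of é (U+00E9)


def try_decode(raw, encoding):
    """Return the decoded text, or a short note if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


lone_utf8 = try_decode(lone, "utf-8")      # incomplete multi-byte sequence
lone_latin1 = try_decode(lone, "latin-1")  # decodes to U+00C3
pair_utf8 = try_decode(pair, "utf-8")      # decodes to one character
pair_latin1 = try_decode(pair, "latin-1")  # decodes to two characters

print(lone_utf8)
print(repr(lone_latin1), repr(pair_utf8), repr(pair_latin1))
```

A byte 0xC3 at the position of the accented letter is therefore a hint that the data is UTF-8 rather than latin1, since in latin1 the letter would be the single byte 0xE9.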
From: Glenn M. <gle...@gm...> - 2008-01-08 13:26:04
Thanks for all the help!

I got it working using the default charset and explicitly setting the necessary columns to UnicodeCol. I am going to try using use_unicode just to verify.

Thanks

Glenn
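The general pattern behind this fix is to decode the bytes from the file once, with a known charset, and hand text to the database layer. The sketch below shows that pattern in Python 3 using the standard library's sqlite3 as a stand-in for MySQL and SQLObject, so it is not the setup used in this thread. The function `store_label` and the choice of UTF-8 as the file's charset are assumptions.

```python
import sqlite3


def store_label(conn, raw_bytes, source_encoding):
    """Decode file bytes to text, insert the text, and return the new row id."""
    text = raw_bytes.decode(source_encoding)
    cur = conn.execute("INSERT INTO temp (label) VALUES (?)", (text,))
    return cur.lastrowid


# In-memory database with the same two columns as the table in the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp (id INTEGER PRIMARY KEY, label VARCHAR(64))")

row_id = store_label(conn, b"Ch\xc3\xa9rie", "utf-8")
stored = conn.execute(
    "SELECT label FROM temp WHERE id = ?", (row_id,)
).fetchone()[0]

print(stored)
conn.close()
```

With SQLObject the corresponding step would be passing the decoded value to a column declared as UnicodeCol, as described in the message above.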
From: Oleg B. <ph...@ph...> - 2008-01-07 19:49:09
On Mon, Jan 07, 2008 at 02:38:08PM -0500, Glenn MacGregor wrote:
> Changing the connection string to incorporate charset=latin1 fails at a
> different place. With that in the string the failure happens at
> dbconnection.py line 383.

Line 383 in the trunk is

    return ("INSERT INTO %s (%s) VALUES (%s)" %
            (table, ', '.join(names),
             ', '.join([self.sqlrepr(v) for v in values])))

so you probably got an error from sqlrepr, right? What was the error? If it is UnicodeDecodeError - I do not understand where you have got unicode from now. You are using str everywhere, right?

Oleg.
From: Glenn M. <gle...@gm...> - 2008-01-07 20:13:25
My mistake, not sure what happened, but when I set the charset=latin1 it fails in the same place as it fails without the charset set. That is cursors.py line 146, which encodes the query using latin1 in both cases; this means that the mysql default charset for the connection is latin1.

I am using str everywhere as far as I can tell.

Glenn
From: Glenn M. <gle...@gm...> - 2008-01-07 19:38:12
Changing the connection string to incorporate charset=latin1 fails at a different place. With that in the string the failure happens at dbconnection.py line 383. I have also tried charset=utf8 with the same results.

I have tried changing the settings in the my.cnf to manage the connection types but to no avail.

Is there something I can do to decode or escape the offending chars?

Glenn
From: Oleg B. <ph...@ph...> - 2008-01-07 19:59:18
BTW, you have said you can insert the string from the command line. But can you do it using MySQLdb? What are correct parameters for connection?

Oleg.
From: Glenn M. <gle...@gm...> - 2008-01-07 16:43:28
I do get the schema using fromDatabase. This is a simple table with 2 columns: id, int and label, varchar(64). Everything until now has been ascii, but I now have one string, Chèrie, which I read from a file that I need to add to the db using SQLObject. If I do the insert from the command line into the db locally it works fine, so I think MySQL is set up correctly to handle this.

It looks like somewhere in SQLObject we are calling decode on the variable which contains Chèrie to decode it to ascii. Is this the intended result?

I have tried to manually define the column in the class definition,

    label = UnicodeCol()

This did not change anything, I get the same error. The last line in the traceback comes from dbconnection.py line 383. This calls self.sqlrepr, which must be the source of the problem.

I have tracked it down to the return statement of the StringLikeConverter, line 96 on converters.py. return "'%s'" % value seems to be doing a decode (or encode) which is failing. Note in the debugger I see the type of the param value in that function is str, I would think it would be unicode.

Any help would be great!

Glenn
From: Glenn M. <gle...@gm...> - 2008-01-07 17:35:46
Ok, so I am lost. The debugger is not too much help at this point. I assume this setup should work so we can go from there.

-----CUT---------
    #!/usr/bin/python
    # -*- coding: utf-8 -*-

    import os, sys, glob, MySQLdb
    from sqlobject import *

    sqlhub.processConnection = connectionForURI(
        'mysql://test:test@localhost/test?charset=utf8&use_unicode=1')

    class Temp(SQLObject):
        class sqlmeta:
            table = "temp"
            fromDatabase = True

        def add(label):
            return Temp(label=label)

        add = staticmethod(add)


    a = 'Chèrie'

    t = Temp.add(a)

    print 'Done'

This fails. Can you give me any advice on where to look or how to debug?

Thanks

Glenn