I am having some character-set problems with MySQLdb within a TurboGears project.
my dsn:
sqlobject.dburi="notrans_mysql://user:pass@localhost/database?charset=utf8&debug=True"
my class:
    class aTest(SQLObject):
        class sqlmeta:
            table = "atest_test"
        name = UnicodeCol(length=100)
in the shell:
>>> aTest(name="abc")
<aTest 1L name=u'abc'>
this one was alright BUT:
>>> aTest(name="äöü") # causes:
Traceback (most recent call last):
  (...)
  File "/usr/lib/python2.4/site-packages/SQLObject-0.8.0-py2.4.egg/sqlobject/main.py", line 1111, in set
    value = to_python(dbValue, self._SO_validatorState)
  File "/usr/lib/python2.4/site-packages/SQLObject-0.8.0-py2.4.egg/sqlobject/col.py", line 549, in to_python
    return unicode(value, self.db_encoding)
  File "/usr/lib/python2.4/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-2: invalid data
If I change the DSN and append "?charset=utf8" and run
>>> aTest(name="äöü")
the error traceback changes to:
Traceback (most recent call last):
  File "<console>", line 1, in ?
  File "/usr/lib/python2.4/site-packages/SQLObject-0.8.0-py2.4.egg/sqlobject/declarative.py", line 94, in _wrapper
    return fn(self, *args, **kwargs)
  File "/usr/lib/python2.4/site-packages/SQLObject-0.8.0-py2.4.egg/sqlobject/main.py", line 1231, in __init__
    self._create(id, **kw)
  File "/usr/lib/python2.4/site-packages/SQLObject-0.8.0-py2.4.egg/sqlobject/main.py", line 1258, in _create
    self.set(**kw)
  File "/usr/lib/python2.4/site-packages/SQLObject-0.8.0-py2.4.egg/sqlobject/main.py", line 1111, in set
    value = to_python(dbValue, self._SO_validatorState)
  File "/usr/lib/python2.4/site-packages/SQLObject-0.8.0-py2.4.egg/sqlobject/col.py", line 549, in to_python
    return unicode(value, self.db_encoding)
  File "/usr/lib/python2.4/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 4-6: unexpected end of data
The database, the table, and the corresponding field "name" in the table all have the charset "utf8" and the collation "utf8_unicode_ci".
What can I do to solve this problem?
Regards, Frank
There's nothing in your traceback that would indicate a problem with MySQLdb, only with SQLObject.
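To illustrate what the last SQLObject frame in the traceback is doing, here is a minimal sketch. It assumes the shell handed over the string as Latin-1 bytes, which the thread does not confirm. It is written for Python 3, whereas the thread uses Python 2.4, where the equivalent call is unicode(value, 'utf8'). The helper name decode_as_utf8 is made up for this example.

```python
# The text from the post, written with escapes so the file encoding does not matter.
text = "\u00e4\u00f6\u00fc"  # "äöü"

# Assumption: a Latin-1 terminal would deliver these bytes to the program.
latin1_bytes = text.encode("latin-1")
# What a UTF-8 terminal would deliver instead.
utf8_bytes = text.encode("utf-8")


def decode_as_utf8(raw):
    """Return the decoded text, or None if raw is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# UTF-8 bytes round-trip; Latin-1 bytes are rejected by the UTF-8 decoder.
print(decode_as_utf8(utf8_bytes) == text)
print(decode_as_utf8(latin1_bytes) is None)
```

If the value is already a unicode object (u"äöü" in Python 2), no byte decoding is needed on the way in, so the mismatch between the terminal encoding and the assumed encoding does not arise at this point.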
I was just wondering, because the error only occurred after I switched from Postgres to MySQL; on Postgres there wasn't any...
But I have solved it now by telling SQLObject to use utf8 internally, by adding ?sqlobject_charset=utf8 to the DSN.
regards,
Frank
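Since the DSN from the first post already has a query string, the extra option has to be joined with "&" rather than a second "?". A small sketch of doing that with the standard library follows. The helper add_dsn_param is hypothetical, and the parameter name sqlobject_charset is copied from the post above, not checked against the SQLObject documentation.

```python
from urllib.parse import parse_qsl, urlencode


def add_dsn_param(dsn, key, value):
    """Return dsn with key=value set in its query string (replacing any old value)."""
    base, _, query = dsn.partition("?")
    params = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True) if k != key]
    params.append((key, value))
    return base + "?" + urlencode(params)


dsn = "notrans_mysql://user:pass@localhost/database?charset=utf8&debug=True"
# Parameter name as quoted in the post (unverified).
new_dsn = add_dsn_param(dsn, "sqlobject_charset", "utf8")
print(new_dsn)
```

The string is split on the first "?" by hand instead of using urlsplit, because the "notrans_mysql" scheme contains an underscore and a generic URL parser may not treat it as a scheme.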
AFAIK, the Postgres drivers only read (and possibly write) strings and not unicode, whereas MySQLdb can return unicode values directly; there is a use_unicode=bool option to connect() which controls this. You can always pass unicode values to cursor.execute() whether this is set or not; it only controls whether text-like columns are returned as unicode or string.
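For reference, a rough sketch of how those options would be passed to MySQLdb directly, outside SQLObject. The host, user, password and database values are placeholders taken from the DSN in the first post, and the two helper functions are invented for this example. The driver is imported lazily so the snippet loads even where MySQLdb is not installed.

```python
def mysqldb_connect_kwargs(use_unicode=True):
    """Keyword arguments for MySQLdb.connect(); credentials are placeholders."""
    return {
        "host": "localhost",
        "user": "user",
        "passwd": "pass",
        "db": "database",
        "charset": "utf8",           # character set of the connection
        "use_unicode": use_unicode,  # True: text columns come back as unicode
    }


def open_connection(use_unicode=True):
    """Open a connection; needs MySQLdb installed and a reachable server."""
    import MySQLdb  # imported here so the sketch loads without the driver
    return MySQLdb.connect(**mysqldb_connect_kwargs(use_unicode))


# Usage, not run here because it needs a live database:
#   conn = open_connection(use_unicode=True)
#   cur = conn.cursor()
#   cur.execute("SELECT name FROM atest_test")
```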