-----BEGIN PGP SIGNED MESSAGE-----
Ian Bicking wrote:
| Max Ischenko wrote:
|>>> Most modern dbs and python drivers handle this problem
|>>> transparently. At least, that's my experience.
|>> Can you show an example? Using a snippet of code, with a DB API
|> Guess not. ;-)
|> Just checked with psycopg -- it does require a parameter to be encoded
|> into something like utf-8.
psycopg2 does it automatically, I believe.
|> Either my memory does me a disservice or this psycopg is somehow
|> broken. ;-)
| I suspect it's something about Unicode in databases being a pain in the
| ass. At least, that's what I'm guessing; I've never tried to do it,
| I've only stored ASCII and stuff that I treat as though it's binary data.
We happily throw Unicode strings through SQLObject and it gives us
nothing but Unicode strings back. Welcome to the new millennium :-)
To do this, we patched DBAPI._executeRetry and the StringValidator class:
    def _executeRetry(self, conn, cursor, query):
        if isinstance(query, unicode):
            query = query.encode('utf8')
        # raise UnicodeError if it is not valid utf8 already
        return cursor.execute(query)

    def fromPython(self, value, state):
        if isinstance(value, unicode):
            return value.encode('utf8')
        return value

    def toPython(self, value, state):
        if isinstance(value, str):
            return value.decode('utf8')
        return value
This of course should really be done in the PostgreSQL driver somewhere
but the above hack is fine for our needs at the moment. And psycopg2
might make it all irrelevant anyway.
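For anyone reading this on a newer Python, here is a minimal sketch of the same round trip as a standalone class. It assumes Python 3, where str is already Unicode and encoded data is bytes. Utf8Validator and its method names are hypothetical, used only for illustration; this is not the SQLObject StringValidator itself.

```python
class Utf8Validator:
    """Hypothetical stand-in for the patched validator (not SQLObject's class)."""

    encoding = "utf-8"

    def from_python(self, value):
        # Text on its way to the database is encoded to UTF-8 bytes.
        if isinstance(value, str):
            return value.encode(self.encoding)
        # Anything else (ints, None, bytes) is passed through untouched.
        return value

    def to_python(self, value):
        # Bytes coming back from the database are decoded to text.
        # Invalid UTF-8 raises UnicodeDecodeError here.
        if isinstance(value, bytes):
            return value.decode(self.encoding)
        return value


validator = Utf8Validator()
stored = validator.from_python("na\u00efve")   # text -> UTF-8 bytes
restored = validator.to_python(stored)         # UTF-8 bytes -> text
```

The point is only that encoding happens at one boundary and decoding at the other, so application code never sees anything but Unicode.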
| I might note when I installed postgres on Debian, it asked me questions
| about encoding. This might imply that encoding is set up in an
| installation-wide (not per-database or per-session) fashion. Then it
| also asked me about how I wanted to format my dates. I answered ISO,
| but what madness would happen if someone selected US format dates? I
| doubt psycopg knows anything about what format date the server is using.
| Maybe these are just defaults, and by explicitly setting up the
| configuration for the connection you can avoid the madness.
PostgreSQL hard codes the locale at initdb time, I believe because the
locale you use for collation order affects index creation and cannot be
changed. It is a PITA, because most people really want to use the C
locale and the only way of reverting is to blow away your data directory
and recreate it.
But the locale has nothing to do with the encoding - you can happily
create databases with whatever encoding you like no matter what locale
you selected at initdb time.
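On explicitly setting up the configuration for the connection: client encoding and date style are ordinary per-session settings in PostgreSQL, so something like the rough sketch below should avoid depending on the server defaults. It assumes a DB-API cursor; configure_session is a hypothetical helper, and RecordingCursor is only a stand-in so the snippet runs without a server.

```python
# Session-level settings; they affect only the current connection.
SESSION_SETTINGS = (
    "SET client_encoding TO 'UTF8'",
    "SET datestyle TO 'ISO'",
)


def configure_session(cursor):
    """Hypothetical helper: pin the encoding and date format for this session."""
    for statement in SESSION_SETTINGS:
        cursor.execute(statement)


class RecordingCursor:
    """Stand-in for a real DB-API cursor; it only records the SQL it is given."""

    def __init__(self):
        self.executed = []

    def execute(self, query):
        self.executed.append(query)


cursor = RecordingCursor()
configure_session(cursor)
```

With a real driver you would pass the cursor from your own connection instead of RecordingCursor.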
Stuart Bishop <stuart.bishop@...> http://www.canonical.com/
Canonical Ltd. http://www.ubuntulinux.com/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
-----END PGP SIGNATURE-----