From: Jim H. <ftp...@jo...> - 2003-06-20 20:17:31
Hello,

I'm having trouble with Unicode. I have set up the database to use UTF-8.
I am trying to get some non-ASCII strings in there and I am failing.

Here is a short version of the script:
--------------------------------------
#!/usr/bin/python -u
# test unicode support in pyPgSQL
import string;
from pyPgSQL import libpq
from pyPgSQL import PgSQL

dBconnection=PgSQL.connect("::ctanWeb:ftpmaint:")
dBcursor=dBconnection.cursor()
dBcursor.execute("SELECT * FROM authors")
print dBcursor.fetchone() # works fine; I'm connected

# these two also work fine
dBcursor.execute("INSERT INTO authors (name,email) VALUES ('Mike Jones','mik...@ct...')")
dBcursor.execute("INSERT INTO authors (name,email) VALUES (%(name)s,%(email)s)",{'name':'Mike Jones','email':'mik...@ct...'})

# this does not work
dBcursor.execute("INSERT INTO authors (name,email) VALUES ('Mike Schr\xf6der','mik...@ct...')")

# I cannot get to run, even
dBcursor.execute("INSERT INTO authors (name,email) VALUES (%(name)s,%(email)s)",{'name':u'Mike Schr\xf6der','email':u'mik...@ct...'})
-------------------------------------

The line that does not work gives me the error message:

Traceback (most recent call last):
  File "./test_uni.py", line 25, in ?
    dBcursor.execute("INSERT INTO authors (name,email) VALUES ('Mike Schr\xf6der','mik...@ct...')")
  File "/usr/lib/python2.2/site-packages/pyPgSQL/PgSQL.py", line 2956, in execute
    raise OperationalError, msg
libpq.OperationalError: ERROR: Unicode >= 0x10000 is not supoorted

I cannot make it out: is that a PostgreSQL objection or does it come from
PyPgSQL? (By the way, the 'supoorted' is not me.)

So my questions are:
(1) Is Pg unable to take this character?
(2) What is the right format for a dictionary substitution? I can't get
    the final line past an ASCII encoding error: ordinal not in range.

Thanks for any help at all; I tried looking in the archives of this list
but I had no luck,
Jim Hefferon
From: <gh...@gh...> - 2003-06-27 05:00:37
Jim Hefferon wrote:
> Hello,
>
> I'm having trouble with Unicode. I have set up the database to use UTF-8.
> I am trying to get some non-ASCII strings in there and I am failing.
>
> Here is a short version of the script:
> --------------------------------------
> #!/usr/bin/python -u
> # test unicode support in pyPgSQL
> import string;
> from pyPgSQL import libpq
> from pyPgSQL import PgSQL
>
> dBconnection=PgSQL.connect("::ctanWeb:ftpmaint:")

First problem: You need to tell *pyPgSQL*, which client encoding to use. Use
the parameter client_encoding="utf-8". To also get back Unicode strings for
text columns, use the parameter unicode_results=1.

> dBcursor=dBconnection.cursor()

Second problem: you need to tell the PostgreSQL backend in which client
encoding you send your data and want it back:

    dBcursor.execute("set client_encoding to unicode")

> dBcursor.execute("SELECT * FROM authors")
> print dBcursor.fetchone() # works fine; I'm connected
>
> # these two also work fine
> dBcursor.execute("INSERT INTO authors (name,email) VALUES ('Mike Jones','mik...@ct...')")
> dBcursor.execute("INSERT INTO authors (name,email) VALUES (%(name)s,%(email)s)",{'name':'Mike Jones','email':'mik...@ct...'})
>
> # this does not work
> dBcursor.execute("INSERT INTO authors (name,email) VALUES ('Mike Schr\xf6der','mik...@ct...')")

This cannot work, because it is not a valid UTF-8 string. Consider:

    ISO-8859-1: 'Mike Schröder'
    UTF-8:      'Mike Schr\xc3\xb6der'

> # I cannot get to run, even
> dBcursor.execute("INSERT INTO authors (name,email) VALUES (%(name)s,%(email)s)",{'name':u'Mike Schr\xf6der','email':u'mik...@ct...'})

This, however, will work with my above adjustments.

> [...]

-- Gerhard
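The byte sequences compared above can be checked directly. This is a minimal sketch in Python 3 syntax (the thread itself used Python 2.2, where 'Mike Schr\xf6der' was a byte string), and it uses only the standard codecs, not pyPgSQL. It shows that the Latin-1 bytes are not valid UTF-8, and that the byte 0xF6 has the bit pattern 11110xxx, which is the lead byte of a 4-byte sequence in UTF-8 as it was originally defined. That is a plausible reading of why the backend complained about code points >= 0x10000, but the thread does not confirm it.

```python
# Python 3 sketch; the variable names are illustrative only.
name = "Mike Schr\xf6der"            # text string; U+00F6 is 'ö'

latin1_bytes = name.encode("iso-8859-1")
utf8_bytes = name.encode("utf-8")

print(latin1_bytes)   # b'Mike Schr\xf6der'
print(utf8_bytes)     # b'Mike Schr\xc3\xb6der'

# The Latin-1 byte sequence cannot be decoded as UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
print(latin1_is_valid_utf8)   # False

# 0xF6 is 0b11110110: its top five bits match the 4-byte lead pattern 11110.
looks_like_4_byte_lead = (0xF6 >> 3) == 0b11110
print(looks_like_4_byte_lead)  # True
```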
From: Karsten H. <Kar...@gm...> - 2003-06-27 10:20:23
Billy,

you stated that ; is a statement _separator_ in SQL which
makes sense. However, this:

 http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=sql-syntax.html#SQL-SYNTAX-LEXICAL
 (1.1.4. Special Characters, 5th bullet)

suggests otherwise. Given PGs track record of being (sanely)
anal about SQL conformity I wonder what to make of it. I
certainly am no expert on such matters.

Thanks for your help,
Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
From: Billy G. A. <bil...@mu...> - 2003-06-28 02:54:10
Karsten Hilbert wrote:

>Billy,
>
>you stated that ; is a statement _separator_ in SQL which
>makes sense. However, this:
>
> http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=sql-syntax.html#SQL-SYNTAX-LEXICAL
> (1.1.4. Special Characters, 5th bullet)
>
>suggests otherwise. Given PGs track record of being (sanely)
>anal about SQL conformity I wonder what to make of it. I
>certainly am no expert on such matters.
>
>Thanks for your help,
>Karsten
>

First, I didn't say that, someone else on the list did ;-)

The point you're referring to states that a semi-colon is a command
terminator. Multiple commands can be entered (even on the same line),
terminated (separated) by a semi-colon. Commands are also terminated by the
end of the input stream, which is why a semi-colon is not needed in the
query given to the execute() method if there is only one command in the
query. Note that adding the semi-colon to a single-command query does not
cause any problems; it just terminates the one command.

I guess what I'm trying to say is that if you equate 'statement' to
'command' and 'separator' to 'terminator', then there is no conflict.
-- 
Billy G. Allie     | Domain....: Bil...@mu...
7436 Hartwell      | MSN.......: B_G...@em...
Dearborn, MI 48126 |
(313) 582-1540     |
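The point about a trailing semi-colon being optional for a single command can be tried out with any DB-API module. The sketch below uses Python's built-in sqlite3 as a stand-in, so it says nothing authoritative about PostgreSQL or pyPgSQL; it only illustrates the same idea that the end of the input terminates the command just as a semi-colon does. The table layout and the e-mail addresses are made-up placeholders.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for a real backend.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE authors (name TEXT, email TEXT)")

# Single command without a terminating semi-colon.
cur.execute("INSERT INTO authors (name, email) "
            "VALUES ('Mike Jones', 'mike@example.org')")

# The same kind of command with a terminating semi-colon; also accepted.
cur.execute("INSERT INTO authors (name, email) "
            "VALUES ('Ann Smith', 'ann@example.org');")

row_count = cur.execute("SELECT count(*) FROM authors").fetchone()[0]
print(row_count)  # 2
conn.close()
```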
From: Karsten H. <Kar...@gm...> - 2003-07-24 22:40:21
Hello Gerhard,

>> I'm having trouble with Unicode. I have set up the database to use UTF-8.
>> I am trying to get some non-ASCII strings in there and I am failing.

>> dBconnection=PgSQL.connect("::ctanWeb:ftpmaint:")
> You need to tell *pyPgSQL*, which client encoding to use. Use the
> parameter client_encoding="utf-8".

Is this also necessary if I use a client encoding of, say,
"latin1" or "iso-8859-15" or some such ?

What I mean is: It makes sense to tell pyPgSQL that my Python
code uses "utf-8" in strings so it knows that it has to deal
with u''-strings rather than ''-strings but why would I need
this if I use latin1 in ''-strings ? In fact, pyPgSQL might be
able to distinguish u'' from ''-strings by a type check (u''
should return "unicode", shouldn't it ?). So I wonder why I
need to tell *pyPgSQL* what encoding I use ? (I understand why
I need to tell PostgreSQL about that.) I would be thinking
that pyPgSQL just hands its input to libpq. Or does it need to
know about the encoding so it knows how to properly
quote/escape input ?

> To also get back Unicode strings for
> text columns, use the parameter unicode_results=1.

That makes sense in that I will get back u'' strings instead
of ''-strings.

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
From: <gh...@gh...> - 2003-07-24 23:29:33
Karsten Hilbert wrote:
> Hello Gerhard,
>
>>>I'm having trouble with Unicode. I have set up the database to use UTF-8.
>>>I am trying to get some non-ASCII strings in there and I am failing.
>
>
>>>dBconnection=PgSQL.connect("::ctanWeb:ftpmaint:")
>>
>>You need to tell *pyPgSQL*, which client encoding to use. Use the
>>parameter client_encoding="utf-8".
>
> Is this also necessary if I use a client encoding of, say,
> "latin1" or "iso-8859-15" or some such? [...]

If in your Python code you use Unicode strings, you need to:
a) use client_encoding parameter in connect() call
b) tell the PostgreSQL backend with "SET CLIENT_ENCODING TO ..."

If in your Python code you use only byte strings, you need to:
- tell the PostgreSQL backend with "SET CLIENT_ENCODING TO ..."

All clear now? Or should I explain in more detail?

-- Gerhard
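One way to picture the two cases above: a text (Unicode) string has to be turned into bytes before it can go over the wire, and that step needs an encoding, whereas a byte string can be passed through as it is. The following is only a sketch of that idea in Python 3 syntax. The helper `to_wire` is hypothetical and is not pyPgSQL's actual implementation; the assumption that a missing encoding falls back to ASCII is likewise just a guess at where the "ordinal not in range" error from the original post comes from.

```python
def to_wire(value, client_encoding):
    """Hypothetical helper: return the bytes a driver might send."""
    if isinstance(value, str):
        # Text must be encoded; this is where the client encoding is needed.
        return value.encode(client_encoding)
    # Byte strings are already encoded and pass through untouched.
    return value

utf8_wire = to_wire("Mike Schr\xf6der", "utf-8")
latin1_wire = to_wire("Mike Schr\xf6der", "iso-8859-1")
raw_wire = to_wire(b"Mike Schr\xc3\xb6der", "utf-8")

# Assumed fallback: with plain ASCII the non-ASCII character cannot be encoded.
try:
    to_wire("Mike Schr\xf6der", "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(utf8_wire)     # b'Mike Schr\xc3\xb6der'
print(latin1_wire)   # b'Mike Schr\xf6der'
print(raw_wire)      # b'Mike Schr\xc3\xb6der'
print(ascii_failed)  # True
```

In both cases the backend still has to be told, with SET CLIENT_ENCODING, how to interpret the bytes it receives; the sketch covers only the client side.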
From: Karsten H. <Kar...@gm...> - 2003-07-25 07:20:22
> If in your Python code you use Unicode strings, you need to:
> a) use client_encoding parameter in connect() call
> b) tell the PostgreSQL backend with "SET CLIENT_ENCODING TO ..."
>
> if in your Python code you use only byte strings, you need to:
> - tell the PostgreSQL backend with "SET CLIENT_ENCODING TO ..."
>
> All clear now?

Yes. Thank you. I will now try to understand the *reason* by
reading the source.

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346