[cx-oracle-users] Unicode problems with Python 3 and Cx_Oracle 5.1.2

Hi,

I'm having problems getting cx_Oracle to work with Unicode under
Python 3.

The setup:

Python 3.3.2
Oracle Server: 11g
Client library: 11_2_03

NLS_LANG: American_America.UTF8
NLS_CHARACTERSET: WE8MSWIN1252
NLS_NCHAR_CHARACTERSET: AL16UTF16

The db table contains both varchar2 and nvarchar2 columns.

I can use SQL*Plus to read and write Unicode strings in the nvarchar2
columns without any problems. In cx_Oracle, however, the same
statements result in a conversion loss, and the "too big" chars are
replaced with ¿ both in the data read and in the data inserted. Since
things work with SQL*Plus, I assume that the server/client lib side
is fine and that I missed something in how I use cx_Oracle.
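
A minimal sketch of how the loss can be checked from Python, assuming
a scratch table named unicode_test with a single nvarchar2 column
named ntext (both names are hypothetical, as is the helper itself).
It only uses plain DB-API calls with a named bind variable, so it
should work with any cursor that accepts that bind style:

```python
def roundtrip_survives(cursor, text, table="unicode_test", column="ntext"):
    """Insert text, read it back, and report whether it came back unchanged.

    The table and column names are hypothetical placeholders. The table
    is emptied first, so only point this at a scratch table.
    """
    cursor.execute("delete from {0}".format(table))
    cursor.execute(
        "insert into {0} ({1}) values (:value)".format(table, column),
        {"value": text},
    )
    cursor.execute("select {0} from {1}".format(column, table))
    (stored,) = cursor.fetchone()
    return stored == text
```

With the setup above I would expect this to return False for any
string containing characters outside WE8MSWIN1252, and True for
plain ASCII.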

Googling the issue, I found this in the Oracle Unicode guidelines:

http://docs.oracle.com/cd/B28359_01/server.111/b28298/ch7progrunicode.htm#i1006452.

"When you bind or define SQL NCHAR datatypes and do not set
OCI_ATTR_CHARSET_FORM, data conversions take place from client
character set to the database character set, and from the database
character set to the national database character set. In the worst
case, data loss can occur if the database character set is smaller
than the client's."

This seems to describe the problem I see spot on, i.e. that the
string passes through the database charset (WE8MSWIN1252) during
the conversion between client UTF8 and server UTF16.
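
The effect of that intermediate step can be approximated in pure
Python. This is only a sketch: Python's cp1252 codec stands in for
WE8MSWIN1252, and the error handler name "oracle_like" and the helper
function are made up for illustration. It is not Oracle's actual
conversion code.

```python
import codecs


def oracle_like_replace(error):
    # Substitute an inverted question mark for each character that the
    # target charset cannot represent, similar to what I see in the data.
    if isinstance(error, UnicodeEncodeError):
        return ("\u00bf" * (error.end - error.start), error.end)
    raise error


codecs.register_error("oracle_like", oracle_like_replace)


def via_database_charset(text, db_charset="cp1252"):
    """Push text through the database charset and back to a str."""
    return text.encode(db_charset, errors="oracle_like").decode(db_charset)
```

Characters that exist in the database charset survive the trip, while
everything else collapses to ¿, which matches the symptom described
above.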

In the Python 2.x code path it looks like OCI_ATTR_CHARSET_FORM is
explicitly set depending on the database column type, but I can't find
any similar code for the Python 3 code path. A quick hack that
actually sets this attribute for strings seems to confirm that this is
the problem: with it, the values show up nicely when making queries.

So is this setup not supported, is there anything I missed, and is
there any way to get around this when the database charset is not
Unicode?

Thanks
Joakim