[cx-oracle-users] Issue with UTF-8 encoding
From: Guilherme M. <gui...@gm...> - 2010-11-17 16:59:08
Hi,

I am having an encoding issue using Python 2.6.2 and cx_Oracle 5.0.4 to access an Oracle 11g database. I have set the environment variable NLS_LANG=.AL32UTF8, and my table in Oracle is correctly configured to accept Unicode.

I compiled cx_Oracle without the WITH_UNICODE flag and passed unicode() objects to cx_Oracle. Everything worked without exceptions or warnings. However, Oracle would sometimes complain that the string I was trying to insert into a VARCHAR2 field was too big (> 4000), even when the encoded size of the string, len(the_string.encode('utf-8')), was only about 2300 bytes.

I used a sniffer to verify that the Oracle client was sending two bytes for every character (even the ASCII ones), instead of sending two bytes only for special characters.

It seems to me that, when the WITH_UNICODE flag is unset, cx_Oracle accepts unicode() objects but does not encode() them to the encoding set in the NLS_LANG variable. Instead, it just sends Oracle the internal representation of the unicode() object.

Is this behaviour expected? Am I doing something wrong?

Regards,
Guilherme.
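
To make the size discrepancy concrete, here is a minimal sketch of the arithmetic. It assumes the two-bytes-per-character data seen on the wire is equivalent to UTF-16 without a byte order mark, which the sniffer output suggests but does not prove. The sample string and the helper name encoded_sizes are made up for illustration and do not come from cx_Oracle.

```python
# -*- coding: utf-8 -*-
VARCHAR2_LIMIT = 4000  # the limit from the "too big (> 4000)" error above


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for a unicode string."""
    utf8_size = len(text.encode('utf-8'))
    # 'utf-16-le' writes no BOM, so each BMP character takes exactly 2 bytes.
    # Assumption: this approximates the two-byte form seen by the sniffer.
    utf16_size = len(text.encode('utf-16-le'))
    return utf8_size, utf16_size


# Hypothetical sample: 2200 ASCII characters plus 50 accented ones.
sample = u'a' * 2200 + u'\u00e7' * 50
utf8_size, utf16_size = encoded_sizes(sample)

print("UTF-8:  %d bytes, over limit: %s" % (utf8_size, utf8_size > VARCHAR2_LIMIT))
print("UTF-16: %d bytes, over limit: %s" % (utf16_size, utf16_size > VARCHAR2_LIMIT))
```

With this sample the UTF-8 form is 2300 bytes and fits, while the two-byte form is 4500 bytes and exceeds the limit, which would match the error I am seeing.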