Re: [cx-oracle-users] Unicode issue

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Am 15.01.2012 03:57, schrieb Anthony Tuininga:
> See below.
>
> On Sat, Jan 14, 2012 at 2:17 AM, Uwe Hoffmann<uw...@fa...>  wrote:
>> Hi Anthony,
>>
>>   see below.
>>
>> Regards
>>      Uwe
>>
>> Am 13.01.2012 23:37, schrieb Anthony Tuininga:
>>> Hi,
>>>
>>> I just had a chance to test this out. This is "normal" and "expected"
>>> behavior -- not what you want but that's a different thing!
>>
>> ;-)
>>
>>
>>>> I tested
>>> this with nvarchar2 instead of varchar2 and in that situation *both*
>>> string and unicode parameters work equally well.
>>
>> Yup , I had expected this. But that's not a very practical solution.
>>   From my view of a *client* developer that means: Using unicode
>> with cx_Oracle is tough (For example you can use custom Cursor classes
>> making a parameter conversion from unicode to str on the fly).
>> You have to touch every application when moving from 5.0.4 to 5.1.x.
>
> Well, there are other options that can "solve" this problem which I'll
> get to in a minute. But your argument loses a lot of force when you
> realize that you are essentially saying "I want to use Unicode in my
> client throughout but *NOT* use Unicode in my database". If you used
> Unicode in your database, your client would be perfectly happy. So
> using Unicode is "tough" only when you want to only go halfway. :-)
>

You are right, but unfortunately the world is not ideal. E.g. in my 
environment I have n databases and m clients. I can not enforce a
"only use nvchar2 attributes". So that means I'm stuck with "normal"
strings. But I think that's a normal environment for many people.
(OT: if you choose the wrong database characterset  (e.g.utf16) 
nvarchar2 will double your database size in the "nearly all
  characters are ascii" scenario. I only want to say that there are
many dependencies).

> Fortunately, cx_Oracle does have a method for handling this situation
> so you don't have to write a custom cursor class and manipulate all of
> the parameters on the fly. You can use connection wide or cursor
> specific input and output type handlers. These allow you to manipulate
> the variables passed between Oracle and Python quite conveniently. I
> use this, for example, to turn all of my strings in the database into
> Unicode. The same can be done for converting all unicode bind
> variables to varchar2 type bind variables instead. Oracle does the
> conversion itself on the fly.
>
> def OutputTypeHandler(cursor, name, defaultType, size, precision, scale):
>      if defaultType in (cx_Oracle.STRING, cx_Oracle.FIXED_CHAR):
>          return cursor.var(unicode, size, cursor.arraysize)
>
> def InputTypeHandler(cursor, value, numElements):
>      if isinstance(value, unicode):
>          return cursor.var(str, arraysize = numElements)
>
> Then you can do the following
>
> connection.outputtypehandler = OutputTypeHandler
> connection.inputtypehandler = InputTypeHandler
>
> That should cover everything for you quite nicely and make Unicode
> "easy" again. :-)

yes. (maybe with the exception of the casual user here)

>
>> As a side note: What does this mean (performance wise) for python3.x
>> and varchar2 attributes. I haven't tested this.
>
> Well, everything is converted to string, as it turns out, so there is
> no performance problem as Oracle is perfectly fine with accepting
> string and converting it to Unicode but the reverse is not true. That
> suggests, of course, that I should be doing the same thing in both
> versions of Python but to be perfectly honest it never really crossed
> my mind. :-) I'll give it some thought.
>

that will also solve the "problems" of the casual user. The question is:
Are there any overfluous (character) conversions on the client (or the 
server) side.

regards
   Uwe

>> You can also add this
>>> line
>>>
>>> cursor.setinputsizes(P1 = str)
>>>
>>> That will force Oracle to treat it as a string (instead of Unicode)
>>> and then all is well as well.
>>>
>> If you have large applications with many queries it's also impractical
>> to touch every query.
>>
>>
>>> Hope that helps.
>>
>>>
>>> Anthony
>>>
>>> 2011/12/6 Brian Watson<bw...@ma...>:
>>>> Hello everyone,
>>>>
>>>> I'm referencing bug 3442096:
>>>>
>>>> https://sourceforge.net/tracker/index.php?func=detail&aid=3442096&group_id=84168&atid=571972
>>>>
>>>> We just updated a webapp from SQLAlchemy 0.4 to 0.7.3.  Once we went live,
>>>> the server slowed to a crawl.  We put them back on their old server while we
>>>> diagnosed.  I hadn't noticed before, but the new orm emits unicode
>>>> parameters.  The unicode version is much slower:
>>>>
>>>> 2011-12-06 14:12:24,439 INFO sqlalchemy.engine.base.Engine SELECT
>>>> [......CUT......]
>>>> FROM demp_inventory
>>>> WHERE demp_inventory.inventory_snsn = :inventory_snsn_1 AND
>>>> demp_inventory.inventory_slocation = :inventory_slocation_1 AND
>>>> demp_inventory.inventory_sentityid = :inventory_sentityid_1 AND
>>>> demp_inventory.inventory_scc = :inventory_scc_1 AND
>>>> (demp_inventory.inventory_sshelflifeid IS NULL OR
>>>> demp_inventory.inventory_sshelflifeid = :inventory_sshelflifeid_1) AND
>>>> (demp_inventory.inventory_sserialno IS NULL OR
>>>> demp_inventory.inventory_sserialno = :inventory_sserialno_1))
>>>> WHERE ROWNUM<= :ROWNUM_1
>>>> 2011-12-06 14:12:24,439 INFO sqlalchemy.engine.base.Engine
>>>> {'inventory_snsn_1': u'0000LLCQA7546', 'inventory_slocation_1':
>>>> u'JNK1234114', 'inventory_sserialno_1': '', 'ROWNUM_1': 1,
>>>> 'inventory_sentityid_1': u'bae9b856-d874-11df-89f4-0022191e6cbe',
>>>> 'inventory_sshelflifeid_1': '', 'inventory_scc_1': u'A'}
>>>>
>>>> 0.419656 seconds to execute
>>>>
>>>>
>>>> 2011-12-06 13:59:15,882 INFO sqlalchemy.engine.base.Engine SELECT
>>>> [......CUT......]
>>>> FROM demp_inventory
>>>> WHERE demp_inventory.inventory_snsn = :inventory_snsn_1 AND
>>>> demp_inventory.inventory_slocation = :inventory_slocation_1 AND
>>>> demp_inventory.inventory_sentityid = :inventory_sentityid_1 AND
>>>> demp_inventory.inventory_scc = :inventory_scc_1 AND
>>>> (demp_inventory.inventory_sshelflifeid IS NULL OR
>>>> demp_inventory.inventory_sshelflifeid = :inventory_sshelflifeid_1) AND
>>>> (demp_inventory.inventory_sserialno IS NULL OR
>>>> demp_inventory.inventory_sserialno = :inventory_sserialno_1))
>>>> WHERE ROWNUM<= :ROWNUM_1
>>>> 2011-12-06 13:59:15,882 INFO sqlalchemy.engine.base.Engine
>>>> {'inventory_snsn_1': '0000LLCQA7546', 'inventory_slocation_1': 'JNK1234114',
>>>> 'inventory_sserialno_1': '', 'ROWNUM_1': 1, 'inventory_sentityid_1':
>>>> 'bae9b856-d874-11df-89f4-0022191e6cbe', 'inventory_sshelflifeid_1': '',
>>>> 'inventory_scc_1': 'A'}
>>>>
>>>> 0.014337 seconds to execute
>>>>
>>>> I've tested on cx_Oracle 5.0.3, 5.0.4, and 5.1.1 with identical results.
>>>>    I've learned my lesson on deployment, and I'll never do another rollout
>>>> without performance testing first.  I've tried various fixes so far with no
>>>> luck.  I've downloaded the source to try and diagnose my own problem, but
>>>> thats easier said than done.  I'm trying to get sqla to nix unicode params.
>>>>    If anyone has any suggestions or things to try, PLEASE let me know.  We use
>>>> cx in production and would buy support or donate to solve our issue.
>>>>
>>>> Many thanks,
>>>> Brian Watson