Re: [cx-oracle-users] Antw: Bad conversion of a unicode value?
From: Anthony T. <ant...@gm...> - 2007-11-29 18:37:21
On Nov 28, 2007 9:28 AM, Michael Schlenker <ms...@co...> wrote:
> matilda matilda wrote:
> >>>> Michael Schlenker <ms...@co...> 28.11.2007 16:31 >>>
> >
> >> From taking a closer look at the code Unicode support is at best to be described as
> >> 'rudimentary', lots of fine points still missing in there.
> >
> > I'm sure Anthony will agree. Especially with the upcoming Py3000 there will
> > be many questions to answer regarding byte-streams, unicode-streams, characterset
> > conversion (implicit/explicit), character representation.
>
> I thought Py3k should solve those questions ;-)...
>
> I usually write Tcl where unicode just works and you don't step into the deep
> morass that Python string/unicode dichotomy is all the time. Basically it uses
> a model very close to the one now used by Py3k, but got it stable some years ago
> (around Tcl 8.2).
>
> For cx_Oracle i would like to see maximum ease of use, e.g. expunge the evil
> NLS_LANG dependency as it should never ever be needed if you're using Python
> (case 1: non unicode db encoding -> let cx_Oracle convert to python system encoding
> internally or to unicode, and get happy, case 2: unicode db encoding -> let cx_Oracle
> just eat Python Unicode strings and spit them out).

Not everyone is going to want to switch wholesale to unicode immediately -- the amount of pain that would cause is considerable. I would agree that long term, that is the way to go, but some intermediate steps should probably be taken. I am thinking that it might work to do the following:

Add a connection constructor parameter that specifies that __all__ strings should be returned as unicode objects, not string objects. This parameter would default to False to get the current behavior -- string objects returned (fixing whatever is causing the problem right now). It would likely be reasonable to return unicode for nvarchar2 data regardless of this setting but I'd appreciate some feedback on that.
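
To make the proposal concrete, here is a minimal sketch of the decision logic only, not of any driver code. It is written in present-day Python, where `bytes` plays the role of the old string object and `str` the role of `unicode`. The function name `convert_fetched_value`, the `return_unicode` flag, and the two type-name sets are made up for illustration; nothing here is cx_Oracle's actual API, and a real driver would make this choice while defining the fetch buffers rather than per value.

```python
# Hypothetical sketch: none of these names exist in cx_Oracle.
CHAR_TYPES = {"VARCHAR2", "CHAR"}      # assumed: database character set columns
NCHAR_TYPES = {"NVARCHAR2", "NCHAR"}   # assumed: national character set columns


def convert_fetched_value(raw, column_type, encoding, return_unicode=False):
    """Decide how a fetched character value is handed back to the caller.

    raw            -- bytes as received from the client library
    column_type    -- Oracle type name of the column (assumed to be known)
    encoding       -- Python codec name matching the client character set
    return_unicode -- the proposed connection-level flag, default False
    """
    if column_type in NCHAR_TYPES:
        # Hybrid part of the proposal: national character data is
        # decoded regardless of the flag.
        return raw.decode(encoding)
    if column_type in CHAR_TYPES and return_unicode:
        # Flag set: every string column comes back decoded.
        return raw.decode(encoding)
    # Default: current behavior, the encoded string is returned unchanged.
    return raw


raw = "Gr\u00fc\u00dfe".encode("utf-8")
print(type(convert_fetched_value(raw, "VARCHAR2", "utf-8")).__name__)
print(type(convert_fetched_value(raw, "VARCHAR2", "utf-8", return_unicode=True)).__name__)
print(type(convert_fetched_value(raw, "NVARCHAR2", "utf-8")).__name__)
```

Keeping the flag False by default means existing code sees no change, which is the point of the intermediate step.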

I've also considered some setting that would allow you to specify unicode for certain columns but perhaps it would be better to go all unicode or the hybrid I suggested above, and not confuse matters by adding that capability. Thoughts?

> There are some pitfalls for literal SQL in the path, but i think you can write
> the interface in a way that you don't have to worry about all this stupid encoding
> conversion stuff forced upon us by Oracle, Python and other forces...

What are you referring to about literal SQL in the path?

> >> With that patch (against the cx_Oracle-4.3.3.zip file) at least my test runs through
> >> cleanly when i set the right environment. One surely can do better....
> >
> > Can you easily enhance the tests in the test directory to unit test the
> > charset conversion cases?
>
> Haven't looked there yet. But maybe i can rip something from my nosetest files when done.

It doesn't have to be pretty, just work. I can pretty it up as long as I know what you are trying to accomplish or can ask you questions, of course. :-)

> Basically i need to first get permission to invest more time in these issues with cx_Oracle,
> it could happen that my manager decides to go a different route... :-(.

Understood. Let us know.

Anthony