Re: [cx-oracle-users] Antw: Bad conversion of a unicode value?

Amaury Forgeot d'Arc wrote:
> matilda matilda wrote:
>>>>> Michael Schlenker <ms...@co...> 28.11.2007 16:31 >>>
>>> Michael Schlenker wrote:
>>> Okay, I got my test to work after patching cx_Oracle a little bit.
>> Anthony will be happy to hear that.  ;-) Anthony: Are you still here?
>>
>>> From taking a closer look at the code, Unicode support is at best described as
>>> 'rudimentary'; lots of fine points are still missing in there.
>> I'm sure Anthony will agree. Especially with the upcoming Py3000 there will
>> be many questions to answer regarding byte streams, unicode streams, character set
>> conversion (implicit/explicit), character representation.
>>
>> See the change history to see when Anthony started to focus on character set
>> conversion.
>>
>> Amaury Forgeot d'Arc, who also gives valuable input, is probably also interested
>> in that topic, since he speaks and writes a language with many special characters.
> 
> I did indeed propose a patch a year ago to support unicode.
> It was against version 4.2.1; I attach it again in the hope it can be useful.

Looks good. The minimal stuff I did goes a similar way, but I didn't use UTF-16 yet,
so there might be buffers with problems due to UTF-8's variable length...
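
To illustrate the buffer concern, here is a minimal sketch in plain Python 3. It makes no cx_Oracle or OCI calls, and the helper name encoded_sizes is made up for this example. It only shows that the number of bytes a string needs differs from its character count, and differs again between UTF-8 and UTF-16, so a buffer sized by character count can come out too small.

```python
def encoded_sizes(text):
    """Return (characters, UTF-8 bytes, UTF-16 bytes) for a string."""
    return (
        len(text),                      # number of code points
        len(text.encode("utf-8")),      # 1 to 4 bytes per code point
        len(text.encode("utf-16-le")),  # explicit byte order, so no BOM is added
    )


if __name__ == "__main__":
    for sample in ("abc", "äöü", "€uro"):
        print(sample, encoded_sizes(sample))
```

A buffer sized for three bytes holds "abc" in UTF-8 but not "äöü", which needs six.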

I'll try to use your stuff with a recent cx_Oracle if I find the time.

And yes, it will break with UCS-4 builds of Python... That is easy to fix, though, if
one uses AL32UTF8 instead of the UTF-16 code and converts on read. That also makes the
code immune to possible big-endian vs. little-endian problems (although I assume
those are handled by OCI for UTF-16 anyway). But surrogates and the astral planes
are treacherous ground anyway, so if the BMP works for a start, that's nice.
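
A rough sketch of the surrogate and byte-order points, again in plain Python 3 with no database involved and with hypothetical helper names. It assumes a Python where a str is a sequence of code points, so the UCS-2 vs. UCS-4 build distinction no longer shows up at this level. It is meant only to show why UTF-16 needs care outside the BMP and why UTF-8 sidesteps the endianness question.

```python
def utf16_code_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def is_bmp_only(text):
    """True if every character lies in the Basic Multilingual Plane."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def byte_orders(text):
    """Return the UTF-16 little-endian, UTF-16 big-endian and UTF-8 encodings."""
    return (
        text.encode("utf-16-le"),
        text.encode("utf-16-be"),
        text.encode("utf-8"),  # a byte stream, so there is no byte order to get wrong
    )


if __name__ == "__main__":
    clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP
    print(is_bmp_only("é"), utf16_code_units("é"))
    print(is_bmp_only(clef), utf16_code_units(clef))
    print(byte_orders("A"))
```

The G clef is one character but two UTF-16 code units (a surrogate pair), which is where code that treats one code unit as one character goes wrong.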

Michael