Re: [Pyobjc-dev] depythonify_c_value rejects non-ascii, non-unicode strings


> We had this (short) discussion before:
> http://sourceforge.net/mailarchive/message.php?msg_id=6595522

Thank you for pointing it out; I had not seen it.

> I've come to the conclusion that if the Python program doesn't handle 
> all text as unicode, then it's broken.  This is really just PyObjC 
> telling you to fix your code.

I only partially agree. It is true that internally, a Python program 
should use unicode all the way; but nobody should force me to use 
unicode on the output. The case I am raising is that I have a Python 
program with Latin-1 output, which is picked up by another Python 
program, which is encoding-agnostic, and transfers it to the bridge. 
The two programs are totally disconnected, except through I/O, and that 
I/O may use another encoding.

Now, maybe what you are saying amounts to the suggestion that the 
second program should know (or be told) about the encoding of the first 
program's output; and that makes sense. However, there may be cases, 
such as mine, where it makes sense for the Python program to use 
encoded (non-unicode) data internally, and not to care about it, and 
(supposing I know the encoding) I should not have to convert to unicode 
before calling the bridge at every point.
(Granted, in this case we could convert to unicode at the interface 
between the two programs, but that may not always be possible...)

So let me make a plea for an API that lets a PyObjC program tell the 
bridge to use an encoding other than the system default. The default 
behaviour could remain unchanged, i.e. raise an exception on non-ASCII 
strings.
That way, only a program that knows what it is doing will modify the 
behaviour, and no data will be lost by default; but a program that has 
good architectural reasons to do so might still use another encoding 
internally.
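
To make the request concrete, here is a rough sketch of the kind of 
opt-in setting I have in mind. It is only an illustration: the names 
set_bridge_encoding and to_bridge_text are hypothetical and do not exist 
in PyObjC, and the sketch is written for Python 3, where byte strings 
are bytes and unicode text is str.

```python
import codecs

# Hypothetical sketch: none of these names are part of PyObjC.
# Default is strict ASCII, so non-ASCII byte strings are rejected.
_bridge_encoding = "ascii"


def set_bridge_encoding(name):
    """Opt in to another encoding for byte strings passed to the bridge."""
    global _bridge_encoding
    codecs.lookup(name)  # raises LookupError if the encoding is unknown
    _bridge_encoding = name


def to_bridge_text(value):
    """Return unicode text for the bridge, decoding byte strings strictly."""
    if isinstance(value, str):
        return value  # already unicode, nothing to do
    if isinstance(value, bytes):
        # Strict decoding: bytes invalid in the configured encoding raise
        # UnicodeDecodeError instead of being silently dropped or replaced.
        return value.decode(_bridge_encoding)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)
```

With the default left alone, to_bridge_text(b"caf\xe9") raises 
UnicodeDecodeError, as the bridge does today; only after an explicit 
set_bridge_encoding("latin-1") is the same byte string accepted.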

Marc-Antoine Parent