Re: [Pyobjc-dev] depythonify_c_value rejects non-ascii, non-unicode strings
From: Marc-Antoine P. <map...@ac...> - 2004-01-21 17:20:08
> We had this (short) discussion before:
> http://sourceforge.net/mailarchive/message.php?msg_id=6595522

Thank you for pointing it out; I had not seen it.

> I've come to the conclusion that if the Python program doesn't handle
> all text as unicode, then it's broken. This is really just PyObjC
> telling you to fix your code.

I only partially agree. It is true that internally, a Python program should use unicode all the way; but nobody should force me to use unicode on the output. The case I am raising is that I have a Python program with Latin-1 output, which is picked up by another Python program, which is encoding-agnostic, and transfers it to the bridge. The two programs are totally disconnected, except through I/O, and that I/O may use another encoding.

Now, maybe what you are saying amounts to the suggestion that the second program should know (or be told) about the encoding of the first program's output; and that makes sense. However, there may be cases, such as mine, where it makes sense for the Python program to use encoded (non-unicode) data internally, and not to care about it, and (supposing I know the encoding) I should not have to convert to unicode before calling the bridge at every point. (Granted, in this case, we could convert to unicode at the interface between both programs, but that may not always be the case...)

So let me then make a plea for an API so that a PyObjC program can tell the bridge to use an encoding other than the system default, if specified, even if the default behaviour remains identical, i.e. throw exceptions upon non-ascii strings. That way, only a program that knows what it is doing will modify the behaviour, and no data will be lost by default; but a program that has good architectural reasons to do so might still use another encoding internally.

Marc-Antoine Parent
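
For illustration only, here is a minimal sketch of the behaviour being requested: strict ASCII by default, with an explicit opt-in to another encoding. The class BoundaryDecoder and its method to_text are hypothetical names invented for this sketch; they are not part of PyObjC, and the bridge is not involved at all. The sketch is written for Python 3, where bytes and str play the roles that str and unicode play in the discussion above.

```python
class BoundaryDecoder:
    """Hypothetical helper (not a PyObjC API): turns encoded byte strings
    into text before they are handed to code that expects unicode."""

    def __init__(self, encoding="ascii"):
        # Default stays strict ASCII, so nothing is silently mis-decoded.
        self.encoding = encoding

    def to_text(self, value):
        # Text passes through unchanged.
        if isinstance(value, str):
            return value
        # Strict decoding: raises UnicodeDecodeError on bytes that are
        # invalid in the configured encoding.
        return value.decode(self.encoding)


# Bytes as they might arrive from the first program's Latin-1 output.
latin1_bytes = "café".encode("latin-1")

# Only a program that knows the encoding opts in explicitly.
decoder = BoundaryDecoder("latin-1")
text = decoder.to_text(latin1_bytes)
```

With the default encoding, BoundaryDecoder().to_text(latin1_bytes) raises UnicodeDecodeError, which mirrors the "throw exceptions upon non-ascii strings" default described above.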