Re: [Pyobjc-dev] depythonify_c_value rejects non-ascii, non-unicode strings

On Jan 21, 2004, at 11:24 AM, Marc-Antoine Parent wrote:

> I am writing some Python code that has to output Latin-1 text.
> Some of that output makes its way through other (Python) code to a 
> text widget through insertText_. The other code does not know about my 
> encoding choice, as it is not my code, but Glenn Andreas' PyOxide IDE; 
> it should not know about encoding. So it simply passes along my 
> Latin-1 strings to the insertText_ method of a text widget, where the 
> PyObjC bridge tries to make it into an NSString.

We had this (short) discussion before:
http://sourceforge.net/mailarchive/message.php?msg_id=6595522

I've come to the conclusion that if the Python program doesn't handle 
all text as unicode, then it's broken.  This is really just PyObjC 
telling you to fix your code.

Here are some important snippets that helped me reach this conclusion:

[Just van Rossum]
  Strongly disagree. This leads to silent errors, possibly even data
  loss.
  You _have_ to know the encoding, and you _have_ to deal with it. If
  there's no way you can know the encoding, you have to explicitly tell
  which encoding or behavior to use.

  Btw, it's not so much PyObjC's behavior as Python's default str ->
  unicode coercion behavior. Perhaps it's "fixable" in the bridge, but I
  think it's a bad idea to deviate from Python's behavior (in addition to
  that I find it a bad idea to begin with).
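
To make the coercion point concrete, here is a minimal sketch. It is an illustration only, not code from PyObjC, and it assumes the default codec is ASCII, as it was in Python at the time. The helper name decode_as_ascii is hypothetical.

```python
# Latin-1 encoded bytes for "cafe" with an e-acute; 0xE9 is outside ASCII.
latin1_bytes = b"caf\xe9"

def decode_as_ascii(data):
    """Mimic a strict ASCII decode; return None when it is rejected."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError:
        return None

# The non-ASCII byte makes the decode fail instead of guessing an encoding.
ascii_result = decode_as_ascii(latin1_bytes)

# Pure ASCII input decodes without trouble.
plain_result = decode_as_ascii(b"cafe")
```

The failure is deliberate: nothing in the byte string says it is Latin-1, so a strict decode refuses to guess.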

[Ronald Oussoren]
  BTW. You should convert all input to unicode instead of waiting for
  problems with the implicit conversion to unicode that is performed by
  PyObjC. You're more likely to know the right encoding while reading the
  data.
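
A rough sketch of that advice, assuming the input is known to be Latin-1 at the point where it is read. The function read_text and the in-memory stream are made up for illustration; they are not part of PyObjC or PyOxide.

```python
import io

def read_text(stream, encoding):
    """Decode raw bytes to unicode where the encoding is still known."""
    return stream.read().decode(encoding)

# Stand-in for a file or socket that delivers Latin-1 encoded bytes.
source = io.BytesIO(b"caf\xe9")

# Decode once, at the input boundary, with an explicit encoding.
text = read_text(source, "latin-1")

# From here on, text is unicode and can be handed to other code
# (for example, something that calls insertText_) without any guessing.
```

Decoding at the boundary means the intermediate code never needs to know which encoding was used.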

-bob