Re: [Pyobjc-dev] depythonify_c_value rejects non-ascii, non-unicode strings
From: Bob I. <bo...@re...> - 2004-01-21 19:51:33
On Jan 21, 2004, at 2:27 PM, Marc-Antoine Parent wrote:

>>> That sentence agrees with my point the second time: What if I _do_
>>> know the encoding, and I want to tell the bridge about it?
>>> Your point is that I should convert strings to unicode before the
>>> bridge; my point is that I may be calling the bridge in quite a few
>>> places, and converting there may not be practical.
>>> Whereas if the bridge had a simple API, viz.
>>> PyObjC.setStringEncoding(str)
>>> PyObjC.getStringEncoding()
>>> getting and setting a variable which defaults to the system's
>>> default encoding,
>>> then it would be easy to still use (single-byte) strings in Python
>>> if so desired (again, do realize that one is often dealing with
>>> someone else's code, and reengineering it is not always practical.)
>>
>> The problem with this proposal is that you want a function to change
>> the encoding related to *your* code, the proposed API changes the
>> encoding for *all* code that uses the bridge.
>
> Do you mean that this global would be shared by two different python
> programs using the bridge? (i.e. in different processes...)
> That would be indeed very dangerous and fully justify your reluctance.
> Otherwise, see my point in another post about uniqueness of GUI.
>
>> If you had control over all of the code then it would be fine, but
>> in that case you would also be able to just change Python's default
>> encoding.
>
> Remember that I cannot do it after startup,
>
>>>> If you want/need to exchange arbitrary data you're going to have to
>>>> explicitly put it in NSData.
>>>
>>> That would be valid for arbitrary data; but strings of a _known_
>>> encoding are not arbitrary data.
>>
>> Yeah they are, they're arbitrary data until they're combined with the
>> encoding metadata -- which is the unicode type.
>
> My point was to allow for more than one way to combine them. Unicode
> is one solution, and my favoured solution in most cases, but not
> always the best solution, and sometimes not practically available.

I think I understand your problem now, you have a console program that
is interacting with a GUI application via a pipe. This GUI
application is trying to display the output of your program, but since
it does not know the encoding of your text it is passing on NSString
and crossing its fingers. The correct solution is, of course, to fix
the GUI application; the way it is handling text is broken.

Solution:
Possibly use a configuration panel for the GUI to choose the encoding
of incoming pipes
Use codecs.getreader(your_encoding) on the pipe, and use that to create
NSStrings.

>>> import sys
>>> import codecs
>>> input = codecs.getreader('utf8')(sys.stdin)
>>> input.readline()
é
u'\xe9\uf8ff\n'

-bob
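
A minimal sketch of the codecs.getreader suggestion above, written for Python 3 rather than the Python 2 session in the post. The in-memory stream stands in for the real pipe, and the helper name decode_pipe and the sample text are hypothetical; creating NSStrings from the decoded text is left out.

```python
import codecs
import io


def decode_pipe(byte_stream, encoding):
    """Wrap a byte stream in a reader that yields text decoded with `encoding`.

    Hypothetical helper: `encoding` would come from wherever the GUI stores
    the user's choice (for example a configuration panel).
    """
    return codecs.getreader(encoding)(byte_stream)


# Stand-in for the pipe: the bytes a console program would emit as UTF-8.
raw = "caf\u00e9\nna\u00efve\n".encode("utf-8")

# Decoding with the encoding the producer actually used recovers the text.
reader = decode_pipe(io.BytesIO(raw), "utf-8")
lines = reader.read().splitlines()

# Guessing the wrong encoding does not fail here; it silently yields
# garbled text, which is the "crossing its fingers" case.
wrong = decode_pipe(io.BytesIO(raw), "latin-1").readline().rstrip("\n")

print(lines)
print(wrong)
```

The encoding is passed in at the point where the pipe is read, so the choice stays local to that one stream instead of becoming a process-wide setting.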