Thread: [Pyobjc-dev] depythonify_c_value rejects non-ascii, non-unicode strings
From: Marc-Antoine P. <map...@ac...> - 2004-01-21 16:24:15
|
Good day, all!

I am writing some Python code that has to output Latin-1 text. Some of that output makes its way through other (Python) code to a text widget through insertText_. The other code does not know about my encoding choice, as it is not my code but Glenn Andreas' PyOxide IDE; it should not know about encoding. So it simply passes along my Latin-1 strings to the insertText_ method of a text widget, where the PyObjC bridge tries to make them into an NSString.

In objc_support.c, in

    int depythonify_c_value (const char *type, PyObject *argument, void *datum)

we have the following code (currently around line 1300):

    as_unicode = PyUnicode_Decode(
        strval, len,
        PyUnicode_GetDefaultEncoding(),
        "strict");
    if (as_unicode == NULL) {
        PyErr_Format(PyExc_UnicodeError,
            "depythonifying 'id', got "
            "a string with a non-default "
            "encoding");
        return -1;
    }

Now, it turns out that the default encoding is ascii, unless specified otherwise in PyUnicode_SetDefaultEncoding... (from /System/Library/Frameworks/Python.framework/Headers/unicodeobject.h)

That means that in many cases I get the error that immediately follows the decode call, and no output at all.

It is fairly easy to set the default encoding at startup (thanks to Glenn for pointing this out to me) using sys.setdefaultencoding('iso-8859-1') in a sitecustomize.py. However, this can only be done at Python startup, and I fear many users of the bridge may not know about this limitation.

I propose that the PyObjC bridge use a less restrictive encoding than the current (bizarre) platform default, so as to allow Python to output encoded text to Cocoa widgets. (Maybe the bridge should have a hook to set the platform default when the Python subsystem is started?)

I suggest Latin-1, as it is the most common encoding, and the one most likely to be used by most (unix-written) Python code; even if the Python code uses another encoding, since Latin-1 lets bytes pass through identically to widgets, if the user sees gibberish it will be familiar gibberish. But I am sure a case could be made for mac-roman as well.

Another solution (Glenn's suggestion) is to at least not decode it 'strict'ly, using 'ignore' or at worst 'replace' to allow at least some of the text to reach the user...

Whatever the correct solution, I feel that the current situation (rejecting any encoded non-ascii text) is overly restrictive.

Thank you for your attention,
Marc-Antoine Parent
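[Editorial sketch] The decoding behaviours discussed in this message can be reproduced in a few lines. This is only an illustration, written for Python 3, where the thread's `str` corresponds to `bytes` and the thread's `unicode` corresponds to `str`; the sample text and variable names are assumptions, and nothing here is PyObjC code.

```python
# Bytes as a Latin-1 program might emit them (illustrative sample text).
latin1_bytes = "caf\u00e9".encode("latin-1")   # b'caf\xe9'

# Strict ASCII decoding, the behaviour described above, raises an error.
try:
    latin1_bytes.decode("ascii", "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'ignore' silently drops the offending byte; 'replace' substitutes U+FFFD.
ignored = latin1_bytes.decode("ascii", "ignore")
replaced = latin1_bytes.decode("ascii", "replace")

# Decoding with the encoding that was actually used recovers the text.
decoded = latin1_bytes.decode("latin-1")

# Latin-1 maps every byte value 0..255 to a code point, so decoding with it
# never fails and the bytes round-trip unchanged.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("latin-1").encode("latin-1") == all_bytes
```

Note that 'ignore' and 'replace' both lose information, which is the data-loss concern raised later in the thread.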
From: Bob I. <bo...@re...> - 2004-01-21 16:49:17
|
On Jan 21, 2004, at 11:24 AM, Marc-Antoine Parent wrote:

> I am writing some Python code that has to output Latin-1 text.
> Some of that output makes its way through other (python) code to a
> text widget through insertText_. The other code does not know about my
> encoding choice, as it is not my code, but Glenn Andreas' PyOxide IDE;
> it should not know about encoding. So it simply passes along my
> Latin-1 strings to the insertText_ method of a text widget, where the
> PyObjC bridge tries to make it into a NSString.

We had this (short) discussion before:
http://sourceforge.net/mailarchive/message.php?msg_id=6595522

I've come to the conclusion that if the Python program doesn't handle all text as unicode, then it's broken. This is really just PyObjC telling you to fix your code.

Here's some important snippets that helped me come to this conclusion:

[Just van Rossum]
Strongly disagree. This leads to silent errors, possibly even data loss. You _have_ to know the encoding, and you _have_ to deal with it. If there's no way you can know the encoding, you have to explicitly tell which encoding or behavior to use.

Btw. it's not so much PyObjC's behavior, but Python's default str -> unicode coercion behavior. Perhaps it's "fixable" in the bridge, but I think it's a bad idea to deviate from Python's behavior (in addition to that I find it a bad idea to begin with).

[Ronald Oussoren]
BTW. You should convert all input to unicode instead of waiting for problems with the implicit conversion to unicode that is performed by PyObjC. You're more likely to know the right encoding while reading the data.

-bob
From: Marc-Antoine P. <map...@ac...> - 2004-01-21 17:20:08
|
> We had this (short) discussion before: > http://sourceforge.net/mailarchive/message.php?msg_id=6595522 Thank you for pointing it out; I had not seen it. > I've come to the conclusion that if the Python program doesn't handle > all text as unicode, then it's broken. This is really just PyObjC > telling you to fix your code. I only partially agree. It is true that internally, a Python program should use unicode all the way; but nobody should force me to use unicode on the output. The case I am raising is that I have a Python program with Latin-1 output, which is picked up by another Python program, which is encoding-agnostic, and transfers it to the bridge. The two programs are totally disconnected, except through I/O, and that I/O may use another encoding. Now, maybe what you are saying amounts to the suggestion that the second program should know (or be told) about the encoding of the first program's output; and that makes sense. However, there may be cases, such as mine, where it makes sense for the Python program to use encoded (non-unicode) data internally, and not to care about it, and (supposing I know the encoding) I should not have to convert to unicode before calling the bridge at every point. (Granted, in this case, we could convert to unicode at the interface between both programs, but that may not always be the case...) So let me then make a plea for an API so that a PyObjC program can tell the bridge to use an encoding other than the system default, if specified, even if the default behaviour remains identical, i.e. throw exceptions upon non-ascii strings. That way, only a program that knows what it is doing will modify the behaviour, and no data will be lost by default; but a program that has good architectural reasons to do so might still use another encoding internally. Marc-Antoine Parent |
From: Bob I. <bo...@re...> - 2004-01-21 17:28:42
|
On Jan 21, 2004, at 12:20 PM, Marc-Antoine Parent wrote: >> We had this (short) discussion before: >> http://sourceforge.net/mailarchive/message.php?msg_id=6595522 > > Thank you for pointing it out; I had not seen it. > >> I've come to the conclusion that if the Python program doesn't handle >> all text as unicode, then it's broken. This is really just PyObjC >> telling you to fix your code. > > I only partially agree. It is true that internally, a Python program > should use unicode all the way; but nobody should force me to use > unicode on the output. The case I am raising is that I have a Python > program with Latin-1 output, which is picked up by another Python > program, which is encoding-agnostic, and transfers it to the bridge. > The two programs are totally disconnected, except through I/O, and > that I/O may use another encoding. > > Now, maybe what you are saying amounts to the suggestion that the > second program should know (or be told) about the encoding of the > first program's output; and that makes sense. However, there may be > cases, such as mine, where it makes sense for the Python program to > use encoded (non-unicode) data internally, and not to care about it, > and (supposing I know the encoding) I should not have to convert to > unicode before calling the bridge at every point. > (Granted, in this case, we could convert to unicode at the interface > between both programs, but that may not always be the case...) > So let me then make a plea for an API so that a PyObjC program can > tell the bridge to use an encoding other than the system default, if > specified, even if the default behaviour remains identical, i.e. throw > exceptions upon non-ascii strings. > That way, only a program that knows what it is doing will modify the > behaviour, and no data will be lost by default; but a program that has > good architectural reasons to do so might still use another encoding > internally. 
The simple fact of the matter is that NSString is the equivalent to python's unicode. If you unicode('something-with-latin-1') then you will get an exception. There is no reason whatsoever to put arbitrary data in a NSString unless you know its encoding. If you want/need to exchange arbitrary data you're going to have to explicitly put it in NSData. I would almost vote to *disable* the str<->NSString bridge in PyObjC, or make it bridge NSData instead, but that would just be terribly inconvenient for many people. -bob |
From: Marc-Antoine P. <map...@ac...> - 2004-01-21 17:49:45
|
> The simple fact of the matter is that NSString is the equivalent to > python's unicode. If you unicode('something-with-latin-1') then you > will get an exception. There is no reason whatsoever to put arbitrary > data in a NSString unless you know its encoding. That sentence agrees with my point the second time: What if I _do_ know the encoding, and I want to tell the bridge about it? Your point is that I should convert strings to unicode before the bridge; my point is that I may be calling the bridge in quite a few places, and converting there may not be practical. Whereas if the bridge had a simple API, viz. PyObjC.setStringEncoding(str) PyObjC.getStringEncoding() getting and setting a variable which defaults to the system's default encoding, then it would be easy to still use (single-byte) strings in Python if so desired (again, do realize that one is often dealing with someone else's code, and reengineering it is not always practical.) > If you want/need to exchange arbitrary data you're going to have to > explicitly put it in NSData. That would be valid for arbitrary data; but strings of a _known_ encoding are not arbitrary data. > I would almost vote to *disable* the str<->NSString bridge in PyObjC, > or make it bridge NSData instead, but that would just be terribly > inconvenient for many people. Indeed. Marc-Antoine Parent |
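[Editorial sketch] When the encoding is known but the bridge is called from many places, one alternative to a bridge-level setting is a small helper applied at each call site. The following is a minimal Python 3 sketch; `as_text` and its default encoding are hypothetical names chosen for illustration and are not part of PyObjC.

```python
def as_text(value, encoding="latin-1"):
    """Return text for a value that may be encoded bytes or already text.

    Hypothetical helper: the caller states the encoding explicitly, so no
    process-wide default is consulted or modified.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


from_library = b"caf\xe9"          # bytes from code that emits Latin-1
already_text = "caf\u00e9"         # text that needs no conversion

converted = as_text(from_library)
unchanged = as_text(already_text)
```

A call such as `widget.insertText_(as_text(data))` would then hand the bridge text rather than encoded bytes; that call is shown only as an assumed usage pattern.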
From: Bob I. <bo...@re...> - 2004-01-21 18:08:28
|
On Jan 21, 2004, at 12:50 PM, Marc-Antoine Parent wrote: >> The simple fact of the matter is that NSString is the equivalent to >> python's unicode. If you unicode('something-with-latin-1') then you >> will get an exception. There is no reason whatsoever to put >> arbitrary data in a NSString unless you know its encoding. > > That sentence agrees with my point the second time: What if I _do_ > know the encoding, and I want to tell the bridge about it? > Your point is that I should convert strings to unicode before the > bridge; my point is that I may be calling the bridge in quite a few > places, and converting there may not be practical. > Whereas if the bridge had a simple API, viz. > PyObjC.setStringEncoding(str) > PyObjC.getStringEncoding() > getting and setting a variable which defaults to the system's default > encoding, > then it would be easy to still use (single-byte) strings in Python if > so desired (again, do realize that one is often dealing with someone > else's code, and reengineering it is not always practical.) The problem with this proposal is that you want a function to change the encoding related to *your* code, the proposed API changes the encoding for *all* code that uses the bridge. If you had control over all of the code then it would be fine, but in that case you would also be able to just change Python's default encoding. >> If you want/need to exchange arbitrary data you're going to have to >> explicitly put it in NSData. > > That would be valid for arbitrary data; but strings of a _known_ > encoding are not arbitrary data. Yeah they are, they're arbitrary data until they're combined with the encoding metadata -- which is the unicode type. In any case, this really just isn't going to happen. There's too many extremely good reasons not to do it. -bob |
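[Editorial sketch] The objection to a process-wide setting can be illustrated with a toy model. Everything below is hypothetical: `set_string_encoding`, `to_bridge`, and the two "modules" are stand-ins written for this note, in Python 3, and do not reflect any real PyObjC API.

```python
# Hypothetical process-wide setting, modelled on the API proposed above.
_bridge_encoding = "ascii"


def set_string_encoding(name):
    """Change the encoding used by every caller in this process."""
    global _bridge_encoding
    _bridge_encoding = name


def to_bridge(raw):
    """Stand-in for the bridge's bytes-to-text conversion."""
    return raw.decode(_bridge_encoding)


data = "\u00e9".encode("utf-8")    # b'\xc3\xa9'


def app_code():
    # The application knows its own data is UTF-8 and says so.
    set_string_encoding("utf-8")
    return to_bridge(data)


def library_code():
    # An unrelated library makes a different choice for its own data.
    set_string_encoding("latin-1")


before = app_code()        # decoded as UTF-8
library_code()             # the shared setting changes behind the app's back
after = to_bridge(data)    # the same bytes now decode to different text
```

The application's bytes have not changed, yet the text it obtains has, and no exception is raised to signal the difference.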
From: Marc-Antoine P. <map...@ac...> - 2004-01-21 19:27:13
|
>> That sentence agrees with my point the second time: What if I _do_ >> know the encoding, and I want to tell the bridge about it? >> Your point is that I should convert strings to unicode before the >> bridge; my point is that I may be calling the bridge in quite a few >> places, and converting there may not be practical. >> Whereas if the bridge had a simple API, viz. >> PyObjC.setStringEncoding(str) >> PyObjC.getStringEncoding() >> getting and setting a variable which defaults to the system's default >> encoding, >> then it would be easy to still use (single-byte) strings in Python if >> so desired (again, do realize that one is often dealing with someone >> else's code, and reengineering it is not always practical.) > > The problem with this proposal is that you want a function to change > the encoding related to *your* code, the proposed API changes the > encoding for *all* code that uses the bridge. Do you mean that this global would be shared by two different python programs using the bridge? (i.e. in different processes...) That would be indeed very dangerous and fully justify your reluctance. Otherwise, see my point in another post about uniqueness of GUI. > If you had control over all of the code then it would be fine, but in > that case you would also be able to just change Python's default > encoding. Remember that I cannot do it after startup, >>> If you want/need to exchange arbitrary data you're going to have to >>> explicitly put it in NSData. >> >> That would be valid for arbitrary data; but strings of a _known_ >> encoding are not arbitrary data. > > Yeah they are, they're arbitrary data until they're combined with the > encoding metadata -- which is the unicode type. My point was to allow for more than one way to combine them. Unicode is one solution, and my favoured solution in most cases, but not always the best solution, and sometimes not practically available. > In any case, this really just isn't going to happen. 
There's too many > extremely good reasons not to do it. Well, I will stop here, it is clear you do not find my arguments compelling, and that is unfortunately that. We still disagree, but thank you for taking the time to give me your reasons. Regards, Marc-Antoine Parent |
From: Bob I. <bo...@re...> - 2004-01-21 19:51:33
|
On Jan 21, 2004, at 2:27 PM, Marc-Antoine Parent wrote:

>>> That sentence agrees with my point the second time: What if I _do_
>>> know the encoding, and I want to tell the bridge about it?
>>> Your point is that I should convert strings to unicode before the
>>> bridge; my point is that I may be calling the bridge in quite a few
>>> places, and converting there may not be practical.
>>> Whereas if the bridge had a simple API, viz.
>>> PyObjC.setStringEncoding(str)
>>> PyObjC.getStringEncoding()
>>> getting and setting a variable which defaults to the system's
>>> default encoding,
>>> then it would be easy to still use (single-byte) strings in Python
>>> if so desired (again, do realize that one is often dealing with
>>> someone else's code, and reengineering it is not always practical.)
>>
>> The problem with this proposal is that you want a function to change
>> the encoding related to *your* code, the proposed API changes the
>> encoding for *all* code that uses the bridge.
>
> Do you mean that this global would be shared by two different python
> programs using the bridge? (i.e. in different processes...)
> That would be indeed very dangerous and fully justify your reluctance.
> Otherwise, see my point in another post about uniqueness of GUI.
>
>> If you had control over all of the code then it would be fine, but
>> in that case you would also be able to just change Python's default
>> encoding.
>
> Remember that I cannot do it after startup,
>
>>>> If you want/need to exchange arbitrary data you're going to have to
>>>> explicitly put it in NSData.
>>>
>>> That would be valid for arbitrary data; but strings of a _known_
>>> encoding are not arbitrary data.
>>
>> Yeah they are, they're arbitrary data until they're combined with the
>> encoding metadata -- which is the unicode type.
>
> My point was to allow for more than one way to combine them. Unicode
> is one solution, and my favoured solution in most cases, but not
> always the best solution, and sometimes not practically available.

I think I understand your problem now, you have a console program that is interacting with a GUI application via a pipe. This GUI application is trying to display the output of your program, but since it does not know the encoding of your text it is passing on NSString and crossing its fingers. The correct solution is, of course, to fix the GUI application; the way it is handling text is broken.

Solution:
Possibly use a configuration panel for the GUI to choose the encoding of incoming pipes
Use codecs.getreader(your_encoding) on the pipe, and use that to create NSStrings.

    >>> import sys
    >>> import codecs
    >>> input = codecs.getreader('utf8')(sys.stdin)
    >>> input.readline()
    é
    u'\xe9\uf8ff\n'

-bob
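[Editorial sketch] The codecs.getreader approach suggested here can be tried without a real pipe. The sketch below is Python 3 and substitutes an in-memory byte stream for the pipe; the Latin-1 sample bytes and the variable names are assumptions made for illustration.

```python
import codecs
import io

# Stand-in for the incoming pipe: raw Latin-1 bytes from the console program.
pipe = io.BytesIO("\u00e9t\u00e9\nfin\n".encode("latin-1"))

# Wrap the byte stream in a reader for the (known) encoding, as suggested.
reader = codecs.getreader("latin-1")(pipe)

# Everything read through the wrapper is already decoded text, ready to be
# handed to code that expects text rather than encoded bytes.
first_line = reader.readline()
remainder = reader.read()
```

The encoding name is the only thing the GUI side needs to know, which is why a configuration panel for it is proposed alongside.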
From: Marc-Antoine P. <map...@ac...> - 2004-01-21 20:17:00
Attachments:
smime.p7s
|
> I think I understand your problem now, you have a console program that
> is interacting with a GUI application via a pipe. This GUI
> application is trying to display the output of your program, but since
> it does not know the encoding of your text it is passing on NSString
> and crossing its fingers.

That is indeed my case.
I was trying to make a more general argument, about third-party non-unicode libraries in general, but I will admit it is theoretical. I still feel that the fact that there is a single point of conversion in the PyObjC bridge makes it a very practical point of control. But I will now try to restrain myself to my current problem.

> The correct solution is, of course, to fix the GUI application; the
> way it is handling text is broken.
>
> Solution:
> Possibly use a configuration panel for the GUI to choose the encoding
> of incoming pipes
> Use codecs.getreader(your_encoding) on the pipe, and use that to
> create NSStrings....

Yes, in this case, we can ask Glenn about it (I have) and/or do the change (I may.)
If the application were closed source, I would be in more trouble. Hence my request.

On 04-01-21, at 14:57, Ronald Oussoren wrote:

>> The fact that setdefaultencoding can only be set at startup is a
>> major limitation, and the reason that I argue for a separate value in
>> the bridge.
>
> And the fact that setdefaultencoding exists and is removed early
> during startup is an important reason for not adding a similar
> function to PyObjC.

I am arguing it is not similar, as it controls a single point of conversion (communication with the Cocoa code) as opposed to Python behaviour as a whole.
I assume it makes sense, in that (in my limited experience) the Cocoa interface is mostly used to talk with the UI, which is a well-defined subset of the API.
Though I admit that this would also affect other parts of the Cocoa bridge, if used, which is as bad as changing Python as a whole.

> If you really want to change the encoding after startup you should
> probably file a bug report for Python, or ask around on
> comp.lang.python.

Fair, but I still think that my case is slightly different.

> BTW. If you build .app bundles you can completely replace the site.py
> inside your application

Ah? How, out of curiosity?

Marc-Antoine
From: Bob I. <bo...@re...> - 2004-01-21 20:36:51
|
On Jan 21, 2004, at 3:17 PM, Marc-Antoine Parent wrote:

>> I think I understand your problem now, you have a console program
>> that is interacting with a GUI application via a pipe. This GUI
>> application is trying to display the output of your program, but
>> since it does not know the encoding of your text it is passing on
>> NSString and crossing its fingers.
>
> That is indeed my case.
> I was trying to make a more general argument, about third-party
> non-unicode libraries in general, but I will admit it is theoretical.
> I still feel that the fact that there is a single point of conversion
> in the PyObjC bridge makes it a very practical point of control. But I
> will now try to restrain myself to my current problem.

Encodings are serialization formats, beyond that you need to be using unicode. This is by far one of the worst things about Python: we have this AWESOME unicode support, but we forget to use it most of the time because it requires us to put a u in front of our text. Hopefully someday, Python str will be crippled to the point where nobody will want to use it for anything but raw data.

>> The correct solution is, of course, to fix the GUI application; the
>> way it is handling text is broken.
>>
>> Solution:
>> Possibly use a configuration panel for the GUI to choose the encoding
>> of incoming pipes
>> Use codecs.getreader(your_encoding) on the pipe, and use that to
>> create NSStrings....
>
> Yes, in this case, we can ask Glen about it (I have) and/or do the
> change (I may.)
> If the application were closed source, I would be in more trouble.
> Hence my request.

The truth of the matter is that the application is broken, whether it's open source or closed.

<offtopic>
Because it's open source, and you're a developer, you have this wonderful i-can-fix-it-if-i-have-to power over your software. That's what I really like about open source. I don't particularly care for the rest of it (especially annoyances like the GPL and even LGPL). If everyone just used Python/BSD/MIT-style licenses, then we could all share code and not have to hire a lawyer to see if we can reuse something in another open source project with a different license.
</offtopic>

> On 04-01-21, at 14:57, Ronald Oussoren wrote:
>
>>> The fact that setdefaultencoding can only be set at startup is a
>>> major limitation, and the reason that I argue for a separate value
>>> in the bridge.
>>
>> And the fact that setdefaultencoding exists and is removed early
>> during startup is an important reason for not adding a simular
>> function to PyObjC.
>
> I am arguing it is not similar, as it controls a single point of
> conversion (communication with the Cocoa code) as opposed to Python
> behaviour as a whole.
> I assume it makes sense, in that (in my limited experience) the Cocoa
> interface is mostly used to talk with the UI, which is a well-defined
> subset of the API.
> Though I admit that this would also affect other parts of the Cocoa
> bridge, if used, which is as bad as changing Python as a whole.
>
>> If you really want to change the encoding after startup you should
>> probably file a bugreport for Python, or ask around on
>> comp.lang.python.
>
> Fair, but I still think that my case is slightly different.
>
>> BTW. If you build .app bundles you can completely replace the site.py
>> inside your application
>
> Ah? How, out of curiosity?

http://pythonmac.org/wiki/BundleBuilder

The bootstrap script sets your PYTHONPATH to the Resources folder, so you can put a sitecustomize.py there and it will just work

-bob
From: Marc-Antoine P. <map...@ac...> - 2004-01-21 20:52:50
Attachments:
smime.p7s
|
>>> BTW. If you build .app bundles you can completely replace the >>> site.py inside your application >> >> Ah? How, out of curiosity? > > http://pythonmac.org/wiki/BundleBuilder > > The bootstrap script sets your PYTHONPATH to the Resources folder, so > you can put a sitecustomize.py there and it will just work OK, I did not realize this. I had tried in one case, but the Python had been segregated in a subfolder, so it failed for me. I should have tried harder. Thanks |
From: Glenn A. <gan...@ma...> - 2004-01-21 17:58:14
|
At 12:31 PM -0500 1/21/04, Bob Ippolito wrote:

>If you want/need to exchange arbitrary data you're going to have to
>explicitly put it in NSData. I would almost vote to *disable* the
>str<->NSString bridge in PyObjC, or make it bridge NSData instead,
>but that would just be terribly inconvenient for many people.

What about doing both? If the conversion works, it creates an NSString. This will handle all the current ASCII cases as well as cases where the default encoding is explicitly set (and all the str's are handled accordingly).

If the conversion doesn't work, it creates NSData. Obviously, this will push the error somewhere else, which may not be able to handle it any better, but at least there is a chance. (The current problem was doing something like "NSText insertText:", which would then fail with some other error, which might even be more confusing).

I suppose a more general solution is to allow for custom conversion handlers that can be installed, but that seems to open another can of worms... (more like a 55 gallon drum)

Another possibility is to just make the system default encoding be UTF8 instead of ASCII, but I'm guessing if that were a good idea it would have already been done (and would certainly cause other problems with "str is a collection of bytes", "no str is string of characters", "no, it's a dessert topping"). Based on the number of google group hits on "+python +setdefaultencodings" these sorts of issues bite those using IDLE, etc...

--
Glenn Andreas gan...@de...
Theldrow, Blobbo, Cythera, oh my!
Be good, and you will be lonesome
From: Bob I. <bo...@re...> - 2004-01-21 18:20:31
|
On Jan 21, 2004, at 12:57 PM, Glenn Andreas wrote:

> At 12:31 PM -0500 1/21/04, Bob Ippolito wrote:
>> If you want/need to exchange arbitrary data you're going to have to
>> explicitly put it in NSData. I would almost vote to *disable* the
>> str<->NSString bridge in PyObjC, or make it bridge NSData instead,
>> but that would just be terribly inconvenient for many people.
>
> What about doing both? If the conversion works, it creates an
> NSString. This will handle all the current ASCII cases as well as
> cases where the default encoding is explicitly set (and all the str's
> are handled accordingly).
>
> If the conversion doesn't work, it creates NSData. Obviously, this
> will push the error somewhere else, which may not be able to handle it
> any better, but at least there is a chance. (The current problem was
> doing something like "NSText insertText:", which would then fail with
> some other error, which might even be more confusing).

Oh god no! What if you wanted an NSData that happened to not have any high bits set? This sounds more like how I'd imagine unicode support to work (or not work) in a Perl ObjC bridge ;) And yes, at least at this point the error predictably happens exactly when you're doing something evil/lazy.

> I suppose a more general solution is to allow for custom conversion
> handlers that can be installed, but that seems to open another can of
> worms... (more like a 55 gallon drum)

There are custom conversion handlers: Python's unicode support. You can make file-like objects that spew unicode and you can convert any string of known encoding to a unicode string. The problem with "conversion handlers" is that you don't know where the str came from, and without that information you can't register a conversion handler that does anything beyond what sys.setdefaultencoding can do. I think that the reason sys.setdefaultencoding is only settable by the end user (or any other mechanism for starting the python interpreter) is that it's evil for a module to change the system encoding, because it can break totally unrelated code, or end user preferences, in a hard to debug way.

> Another possibility is to just make the system default encoding be
> UTF8 instead of ASCII, but I'm guessing if that were a good idea it
> would have already been done (and would certainly cause other problems
> with "str is a collection of bytes", "no str is string of characters",
> "no, it's a desert topping").

setdefaultencoding doesn't ever affect str, it only affects unicode (creating unicode and coercing unicode to str). str is always a collection of bytes that happens to be convenient at times to use as a collection of characters. It does, typically, make sense for the system default encoding to be UTF8 *on OS X*, but that is a decision that affects any Python code and that decision needs to be made by the end user (or vendor, I suppose).

-bob
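[Editorial sketch] The objection to the "NSString if it decodes, NSData otherwise" idea is that the resulting type would depend on the content of the data rather than on the caller's intent. The toy function below models that in Python 3, with `str` standing in for NSString and `bytes` for NSData; `fallback_convert` is a hypothetical name and is not bridge code.

```python
def fallback_convert(raw):
    """Model of the proposed fallback: text if ASCII-decodable, else raw bytes."""
    try:
        return raw.decode("ascii")      # stands in for creating an NSString
    except UnicodeDecodeError:
        return raw                      # stands in for creating an NSData


# Two binary payloads that differ only in whether a high bit is set.
payload_without_high_bit = b"\x01\x02\x7f"
payload_with_high_bit = b"\x01\x02\x80"

first = fallback_convert(payload_without_high_bit)
second = fallback_convert(payload_with_high_bit)
```

Both inputs are meant as binary data, yet only the second comes back as bytes; the first is silently turned into text because it happens to contain no high bits.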
From: Ronald O. <ous...@ci...> - 2004-01-21 17:38:08
|
On 21 jan 2004, at 18:20, Marc-Antoine Parent wrote:

[...]
> So let me then make a plea for an API so that a PyObjC program can
> tell the bridge to use an encoding other than the system default, if
> specified, even if the default behaviour remains identical, i.e. throw
> exceptions upon non-ascii strings.

I don't like introducing global switches like this; libraries may modify the switch and change the behaviour of other code.

Too bad that sitecustomize.py cannot be in the same directory as a script (dirname(sys.argv[0]) is added after site.py finishes). BTW, does anyone know why sys.setdefaultencoding is removed in site.py? I.e. why is it good that users cannot change the default encoding after the interpreter has initialized?

> That way, only a program that knows what it is doing will modify the
> behaviour, and no data will be lost by default; but a program that has
> good architectural reasons to do so might still use another encoding
> internally.

Unicode should be good enough for this. The strings used by Cocoa are Unicode strings; there's not much you can do about this.

Ronald
From: Bob I. <bo...@re...> - 2004-01-21 17:41:03
|
On Jan 21, 2004, at 12:38 PM, Ronald Oussoren wrote:

>
> On 21 jan 2004, at 18:20, Marc-Antoine Parent wrote:
> [...]
>> So let me then make a plea for an API so that a PyObjC program can
>> tell the bridge to use an encoding other than the system default, if
>> specified, even if the default behaviour remains identical, i.e.
>> throw exceptions upon non-ascii strings.
>
> I don't like introducing global switches like this, libraries may
> modify the switch and change the behaviour of other code.
>
> Too bad that sitecustomize.py cannot in the same directory as a script
> (dirname(sys.argv[0] is added after site.py finishes). BTW. does
> anyone know why sys.setdefaultencoding is removed in site.py? E.g. why
> is it good that users cannot change the default encoding after the
> interpreter has initialized?

sys.setdefaultencoding is probably removed in site.py for the same reason you don't like global switches.. someone could sys.setdefaultencoding in a module that you use, for example.

-bob