[Pyobjc-dev] Re: [ pyobjc-Bugs-679748 ] NSMutableString gets converted to Python string
From: Just v. R. <ju...@le...> - 2003-02-03 21:49:02
Bill Bumgarner wrote:

> On Monday, Feb 3, 2003, at 15:45 US/Eastern, Just van Rossum wrote:
> > Bill Bumgarner wrote:
> >> - a python object that provides a character buffer style interface
> >>   to the contents of an NSString.
> >
> > How would this work for NSStrings containing unicode?
>
> I have no clue yet.

What works nicely now is that the conversion of unicode strings to
NSStrings and vice versa is really transparent: pass a Python unicode
string to an ObjC call expecting an NSString and it works. The other way
also: if the NSString is representable in 7-bit ascii you get a str, if
not you get a unicode string.

I worry that Python users will have to convert to a unicode string after
all when this conversion _doesn't_ take place. I have no idea how to make
an object that can behave _like_ a unicode string and have it work
everywhere. Maybe time for a post to c.l.py...

> NSString provides a rich set of API for converting from whatever the
> internal representation is to whatever Unicode representation you
> might want. As such, it will be easy to produce a character buffer
> full of, say, UTF8 characters.
>
> What can be done with this in the context of the Python API --
> whether it can be wrapped into a python object that is actually
> useful -- remains to be seen. Given that file()/open() only looks
> for a character buffer and, I believe, can handle a UTF8 path gives
> me hope.

Python has only limited support for unicode file names and I believe it's
highly platform dependent. Right now it doesn't work with unicode strings
on OSX, but it does work with 8-bit strings encoded as utf-8:

>>> os.stat('a\xcc\x8a')
(33188, 1685956L, 234881029L, 1, 501, 20, 0L, 1044307510, 1044307510, 1044307510)
>>> os.stat(unicode('a\xcc\x8a', "utf-8"))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character '\u30a' in position 1: ordinal not in range(128)
>>>

This seems pretty broken, but I don't know enough of the internals to see
what it would take to fix this.

Just
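
For anyone retracing the session above, here is a minimal sketch of the two string behaviours the message describes. It assumes a current Python 3 interpreter (the session in the message is Python 2), and the helper names `is_seven_bit` and `ascii_failure_position` are hypothetical. They only mirror the rule as described in the message and are not the PyObjC bridge code.

```python
# The byte string passed to os.stat() in the message, written as Python 3 bytes.
utf8_name = b'a\xcc\x8a'

# Decoding as UTF-8 gives 'a' followed by U+030A (COMBINING RING ABOVE).
text_name = utf8_name.decode('utf-8')


def is_seven_bit(text):
    """Return True if every character of text fits in 7-bit ASCII.

    Hypothetical helper: it mirrors the rule described in the message
    (ASCII-only strings come back as str, anything else as unicode).
    """
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


def ascii_failure_position(text):
    """Return the index of the first character the ASCII codec rejects, or None."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError as exc:
        return exc.start
    return None
```

Calling `ascii_failure_position(text_name)` reports the combining character at index 1, the same position named in the traceback, while encoding `text_name` back to UTF-8 reproduces the 8-bit string that `os.stat()` accepted.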