Thread: [Pyobjc-dev] Bridging NSMutableString, a compromise
From: Just v. R. <ju...@le...> - 2003-02-07 10:14:21
|
I've been doing some thinking, and now have some doubts that "fixing" Python to allow mutable dict keys will solve everything. Biggest problem: if we wrap an NSMutableString in a (mutable) unicode subclass, how are we going to keep the two strings synchronized?

Here's an idea for a compromise that might work and yet be convenient to work with from Python in the majority of cases.

Problem:

- We need NSStrings to work like Python strings as _much_ as possible.

- We need access to the methods of the NSString, or to put it in other words, we need to have full access to the native object, _especially_ if it's mutable.

Proposal for ObjC -> Python:

- wrap the NSString in a subclass of unicode as Bob Ippolito suggested. So we both wrap _and_ convert. Regardless of the mutability of the NS(Mutable)String, these objects will be _immutable_. This means if the NSMutableString changes, we won't see these changes reflected in the Python string. Yet this also allows us to use it as keys in dicts in case it is _assumed_ the string is immutable in a non-deadly way. These objects don't directly give access to NSString methods, it's a pure Python string in all respects except:

    >>> type(s) is unicode
    False
    >>> isinstance(s, unicode)
    True

- provide an attribute (say "nsstring") that gives a pure wrapper for the underlying NSString. This must somehow be wrapped or proxied in such a way that for _this_ object, automatic conversion is *never* done, so it can give us access to the NS(Mutable)String methods. This object does not try to work like a Python string in any way: it does not define __hash__/tp_hash so will _not_ be usable as dict keys.

We end up with a situation that's very similar to what we have now: NSStrings always get converted to Python strings (but a subclass), eg. isinstance(NSString.stringWithString_("ladieda"), unicode) will be true. Yet if you need NSStringMethods you would do

    s = NSString.stringWithString_("ladieda")
    s.nsstring.someNSStringMethod()

Or if you don't care for the Python object at all (eg. if you _know_ it's a true mutable string), you do

    s = s.nsstring

Proposal for Python -> ObjC:

- if it's a pure Python string (str or unicode), simply convert to NSString, don't keep a ref to the Python string. AFAIK this is what we have now.

- if it's an instance of our unicode subclass, use the underlying NSString. This can even preserve identity.

How does that sound? I think it is fairly simple to implement, is slightly less convenient if you must access the raw NSString (need to access an attribute), is slightly confusing if the NSMutableString actually changes behind your back, but is otherwise completely transparent to the Python user who expects Python strings. So basically we _always_ assume strings to be immutable, yet if you _know_ it isn't, you can deal with that.

Just
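A minimal sketch of the proposal above, not PyObjC's actual implementation. It assumes Python 3, where `str` plays the role that `unicode` plays in this thread. `FakeMutableString` is a hypothetical stand-in for a bridged NSMutableString, and `BridgedString` is likewise an illustrative name.

```python
class FakeMutableString:
    """Hypothetical stand-in for a bridged NSMutableString (not a PyObjC class)."""

    __hash__ = None  # mutable, so deliberately unusable as a dict key

    def __init__(self, text):
        self._chars = list(text)

    def appendString_(self, more):
        self._chars.extend(more)

    def description(self):
        return "".join(self._chars)


class BridgedString(str):
    """Immutable snapshot of a native string that also keeps the original."""

    def __new__(cls, native):
        # Convert: copy the current contents into an ordinary immutable string.
        self = super().__new__(cls, native.description())
        # Wrap: keep the native object reachable through an explicit attribute.
        self.nsstring = native
        return self


native = FakeMutableString("ladieda")
s = BridgedString(native)

print(type(s) is str)       # False
print(isinstance(s, str))   # True

cache = {s: "works as a dict key"}

s.nsstring.appendString_("!")    # mutate the native object explicitly
print(s)                         # ladieda  (the snapshot does not change)
print(s.nsstring.description())  # ladieda!
```

The snapshot and the native object drift apart after the mutation, which is exactly the "slightly confusing" case the proposal accepts in exchange for hashable, immutable Python strings.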
From: <bb...@ma...> - 2003-02-07 15:41:56
|
On Friday, Feb 7, 2003, at 05:14 US/Eastern, Just van Rossum wrote:
> Proposal for Python -> ObjC:
>
> - if it's a pure Python string (str or unicode), simply convert to
> NSString, don't keep a ref to the Python string. AFAIK this is what we
> have now.
> - if it's an instance of our unicode subclass, use the underlying
> NSString. This can even preserve identity.

(I'll comment on the rest of the proposal later -- have to think hard about it and don't have time or caffeine to do so now. Just wanted to give a brief comment on this.)

I have an OC_PythonString implementation [that needs some adjustment in light of Ronald's recent improvements] that does not convert the Python string and maintains the reference to the string. It works as you describe for both unicode and python strings, naively assuming that python strings can be represented within an NSString without problem [so far so good].

I hope to clean it up and commit sometime soon.

If you are going to dive into this stuff, I'd be happy to throw you the source...

b.bum
From: Just v. R. <ju...@le...> - 2003-02-07 16:02:08
|
bb...@ma... wrote:

> I have an OC_PythonString implementation [that needs some adjustment
> in light of Ronald's recent improvements] that does not convert the
> Python string and maintains the reference to the string.

"the string" meaning the NSString, right?

> It works
> as you describe for both unicode and python strings,

How does it work with unicode strings? I think we can't get decent transparency without subclassing unicode (and I agree with David Eppstein that it's best to just go for unicode all the way).

> naively assuming
> that python strings can be represented within an NSString without
> problem [so far so good].

I'm not following you. This is a wrapper so NSStrings can be used in Python as _if_ they are Python strings, where does representing a _Python_ string with an NSString come into the picture?

just
From: <bb...@ma...> - 2003-02-07 16:16:51
|
On Friday, Feb 7, 2003, at 11:01 US/Eastern, Just van Rossum wrote:
> bb...@ma... wrote:
>
>> I have an OC_PythonString implementation [that needs some adjustment
>> in light of Ronald's recent improvements] that does not convert the
>> Python string and maintains the reference to the string.
>
> "the string" meaning the NSString, right?
>
>> It works
>> as you describe for both unicode and python strings,
>
> How does it work with unicode strings? I think we can't get decent
> transparency without subclassing unicode (and I agree with David
> Eppstein that it's best to just go for unicode all the way).
>
>> naively assuming
>> that python strings can be represented within an NSString without
>> problem [so far so good].
>
> I'm not following you. This is a wrapper so NSStrings can be used in
> Python as _if_ they are Python strings, where does representing a
> _Python_ string with an NSString come into the picture?

When passing a Python string into ObjC..... the easy part of the whole string conundrum.

The current very broken implementation appears below. Conflicts removed, all unit tests pass, but it has critical flaws that I intend on cleaning up at some point:

- leaks memory
- doesn't use Ronald's new API/implementation style (_pyobjc_pyObject)
- doesn't handle unicode nearly as efficiently as it should/could

What I was stumbling over was how to figure out how the Python unicode string is encoded such that I can create an NSString with the appropriate equivalent encoding. The enclosed code forces everything to UTF8.

Note that PyString_AsStringAndSize() returns an invalid buffer* if the unicode string has not been touched in a fashion that causes the cache-- the defenc in the internal struct-- to be initialized. That tripped me up a bit.

Note that identity is not currently preserved-- that is, passing the same python string into ObjC land will cause a new instance of OC_PythonString to be instantiated. However, this problem is largely outside of the scope of implementation of OC_PythonString and should likely be a generic object correlation mechanism in the Python->ObjC, ObjC->Python call mechanism within the bridge. It *could* be here, but that problem has to be solved elsewhere and more generically anyway.

Note that this implementation is about as slow of a string as you can create in that it only implements the two primitive methods, thereby causing all access to the string contents to go character by character. Optimization is trivial -- just implement the various NSStringExtension [see NSString.h] methods to call to the encapsulated NSString directly. Until this actually works correctly there is little point in optimizing it....

b.bum

#include "OC_PythonString.h"
#include "pyobjc.h"
#include "objc_support.h"

@implementation OC_PythonString

+newWithPythonObject:(PyObject*)v;
{
    OC_PythonString* res = [[OC_PythonString alloc] initWithPythonObject:v];
    [res autorelease];
    return res;
}

-initWithPythonObject:(PyObject*)v;
{
    value = v;
    if (PyString_Check(value)) {
        char *buffer;
        int length;
        int result;

        result = PyString_AsStringAndSize(value, &buffer, &length);
        if(result == -1) {
            ObjCErr_ToObjC();
            [self release];
            return nil; // not reached
        }
        // Borrow the Python string's buffer; kCFAllocatorNull means CF never frees it.
        stringValue = CFStringCreateWithCStringNoCopy(NULL, buffer, kCFStringEncodingUTF8, kCFAllocatorNull);
    } else if (PyUnicode_Check(value)) {
        char *buffer;
        int length;
        int result;

#warning instead of doing this, we should figure out the encoding of the python string and do a 'native' conversion to force the buffer interface to actually return something useful.
        // value now refers to a new UTF-8 encoded Python string, not the caller's object.
        value = PyUnicode_AsUTF8String(value);
        result = PyString_AsStringAndSize(value, &buffer, &length);
        if(result == -1) {
            ObjCErr_ToObjC();
            [self release];
            return nil;
        }
        stringValue = CFStringCreateWithCStringNoCopy(NULL, buffer, kCFStringEncodingUTF8, kCFAllocatorNull);
    }
    Py_INCREF(value);
    return self;
}

-(PyObject*)pyObject
{
    return value;
}

-(void)dealloc
{
    CFRelease(stringValue);
    Py_XDECREF(value);
    [super dealloc];
}

// The two NSString primitives; both delegate to the wrapped CFString.
- (unsigned int)length;
{
    int result;
    result = CFStringGetLength(stringValue);
    return result;
}

- (unichar)characterAtIndex:(unsigned)index;
{
    UniChar result;
    result = CFStringGetCharacterAtIndex(stringValue, index);
    return result;
}

@end
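One reason the encoding question above matters: once everything is forced through UTF-8, the byte length of the buffer no longer matches the number of characters, so a string's length has to come from the decoded string rather than from the size of the encoded buffer. A small sketch under stated assumptions: it uses Python 3 (which postdates this thread) and only the standard codecs, the helper names are illustrative, and the statement that NSString's `-length` counts UTF-16 code units comes from the Foundation documentation rather than from this thread.

```python
def utf8_byte_count(text):
    """Size of the buffer you get after forcing the string to UTF-8."""
    return len(text.encode("utf-8"))


def utf16_unit_count(text):
    """Number of UTF-16 code units, which is what NSString's -length reports."""
    return len(text.encode("utf-16-le")) // 2


for sample in ("ladieda", "na\u00efve", "\U0001D11E"):
    print(len(sample), utf8_byte_count(sample), utf16_unit_count(sample))
# 7 7 7   (pure ASCII: all three agree)
# 5 6 5   (one accented letter: UTF-8 needs an extra byte)
# 1 4 2   (one character outside the BMP: a surrogate pair in UTF-16)
```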
From: David E. <epp...@ic...> - 2003-02-07 15:42:06
|
On 2/7/03 11:14 AM +0100 Just van Rossum <ju...@le...> wrote:
> We end up with a situation that's very similar to what we have now:
> NSStrings always get converted to Python strings (but a subclass), eg.
> isinstance(NSString.stringWithString_("ladieda"), unicode) will be true.
> Yet if you need NSStringMethods you would do
>
> s = NSString.stringWithString_("ladieda")
> s.nsstring.someNSStringMethod()

If you're going to go that far, why not define your unicode subclass in such a way that s.someNSStringMethod() works?

--
David Eppstein
UC Irvine Dept. of Information & Computer Science
epp...@ic...
http://www.ics.uci.edu/~eppstein/
From: Just v. R. <ju...@le...> - 2003-02-07 15:55:02
|
David Eppstein wrote:

> > Yet if you need NSStringMethods you would do
> >
> > s = NSString.stringWithString_("ladieda")
> > s.nsstring.someNSStringMethod()
>
> If you're going to go that far, why not define your unicode subclass
> in such a way that s.someNSStringMethod() works?

Not entirely sure, but I can think of a couple of reasons:

- to make it explicit you're using an actual NSString. This is quite important in the case of methods that modify the string, as the Python string will _not_ reflect the changes in my proposal.

- to simplify the implementation. (But then again, with a proper __getattr__ function this is easily solved.)

I kind of like the explicit separation between the two objects, but I won't make a huge deal out of it if people insist on merging them this way.

Just
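The `__getattr__` route mentioned above can be sketched as follows. This is an illustration only, assuming Python 3 (`str` in place of `unicode`); `FakeMutableString` and `ForwardingString` are hypothetical names, not PyObjC classes.

```python
class FakeMutableString:
    """Hypothetical stand-in for a bridged NSMutableString."""

    def __init__(self, text):
        self._chars = list(text)

    def appendString_(self, more):
        self._chars.extend(more)

    def length(self):
        return len(self._chars)


class ForwardingString(str):
    """str subclass that forwards unknown attributes to the native object."""

    def __new__(cls, native):
        self = super().__new__(cls, "".join(native._chars))
        self.nsstring = native
        return self

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so real str methods win.
        if name == "nsstring":
            raise AttributeError(name)  # avoid recursion if it was never set
        return getattr(self.nsstring, name)


s = ForwardingString(FakeMutableString("ladieda"))
s.appendString_("!")   # forwarded; silently mutates only the native object

print(s.upper())    # LADIEDA  (an ordinary str method)
print(len(s))       # 7        (Python snapshot)
print(s.length())   # 8        (native object)
```

The last two lines show the first objection in practice: with forwarding, nothing in the call site signals that `len(s)` and `s.length()` now refer to different contents.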
From: <bb...@ma...> - 2003-02-07 16:07:35
|
On Friday, Feb 7, 2003, at 10:54 US/Eastern, Just van Rossum wrote:
>> If you're going to go that far, why not define your unicode subclass
>> in such a way that s.someNSStringMethod() works?

Unless the class in Python is truly a subclass of NSString/NSMutableString, access to the NSString methods is likely best left to going through one extra level of indirection for the reasons that Just describes.

Personally, I have no particular problem with NSString.someRandomMethod(fooWrappyUnicodyStringThingy) either.

As much as we need to preserve identity across the bridge and avoid copying as much as humanly possible for performance reasons-- strings are used *a lot*-- the differing semantics of PyString vs. NSString are such that making it highly visible in code that you are expecting a string to behave like one or the other is probably a good thing.

b.bum
From: Jack J. <Jac...@cw...> - 2003-02-07 16:38:47
|
On Friday, Feb 7, 2003, at 11:14 Europe/Amsterdam, Just van Rossum wrote:

> I've been doing some thinking, and now have some doubts that "fixing"
> Python to allow mutable dict keys will solve everything. Biggest
> problem: if we wrap an NSMutableString in a (mutable) unicode subclass,
> how are we going to keep the two strings synchronized?
>
> Here's an idea for a compromise that might work and yet be convenient
> to work with from Python in the majority of cases.
>
> Problem:
>
> - We need NSStrings to work like Python strings as _much_ as possible.
>
> - We need access to the methods of the NSString, or to put it in other
> words, we need to have full access to the native object, _especially_
> if it's mutable.

I've been thinking about it long and hard, but I don't see why the second assertion is true. And my feeling is that all the complexity comes from that assertion.

If anything returning an NSString would (on the Python side) return a "NSString_Or_NSMutableString_Acting_Like_a_Python_String" object (NONALPS for short) then I think the requirements would be:

- A NONALPS behaves as much like a Python string as possible.
- A NONALPS can be passed where an NSString is expected, and then object identity of the original NS{Mutable}String is preserved on the ObjC side.
- (corollary of the previous): a NONALPS can be cast to an NSString, whereby you get the original object, plus access to all the methods.
- of course a NONALPS can be cast to a Python string, but this doesn't do any identity preserving (which isn't needed anyway).

Note that I think we should specifically *not* allow passing a NONALPS where an NSMutableString is expected, or returning any NSMutableString coming from ObjC as a NONALPS. Mutable strings are un-Pythonic beasts, and the programmer should be aware of that.

The only mutable strings that we want to represent as Pythonic string lookalikes are those returned in places where the promise was that we would get an NSString. I think it's a safe assumption that the code returning NSMutableString instead of NSString will at least have the common decency not to modify the contents behind our back.

--
Jack Jansen, <Jac...@cw...>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman
From: Ronald O. <ous...@ci...> - 2003-02-07 18:08:04
|
On Friday, Feb 7, 2003, at 17:38 Europe/Amsterdam, Jack Jansen wrote:

> On Friday, Feb 7, 2003, at 11:14 Europe/Amsterdam, Just van Rossum
> wrote:
>
>> I've been doing some thinking, and now have some doubts that "fixing"
>> Python to allow mutable dict keys will solve everything. Biggest
>> problem: if we wrap an NSMutableString in a (mutable) unicode
>> subclass, how are we going to keep the two strings synchronized?
>>
>> Here's an idea for a compromise that might work and yet be convenient
>> to work with from Python in the majority of cases.
>>
>> Problem:
>>
>> - We need NSStrings to work like Python strings as _much_ as possible.
>>
>> - We need access to the methods of the NSString, or to put it in other
>> words, we need to have full access to the native object, _especially_
>> if it's mutable.
>
> I've been thinking about it long and hard, but I don't see why the
> second assertion is true. And my feeling is that all the complexity
> comes from that assertion.

Given the size of the NSString API I'd assume that at least one method is useful for Python programmers :-). Those can already be accessed using unbound methods (NSString.fooMethod(strVal)).

The current problems seem to result from the following assertions:

* Copying strings is too expensive

  I'd like to see some figures here, creating proxy objects also has a cost and depending on the size of strings creating a proxy might not be significantly faster than building a 'foreign' string object.

* NSMutableString should be visible in Python as a mutable object

  No objections here, some APIs won't work until we fix this.

* The identity (id()) of strings is significant

  I have no idea whether this is true or not. Note that it would be easy to make sure that when an instance of NSString is passed to Python through two paths the two "proxies" have the same identity. This is already true for ordinary objects and only requires that strings/unicode objects can have weakrefs.

BTW. This whole discussion reminds me of a similar discussion related to NSArray/NSDictionary and list/tuple/dict. At the time it was suggested to create subclasses of list/dict to represent NSArray/NSDictionary. This seemed like a good idea (code testing for isinstance(foobar, list) would accept NSArray instances), but we soon discovered that the implementation of Python assumes it can poke in the actual datastructure, which of course wouldn't be very helpful.

The best idea I've seen so far for the path from Objective-C to Python is subclassing str or unicode. I think this would work fine for immutable strings, but doing this for mutable strings might be problematic, unless we can somehow arrange that the internal representation of the NSMutableString and our unicode subclass are the same. I have serious doubts on the feasibility of this, but I wouldn't mind if someone surprised me by providing a working implementation ;-) ;-)

> The only mutable strings that we want to represent as Pythonic string
> lookalikes are those returned in places where the promise was that we
> would get an NSString.

The only way to detect that the API promises to return an (immutable) NSString is by parsing header files, as far as the Objective-C runtime (and therefore PyObjC) is concerned all classes are the same when mentioned in method signatures.

> I think it's a safe assumption that the code returning NSMutableString
> instead of NSString will at least have the common decency not to modify
> the contents behind our back.

I sure hope so ;-)

Another BTW: We currently try to convert the NSString to ASCII before building a unicode object when going from Objective-C to Python, if performance is an issue we should at least do away with that piece of code and just always convert NSString objects to unicode.

Ronald, on the other end of a 10-foot stick ;-)
From: <bb...@ma...> - 2003-02-07 18:24:09
|
On Friday, Feb 7, 2003, at 13:07 US/Eastern, Ronald Oussoren wrote:
>> The only mutable strings that we want to represent as Pythonic string
>> lookalikes are those returned in places where the promise was that we
>> would get an NSString.
> The only way to detect that the API promises to return an (immutable)
> NSString is by parsing header files, as far as the Objective-C runtime
> (and therefore PyObjC) is concerned all classes are the same when
> mentioned in method signatures.

This won't work for methods that are not advertised in public API (but, of course, the developer is on their own at that point anyway) and will require third party developers to parse their headers before using PyObjC with their frameworks/code.

It would also stick a serious wrench in the whole 'development environment on machine without dev tools installed' concept.

>> I think it's a safe assumption that the code returning
>> NSMutableString instead of NSString will at least have the common
>> decency not to modify the contents behind our back.
> I sure hope so ;-)
>
> Another BTW: We currently try to convert the NSString to ASCII before
> building a unicode object when going from Objective-C to Python, if
> performance is an issue we should at least do away with that piece of
> code and just always convert NSString objects to unicode.

If we preserve identity then, in theory, this issue would be greatly reduced in that the conversion would only happen the first time the object crosses the bridge?

I have no problem with converting everything to unicode unless it is particularly problematic to python developers....

b.bum
From: Ronald O. <ous...@ci...> - 2003-02-07 19:26:25
|
On Friday, Feb 7, 2003, at 19:23 Europe/Amsterdam, bb...@ma... wrote:

> On Friday, Feb 7, 2003, at 13:07 US/Eastern, Ronald Oussoren wrote:
>
>>> The only mutable strings that we want to represent as Pythonic string
>>> lookalikes are those returned in places where the promise was that
>>> we would get an NSString.
>> The only way to detect that the API promises to return an (immutable)
>> NSString is by parsing header files, as far as the Objective-C
>> runtime (and therefore PyObjC) is concerned all classes are the same
>> when mentioned in method signatures.
>
> This won't work for methods that are not advertised in public API
> (but, of course, the developer is on their own at that point anyway)
> and will require third party developers to parse their headers before
> using PyObjC with their frameworks/code.
>
> It would also stick a serious wrench in the whole 'development
> environment on machine without dev tools installed' concept.

I probably should have added some smileys here, I definitely didn't want to imply that we should parse header files.

>>> I think it's a safe assumption that the code returning
>>> NSMutableString instead of NSString will at least have the common
>>> decency not to modify the contents behind our back.
>> I sure hope so ;-)
>>
>> Another BTW: We currently try to convert the NSString to ASCII before
>> building a unicode object when going from Objective-C to Python, if
>> performance is an issue we should at least do away with that piece of
>> code and just always convert NSString objects to unicode.
>
> If we preserve identity then, in theory, this issue would be greatly
> reduced in that the conversion would only happen the first time the
> object crosses the bridge?

Preserving identity both might be hard to do without introducing garbage (in the memory management sense). This may not be relevant for strings, but I ran into this when thinking about a solution for the problems we're having with using Python objects as the model for an NSOutlineView. Preferably you'd keep the OC_PythonObject proxy alive as long as the Python object. Obviously the Python object must be alive as long as the OC_PythonObject proxy is. If you combine the two you get immortal objects :-(

Ronald
From: <bb...@ma...> - 2003-02-07 19:47:10
|
[cc'ing Guido because this is all part of the reasoning behind an earlier 2.3 'change request' asking if it would be possible to add weakref support to <string> and <unicode>]

On Friday, Feb 7, 2003, at 14:25 US/Eastern, Ronald Oussoren wrote:
> Preserving identity both might be hard to do without introducing
> garbage (in the memory management sense). This may not be relevant for
> strings, but I ran into this when thinking about a solution for the
> problems we're having with using Python objects as the model for an
> NSOutlineView. Preferably you'd keep the OC_PythonObject proxy alive
> as long as the Python object. Obviously the Python object must be
> alive as long as the OC_PythonObject proxy is. If you combine the two
> you get immortal objects :-(

If weakrefs were supported by <unicode> and <string>, that could help... or do we only get the callback-upon-finalize when it is too late to "save" the object from destruction?

I don't think it would matter [taking advantage of the immutability of Python strings and the general lack of importance of the identity of a string on the python side].

- on ObjC side, everything revolves around retain/release. If retain count drops to zero and the object is deallocated [there are hooks to catch this at a low level -- don't know about viability, should look in source], then the ObjC side can be deallocated because there is [should be!] no longer a viable reference to the string from the ObjC side. If the string is subsequently 'rebridged', it doesn't matter.

- on Python side, if the reference count drops to zero, the same thing can happen. We just need a hook to remove the association between the Python <string>/<unicode> and the NSString instance in the bridge.

Another way to phrase this:

Like we can transition an object from being present in one runtime to being present in both runtimes, we need a way to undo that association. Once the reference count for an object drops to zero on either side of the bridge, the bridging for that object can be removed -- the object returns to only be accessible on one side of the bridge.

Clearly, the transition from bridged to unbridged may not always be so straightforward. If either side is using the internal backing store of the other side, the act of 'unbridging' where the backing store is about to be destroyed will have to cause the backing store to either be moved or duplicated to the other side of the bridge.

For the [very broken, but a potentially right direction] implementation of OC_PythonString, having it transition from using the PyString backing store to an entire copy contained within an NSString or CFString would be trivial *assuming we can receive notification that the PyString object is about to be deallocated and before its backing store has been invalidated*. When this happens, the PyString reference could be nullified-- indicating that the OC_PythonString now lives entirely on the ObjC side of the world. Rebridging is simply a matter of creating a new PyString/PyUnicode reference and passing it off to Python.

Going the other way *sounds* like it would be more feasible with unicode objects than it would with string objects in that unicode objects have their backing store as a slot whereas strings are all-in-one. When the OC_PythonString is -dealloc'd, it could copy its contents into freshly malloc'd backing store for the PyUnicode object...

NSString -> Python bridging is slightly more difficult in that the OC_PythonString-like functionality cannot be implemented as a subclass of NSString. It could be implemented as a subclass of unicode, I suppose? Also, we would have to verify that the...

    /usr/include/objc/objc-runtime.h:OBJC_EXPORT id (*_dealloc)(id);

... hook works as expected (and pay the price for being the ones to override it -- we are hosed if someone else overrides it). The alternative is to override/swizzle NSObject's -dealloc method to do what we need. That doesn't excite me much either.

Too bad ObjC doesn't have some kind of a weakref-with-notifier concept....

b.bum
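The "hook to remove the association" on the Python side can be sketched with a weakref callback. This is only an illustration under stated assumptions: Python 3, a `str` subclass (`PyText`, hypothetical) because plain `str` does not accept weak references, and a toy `Bridge` class that is not part of PyObjC. It also bears on the question raised above about timing: as documented for the `weakref` module, by the time the callback runs the referent is no longer available, so the callback can drop the association but cannot read the string's contents.

```python
import weakref


class PyText(str):
    """str subclass; unlike plain str, its instances accept weak references."""


class Bridge:
    """Toy association table between Python strings and native objects."""

    def __init__(self):
        self._native = {}    # id(py_string) -> native object
        self._refs = {}      # id(py_string) -> weakref (kept so the callback fires)
        self.unbridged = []  # native objects that now live on one side only
        self.referent_in_callback = "callback not called"

    def bridge(self, py_string, native):
        key = id(py_string)

        def on_finalize(ref, key=key):
            # The weakref is already dead here: ref() returns None.
            self.referent_in_callback = ref()
            del self._refs[key]
            self.unbridged.append(self._native.pop(key))

        self._native[key] = native
        self._refs[key] = weakref.ref(py_string, on_finalize)

    def is_bridged(self, py_string):
        return id(py_string) in self._native


bridge = Bridge()
text = PyText("ladieda")
bridge.bridge(text, "<native ladieda>")
print(bridge.is_bridged(text))       # True

del text                             # last Python reference goes away
print(bridge.unbridged)              # ['<native ladieda>'] on CPython
print(bridge.referent_in_callback)   # None
```

So a design that needs to copy the Python backing store before it is invalidated would have to take that copy earlier, for example when the string is first bridged; the callback alone arrives too late for that.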
From: Just v. R. <ju...@le...> - 2003-02-07 20:30:12
|
Bill,

We're making this way too hard for ourselves. This weakref idea is an attempt to try to hide a flaw in Cocoa (that some objects don't incref an object while still storing a reference; it's a poor man's weak ref scheme and it sucks). It will be hard to do right. It is easy to work around in Python code. Let's make it a FAQ and move on.

Just
From: <bb...@ma...> - 2003-02-07 20:53:59
|
On Friday, Feb 7, 2003, at 15:29 US/Eastern, Just van Rossum wrote:
> We're making this way too hard for ourselves. This weakref idea is an
> attempt to try to hide a flaw in Cocoa (that some objects don't incref
> an object while still storing a reference; it's a poor man's weak ref
> scheme and it sucks). It will be hard to do right. It is easy to work
> around in Python code. Let's make it a FAQ and move on.

Huh?

The weakref idea was to take advantage of the callback that happens when the object is finalized such that the object could be 'unbridged'. If we can receive notification from Python of when the Python side of the bridge is done with the PyString, the bridging of the NSString<->PyString can be broken, the NSString side of things can live on in its unbridged state, and nothing gets lost.

I hadn't remotely considered the situation where some Cocoa objects don't implement the concept of 'weakref' as 'just grab a pointer and don't tell anyone or anything about it'.

b.bum
From: Guido v. R. <gu...@py...> - 2003-02-07 21:03:42
|
Folks, I have no bandwidth left this week. Sorry. Maybe when you need an *action* from me, Just can send a short note? (I notice Bill is a bit more verbose than Just. I read short notes quicker. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) |
From: Ronald O. <ous...@ci...> - 2003-02-07 21:21:22
|
On Friday, Feb 7, 2003, at 21:53 Europe/Amsterdam, bb...@ma... wrote:

> On Friday, Feb 7, 2003, at 15:29 US/Eastern, Just van Rossum wrote:
>> We're making this way too hard for ourselves. This weakref idea is an
>> attempt to try to hide a flaw in Cocoa (that some objects don't incref
>> an object while still storing a reference; it's a poor man's weak ref
>> scheme and it sucks). It will be hard to do right. It is easy to work
>> around in Python code. Let's make it a FAQ and move on.
>
> Huh?
>
> The weakref idea was to take advantage of the callback that happens
> when the object is finalized such that the object could be
> 'unbridged'. If we can receive notification from Python of when the
> Python side of the bridge is done with the PyString, the bridging of
> the NSString<->PyString can be broken, the NSString side of things can
> live on in its unbridged state, and nothing gets lost.

I'm not entirely sure what you mean here. Do you want to create a NSString subclass for proxying Python strings that somehow goes in 'native' mode if Python code no longer references the string? I think that this adds unnecessarily complicated code to the bridge.

Is someone working on the subclass of unicode as proposed by Just? Having an initial version of this would get us going forward again, the current discussion seems to center around misunderstandings :-).

Ronald
From: Just v. R. <ju...@le...> - 2003-02-07 21:32:05
|
bb...@ma... wrote:

> Huh?

Sorry, I've assumed a different context. Although?

> The weakref idea was to take advantage of the callback that happens
> when the object is finalized such that the object could be
> 'unbridged'.

What is 'unbridged'? Here's how I see how it should work:

                       |
    runtime A          |   runtime B
                       |
    nativeObject ------> wrappedObject (contains a strong ref to
                       |                nativeObject)
                       |

I see no need for nativeObject to have a ref to the wrapped object, so there are no cycles and no need for weak refs. The only place where this goes wrong is when runtime B stores a reference without retaining it, hence my assumption.

> If we can receive notification from Python of when the
> Python side of the bridge is done with the PyString, the bridging of
> the NSString<->PyString can be broken, the NSString side of things
> can live on in its unbridged state, and nothing gets lost.

This seems awfully convoluted and error prone. Can you clarify why this is needed?

Btw. I noticed PyObjC tries to sync the refcounts for both objects in class-builder.m's object_method_retain() and object_method_release() calls. Why is this needed?

Just
From: Ronald O. <ous...@ci...> - 2003-02-07 21:45:58
|
On Friday, Feb 7, 2003, at 22:31 Europe/Amsterdam, Just van Rossum wrote:
> Btw. I noticed PyObjC tries to sync the refcounts for both objects in
> class-builder.m's object_method_retain() and object_method_release()
> calls. Why is this needed?

This is for Python subclasses of Objective-C classes (and only those). Such instances have two parts, an Objective-C object and a Python object, that behave as if there is 1 object. Using 1 refcount for both parts is IMHO the only feasible way to ensure that we don't end up with only one of the parts and that these objects are not immortal.

Getting this right was one of the hardest elements of my rewrite of PyObjC, and something that really should be documented...

Ronald
From: Just v. R. <ju...@le...> - 2003-02-07 22:00:16
|
Ronald Oussoren wrote:

> On Friday, Feb 7, 2003, at 22:31 Europe/Amsterdam, Just van Rossum
> wrote:
> > Btw. I noticed PyObjC tries to sync the refcounts for both objects
> > in class-builder.m's object_method_retain() and
> > object_method_release() calls. Why is this needed?
>
> This is for Python subclasses of Objective-C classes (and only those).
> Such instances have two parts, an Objective-C object and a Python
> object, that behave as if there is 1 object. Using 1 refcount for both
> parts is IMHO the only feasible way to ensure that we don't end up
> with only one of the parts and that these objects are not immortal.
>
> Getting this right was one of the hardest elements of my rewrite of
> PyObjC, and something that really should be documented...

For lack of documentation <wink>, I hope you don't mind me asking the following, because I must be missing something deep:

Why are there _two_ relevant objects instead of just one being the "master"? Without looking at the code I would have assumed it would work like this:

  a) a native ObjC object
  b) a thin Python wrapper, only containing a ref to the ObjC objects

The ObjC object could have a __dict__ ivar that can store arbitrary attributes. The wrapper can be reconstructed from the native object at any time, and the native object may live longer than the wrapper if the ObjC runtime retains it.

Apart from this not guaranteeing object identity on the Python side, as well as having to recreate the wrapper when the object enters Python again, I don't see any immediate flaw. An optimization would be to store a weakref to the wrapper in the native object to avoid reconstruction of the wrapper. But it's just that: an optimization.

Just
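The "native object is the master, wrapper is thin and disposable" layout described above can be sketched in pure Python. This is a toy model under stated assumptions, not how PyObjC works: `NativeObject`, `Wrapper` and `wrap` are hypothetical names, the `py_attrs` dict plays the role of the `__dict__` ivar, and the weak reference stored on the native object is the optional optimization mentioned at the end of the message.

```python
import weakref


class NativeObject:
    """Hypothetical stand-in for an ObjC instance owned by the other runtime."""

    def __init__(self):
        self.py_attrs = {}       # plays the role of the `__dict__` ivar
        self.wrapper_ref = None  # optional weak ref to the current wrapper


class Wrapper:
    """Thin Python-side wrapper: holds only a strong ref to the native object."""

    def __init__(self, native):
        object.__setattr__(self, "_native", native)

    def __getattr__(self, name):
        if name == "_native":
            raise AttributeError(name)  # avoid recursion if never initialised
        try:
            return self._native.py_attrs[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # All state lives in the native object, so wrappers are disposable.
        self._native.py_attrs[name] = value


def wrap(native):
    """Reuse the live wrapper if there is one, otherwise build a new one."""
    ref = native.wrapper_ref
    wrapper = ref() if ref is not None else None
    if wrapper is None:
        wrapper = Wrapper(native)
        native.wrapper_ref = weakref.ref(wrapper)
    return wrapper


native = NativeObject()
w = wrap(native)
w.title = "outline"
print(wrap(native) is w)    # True: the cached wrapper is reused

del w                       # the wrapper may now be collected
print(wrap(native).title)   # outline: state survived in the native object
```

Since the only strong reference points from the wrapper to the native object, there is no cycle across the two sides; the weak back-reference only saves rebuilding the wrapper.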
From: Ronald O. <ous...@ci...> - 2003-02-07 22:24:55
|
On Friday, Feb 7, 2003, at 22:59 Europe/Amsterdam, Just van Rossum wrote:

> Ronald Oussoren wrote:
>
>> On Friday, Feb 7, 2003, at 22:31 Europe/Amsterdam, Just van Rossum
>> wrote:
>>> Btw. I noticed PyObjC tries to sync the refcounts for both objects
>>> in class-builder.m's object_method_retain() and
>>> object_method_release() calls. Why is this needed?
>>
>> This is for Python subclasses of Objective-C classes (and only those).
>> Such instances have two parts, an Objective-C object and a Python
>> object, that behave as if there is 1 object. Using 1 refcount for both
>> parts is IMHO the only feaseable way to ensure that we don't end up
>> with only one of the parts and that these objects are not immortal.
>>
>> Getting this right was one of the hardest elements of my rewrite of
>> PyObjC, and something that really should be documented...
>
> For lack of documentation <wink>, I hope you don't mind me asking the
> following, because I must be missing something deep:
>
> Why are there _two_ relevant objects instead of just one being the
> "master"? Without looking at the code I would have assumed it would
> work like this:
>   a) a native ObjC object
>   b) a thin Python wrapper, only containing a ref to the ObjC objects

That might also work. I wanted NSObject to behave as much as possible like a 'proper' new-style class (including __slots__), without copying code from the python implementation.

It might be worthwhile to reexamine this issue after we get strings working properly.

Ronald
|
From: Just v. R. <ju...@le...> - 2003-02-07 23:21:42
|
Ronald Oussoren wrote:

> > Why are there _two_ relevant objects instead of just one being the
> > "master"? Without looking at the code I would have assumed it would
> > work like this:
> >   a) a native ObjC object
> >   b) a thin Python wrapper, only containing a ref to the ObjC objects
>
> That might also work. I wanted NSObject to behave as much as possible
> like a 'proper' new-style class (including __slots__), without copying
> code from the python implementation.

Ah, I see what you mean: this approach would force you to reimplement much of the __slots__ logic for an NSObject. If we would go this route, I would propose to drop support for __slots__ in NSObject subclasses. Its purpose is twofold:

  1) to define the set of names that can be used as attributes, causing typos in attr names to be caught earlier (or at all...)
  2) to save space in the object as it then doesn't need a separate __dict__ object; the attr refs live in the allocated object itself

I don't care much for #2, but #1 is useful. I could live without it, though.

Hm, here's an idea (that probably only complicates things, so don't take it too seriously ;-): could the __slots__ definition be mapped to ivars in the native object? That would actually be pretty cool, except it doesn't save space, since all attrs need to be wrapped in OC_PythonObject instances... Ha, this would make the following two classes equivalent:

    class Foo(NSObject):
        a = objc.ivar("a")
        b = objc.ivar("b")

    class Bar(NSObject):
        __slots__ = ("a", "b")

> It might be worthwhile to reexamine this issue after we get strings
> working properly.

Definitely not before 0.9!

Just
|
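The two purposes of __slots__ listed above can be seen with ordinary Python classes, independent of PyObjC. The sketch below is a minimal illustration; `WithSlots`, `WithoutSlots` and `try_typo` are made-up names, and nothing here involves NSObject or objc.ivar.

```python
class WithSlots:
    # Only these attribute names are allowed; no per-instance __dict__.
    __slots__ = ("a", "b")


class WithoutSlots:
    # Ordinary class: every instance carries its own __dict__.
    pass


def try_typo(obj):
    """Assign to a misspelled attribute name and report what happened."""
    try:
        obj.aa = 1  # typo for "a"
    except AttributeError:
        return "rejected"
    return "accepted"
```

With `__slots__` the misspelled assignment raises AttributeError (purpose #1), and instances have no `__dict__` at all (purpose #2). Without it the typo silently creates a new attribute.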
From: Just v. R. <ju...@le...> - 2003-02-07 19:52:16
|
Ronald Oussoren wrote:

> but I ran into this when thinking about a solution for the
> problems we're having with using Python objects as the model for an
> NSOutlineView.

Right, so this was indeed the reason you mentioned weakrefs... It's a tricky issue indeed, but I don't think it should be given a high priority: it's easy to work around, and should be a FAQ once we have one. We have bigger fish to fry at the moment...

Just
|
From: Just v. R. <ju...@le...> - 2003-02-07 19:48:35
|
[Just]
> >> Problem:
> >>
> >> - We need NSStrings to work like Python strings as _much_ as
> >> possible.
> >>
> >> - We need access to the methods of the NSString, or to put it in
> >> other words, we need to have full access to the native object,
> >> _especially_ if it's mutable.

[Jack]
> > I've been thinking about it long and hard, but I don't see why the
> > second assertion is true. And my feeling is that all the complexity
> > comes from that assertion.

Access to mutable strings is necessary for APIs that deal explicitly with mutable strings. I don't care much about having access to the NSString methods for immutable strings (I'm quite happy with how Python string methods work), but due to the "fragile immutability syndrome" it makes no sense to expose the NSMutableString methods only when the string is immutable.

[Ronald]
> Given the size of the NSString API I'd assume that at least one method
> is useful for Python programmers :-).

Probably...

> Those can already be accessed
> using unbound methods (NSString.fooMethod(strVal)).

But this doesn't work for mutable strings...

> The current problems seem to result from the following assertions:
> * Copying strings is too expensive
> I'd like to see some figures here, creating proxy objects also has a
> cost and depending on the size of strings creating a proxy might not
> be significantly faster than building a 'foreign' string object.

Yeah, I don't care about this either until it's _proven_ to be a bottleneck. "Premature optimization is the root of" etc.

> * NSMutableString should be visible in Python as a mutable object
> No objections here, some APIs won't work until we fix this.

As far as I'm concerned this is the only _real_ requirement.

> * The identity (id()) of strings is significant
> I have no idea whether this is true or not. Note that it would be
> easy to make sure that when an instance of NSString is passed to
> Python through two paths the two "proxies" have the same identity.
> This is already true for ordinary objects and only requires that
> strings/unicode objects can have weakrefs.

I'm not sure I'm following you. When going from ObjC -> Python -> ObjC, the Python representation of the string can safely hold a strong reference to the NSString. _This_ is the case where Bill claims object id is relevant. I still don't believe him, but it doesn't matter, because as soon as we keep the original NSString around this requirement is automatically met.

Going the other way, Python -> ObjC -> Python, there is _no_ requirement to keep the object id the same. So we can just _convert_ to NSString and forget about the original. So this will be an autoreleased object. Of course this gives problems when the receiver doesn't retain it yet does store a reference, but we have that problem anyway, and is easy to work around.

Am I missing something?

> BTW. This whole discussion reminds me of a similar discussion related
> to NSArray/NSDictionary and list/tuple/dict. At the time it was
> suggested to create subclasses of list/dict to represent
> NSArray/NSDictionary. This seemed like a good idea (code testing for
> isinstance(foobar, list) would accept NSArray instances) we soon
> discovered that the implementation of Python assumes it can poke in
> the actual datastructure, which of course wouldn't be very helpful.

Also: custom sequence objects are very common in Python, there are very few places where actual lists or tuples are required. Wrapping the original NSArray and supporting (a subset of) the sequence protocol is totally the right thing to do.

> The best idea I've seen so far for the path from Objective-C to Python
> is subclassing str or unicode. I think this would work fine for
> immutable strings, but doing this for mutable strings might be
> problematic, unless we can somehow arrange that the internal
> representation of the NSMutableString and our unicode subclass are the
> same.

I very much doubt this is possible (and if it is it will be a very complex and hairy implementation), hence my suggestion to simply punt at this issue. The PyObjC idiom for dealing with mutable strings would be this:

    s = someCallThatIsKnownToReturnAMutableString()
    # Toss the Python string, because it won't be sync'd with
    # the NSString, and is also of limited use to Python code
    # as using it as a dict key will not work as expected
    s = s.nsstring

> I have serious doubts on the feasability(sp?) of this, but I
> wouldn't mind if someone surprised me by providing a working
> implementation ;-) ;-)

Let's keep it simple, and live with the simple wart-by-design that the Python representation will _not_ be kept the same as the underlying NSMutableString.

> Another BTW: We currently try to convert the NSString to ASCII before
> building a unicode object when going from Objective-C to Python, if
> performance is an issue we should at least do away with that piece of
> code and just always convert NSString objects to unicode.

+1

Just
|
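The compromise discussed in this message (an immutable Python string that is a snapshot of the native string, plus an attribute that hands back the native object) can be sketched as follows. This is a hedged illustration, not PyObjC's implementation: `FakeMutableString` is a hypothetical stand-in for NSMutableString, `BridgedString` is a made-up name, and Python 3's `str` is used where the thread says `unicode`.

```python
class FakeMutableString:
    """Hypothetical stand-in for NSMutableString (not a PyObjC class)."""

    # The pure wrapper is deliberately unhashable, so it cannot be a dict key.
    __hash__ = None

    def __init__(self, text):
        self._chars = list(text)

    def appendString_(self, more):
        # Mutates the native string in place.
        self._chars.extend(more)

    def description(self):
        return "".join(self._chars)


class BridgedString(str):
    """Immutable snapshot of a native string, with access to the original."""

    def __new__(cls, native):
        # Convert: copy the current contents into an ordinary Python string.
        self = super().__new__(cls, native.description())
        # Wrap: keep a reference to the native object for its own methods.
        self.nsstring = native
        return self
```

After `s = BridgedString(native)`, mutating `s.nsstring` leaves `s` unchanged, so `s` stays safe as a dict key, while code that needs the mutable API rebinds `s = s.nsstring` as in the idiom above.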
From: Ronald O. <ous...@ci...> - 2003-02-07 20:14:41
|
On Friday, Feb 7, 2003, at 20:48 Europe/Amsterdam, Just van Rossum wrote:

>
>> Those can already be accessed
>> using unbound methods (NSString.fooMethod(strVal)).
>
> But this doesn't work for mutable strings...

Not currently, but if we'd stuff a reference to the NSString in a unicode object we could get this to work. But using your suggestion of explicitly dropping the python object and using the Objective-C object directly is probably better.

>
>> The current problems seem to result from the following assertions:
>> * Copying strings is too expensive
>> I'd like to see some figures here, creating proxy objects also has a
>> cost and depending on the size of strings creating a proxy might not
>> be signifantly faster than building a 'foreign' string object.
>
> Yeah, I don't care about this either until it's _proven_ to be a
> bottleneck. "Premature optimization is the root of" etc.

I couldn't agree more.

>
>> * NSMutableString should be visible in Python as a mutable object
>> No objections here, some APIs won't work until we fix this.
>
> As far as I'm concerned this is the only _real_ requirement.
>
>> * The identity (id()) of strings is significant
>> I have no idea whether this is true or not. Note that it would be
>> easy to make sure that when an instance of NSString is passed to
>> Python through two paths the two "proxies" have the same identity.
>> This is already true ordinary objects and only requires that
>> strings/unicode objects can have weakrefs.
>
> I'm not sure I'm following you. When going from ObjC -> Python -> ObjC,
> the Python representation of the string can safely hold a strong
> reference to the NSString. _This_ is the case where Bill claims object
> id is relevant. I still don't believe him, but it doesn't matter,
> because as soon as we keep the original NSString around this
> requirement is automatically met.

What I meant is that if you get hold of the same ObjC object through two different method calls the proxies will have the same object id. We currently don't use this mechanism for strings. If object identity really is important in Cocoa that should be changed. That only covers ObjC -> Python, going back to ObjC requires more work.

>
> Going the other way, Python -> ObjC -> Python, there is _no_
> requirement to keep the object id the same. So we can just _convert_
> to NSString and forget about the original. So this will be an
> autoreleased object. Of course this gives problems when the receiver
> doesn't retain it yet does store a reference, but we have that problem
> anyway, and is easy to work around.

And the sad thing is that it is easy to keep the object id the same when going Python -> ObjC -> Python. OC_Python{Object,Array,Dictionary} already do it for 'plain' objects, lists and dicts. Adding a proper OC_PythonString (subclassing from NSString) would be easy.

>
>> The best idea I've seen so far for the path from Objective-C to Python
>> is subclassing str or unicode. I think this would work fine for
>> immutable strings, but doing this for mutable strings might be
>> problematic, unless we can somehow arrange that the internal
>> representation of the NSMutableString and our unicode subclass are the
>> same.
>
> I vey much doubt this is possible (and if it is it will be a very
> complex and hairy implementation), hence my suggestion to simply punt
> at this issue. The PyObjC idiom for dealing with mutable strings would
> be this:
>
>     s = someCallThatIsKnownToReturnAMutableString()
>     # Toss the Python string, because it won't be sync'd with
>     # the NSString, and is also if limited use to Python code
>     # as using it as a dict key will not work as expected
>     s = s.nsstring

According to our coding style this should be 's = s.pyobjc_nsstring', but otherwise I agree.

>
>> I have serious doubts on the feasability(sp?) of this, but I
>> wouldn't mind if someone surprised me by providing a working
>> implementation ;-) ;-)
>
> Let's keep it simple, and live with the simple wart-by-design that the
> Python representation will _not_ be kept the same as the underlying
> NSMutableString.
>

That's fine with me.

Ronald
|
From: Just v. R. <ju...@le...> - 2003-02-07 20:25:45
|
Ronald Oussoren wrote:

> > Going the other way, Python -> ObjC -> Python, there is _no_
> > requirement to keep the object id the same. So we can just
> > _convert_ to NSString and forget about the original. So this will
> > be an autoreleased object. Of course this gives problems when the
> > receiver doesn't retain it yet does store a reference, but we have
> > that problem anyway, and is easy to work around.
>
> And the sad thing is that it is easy to keep the object id the same when
> going Python -> ObjC -> Python. OC_Python{Object,Array,Dictionary}
> already do it for 'plain' objects, lists and dicts. Adding a proper
> OC_PythonString (subclassing from NSString) would be easy.

But why bother?

> > s = someCallThatIsKnownToReturnAMutableString()
> > # Toss the Python string, because it won't be sync'd with
> > # the NSString, and is also of limited use to Python code
> > # as using it as a dict key will not work as expected
> > s = s.nsstring
>
> According to our coding style this should be 's = s.pyobjc_nsstring',
> but otherwise I agree.

Why the prefix? It's not a global name, it's a new attribute to an object with not all that many methods and attributes, so clashes aren't likely. It offers access to the underlying NSString, so .nsstring seems very intuitive to me.

Just
|