Re: [Pyobjc-dev] Retain / Release semantics
From: Bill B. <bb...@co...> - 2002-10-25 14:59:03
On Friday, October 25, 2002, at 05:22 AM, Ronald Oussoren wrote:
> On Thursday, Oct 24, 2002, at 23:40 Europe/Amsterdam, Bill Bumgarner
> wrote:
>> ... which is exactly a 1 for 1 correspondence to calls to alloc()
>> from within Python. I had been writing my code like...
>>
>> toolbarItem =
>> NSToolbarItem.alloc().initWithItemIdentifier_(anIdentifier)
>> toolbarItem.autorelease()
>>
>> ... but under the code implied by the -release above, I should not be
>> calling autorelease(). Makes sense.
> That was my idea: Conversion from Objective-C to Python implies
> removing all calls to 'retain', 'release' and 'autorelease'.
In general, this is great. I'm still concerned about the differences
in behavior that it implies when it only affects a handful of methods.
>> However, I'm not 100% sure that it is the correct pattern. I.e. do
>> we really want to make the bridge responsible for keeping track of
>> every single method that causes object allocation and do the -release
>> as the current code does?
> Yes, this makes the bridge as transparent as possible. Asking the
> programmer to *sometimes* worry about reference counts is confusing. I
> added the code above to PyObjC to make sure I never have to think
> about reference counts when programming in Python.
The developer still has to sometimes worry about reference counts and
has to do so in a fashion that doesn't come naturally [in my
experience].
We are really talking about a couple of different behaviors.
First, there is the behavior that an assignment or set membership
within Python implies a -retain. To balance the implied -retain,
removal from a set or destruction of the assignment implies a -release.
This seems to come naturally to most developers and is quite consistent
across the environment.
--
The second behavior is that any call to +alloc, +allocWithZone:, -copy,
-copyWithZone:, and -mutableCopyWithZone: implies a -release as the
object comes into the Python environment. This changes the meaning of
these methods in Python-- the developer can no longer look up the
documentation of the method and believe what Apple has to say!
Instead, the developer has to effectively raise a mental exception and
remember that these methods no longer mean what they used to mean.
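To make the two behaviors concrete, here is a toy model in plain Python. It is only a sketch of the policy as described in this message, not PyObjC's actual implementation: ToyObject, ToyProxy and simulate are hypothetical names, and the "pool" is just a list standing in for an autorelease pool.

```python
class ToyObject:
    """Stand-in for an Objective-C object; it only tracks a retain count."""

    def __init__(self):
        self.retain_count = 1      # +alloc returns an object the caller owns
        self.deallocated = False

    def retain(self):
        self.retain_count += 1

    def release(self):
        if self.deallocated:
            raise RuntimeError("release sent to a deallocated object")
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True


class ToyProxy:
    """Stand-in for the Python-side reference to a ToyObject."""

    def __init__(self, obj, from_alloc):
        self.obj = obj
        obj.retain()               # first behavior: holding it implies -retain
        if from_alloc:
            obj.release()          # second behavior: the bridge balances +alloc

    def drop(self):
        self.obj.release()         # the Python reference goes away


def simulate(call_autorelease):
    pool = []                      # toy autorelease pool
    obj = ToyObject()
    proxy = ToyProxy(obj, from_alloc=True)
    if call_autorelease:
        pool.append(obj)           # -autorelease: one more release, later
    for pending in pool:           # end of this pass through the event loop
        pending.release()
    try:
        proxy.drop()
    except RuntimeError:
        return "over-released"
    return "deallocated once" if obj.deallocated else "leaked"


print(simulate(call_autorelease=False))
print(simulate(call_autorelease=True))
```

Under these assumptions the Objective-C habit of adding autorelease() after alloc().init() sends one release too many, which matches the kind of failure described below.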
I'm [very likely] the first person to use this feature besides you and
look at the problems it caused me! I was expecting those methods to
behave exactly as they do in ObjC, but they didn't and the resulting
behavior was-- as is often the case with retain/release/autorelease
bugs-- incredibly difficult to track down and fix.
Even now that I'm familiar with the behavior, I still want to write
the same [[[... alloc] init] autorelease] idiom that is the defined
standard within Obj-C.
The implementation also makes the assumption that those are the only
methods that return retained instances of an object-- the only methods
that transfer ownership. It even mentions...
# These 5 are documented in Apple's Objective-C book, in theory these
# are the only methods that transfer ownership.
... but, already, there is a discrepancy! The +new method also
transfers ownership -- it is just a cover for [[... alloc] init].
Speaking of the +new method, a common design pattern is to implement
factory methods of the form +newWith... or +newBy... or +newFrom... or
+new* where the ... or * is replaced by some indication as to how the
newly created object should be configured. All of these methods
return alloc/init'd instances of the object and all transfer ownership
to the caller.
And what about the different forms of copy? -deepCopy and
-deepMutableCopy immediately come to mind. As well, I have seen (and
occasionally written) convenience methods that do things like copy the
contents of an NSImage and return a new NSImageRep that is -retain'd, or
copy the contents of an NSTextView and return a -retain'd
NSTextStorage.
Now the developer has to remember the 5 methods that are different [6,
really] and they have to remember that any other method that 'transfers
ownership' works the old way.
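The coverage gap can be sketched as a simple name lookup. This is an illustration only: the set below holds the five selectors named in this message, bridge_releases is a hypothetical helper rather than a PyObjC function, and "newWithName:" is a made-up instance of the +newWith... pattern.

```python
# The five selectors this message says the bridge special-cases.
BRIDGE_SPECIAL_CASED = {
    "alloc",
    "allocWithZone:",
    "copy",
    "copyWithZone:",
    "mutableCopyWithZone:",
}


def bridge_releases(selector):
    """True if the bridge (as described here) balances the returned reference."""
    return selector in BRIDGE_SPECIAL_CASED


# Selectors that this message says hand back a retained object.
# "newWithName:" is hypothetical; the others are named in the text.
ownership_transferring = [
    "alloc",
    "copy",
    "new",
    "newWithName:",
    "deepCopy",
    "deepMutableCopy",
    "copyWithConvertedValues",
]

missed = [s for s in ownership_transferring if not bridge_releases(s)]
print(missed)
```

Every selector in `missed` still returns an object that the Python caller would have to release by hand, while the five special-cased ones must not be released.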
> Due to a 'feature' of NSOutlineView I still have to worry about this,
> but that is a different story. In short: NSOutlineView doesn't
> 'retain' the results of outlineView:child:ofItem:, but does hang on to
> this value. This means you cannot use python objects as return values,
> because their proxies are autoreleased and NSOutlineView tries to
> access these proxies after they are released... IMNSHO this is a bug
> and I'm thinking of filing a bug report on this.
It expects the value to survive beyond the end of the current pass
through the main event loop? That's definitely broken.
I.e. this'll break it?
- outlineView:child:ofItem:
{
    return [NSString stringWithFormat: @"%d %d", 1, 2];
}
>> ... extra details deleted ...
>> This subtly, but significantly, changes the object lifespan patterns
>> pervasive throughout the foundation. While this particular example
>> is somewhat bogus, there are other situations where such a lifespan
>> change could cause a failure. Example: the developer creates a
>> class whose instances listen for a particular notification and remove
>> themselves as observers when dealloc'd. A common idiom for creation
>> might be MyChangeListener.alloc().init().autorelease(), but that
>> isn't possible with the current implementation of the bridge.
> 'a common idiom for creation MIGHT be': Is this really a common idiom?
> I'd rather not worry about theoretical changes. As far as I can judge
> right now the current policy of fully automatically managing reference
> counts is far more useful than manually updating them.
Common? No, but it is a pattern that I use and have encountered in
looking at other developers' code (several times throughout my career,
I have been in a developer support role).
I have been building developer tools for a long, long time -- one of
the lessons that has been beat into my thick skull over the years is
that I will never be able to predict how the community of developers
using my code is going to use that code. If I can imagine it, there
are probably 10 other things I didn't imagine that some developer will
try to do!
> BTW. If you really want to use autorelease you could always do
> 'obj.retain();obj.autorelease()'. This is different from the
> Objective-C way of doing this, but at least it is clear you're doing
> something fishy. And this is pretty fishy: In most programs 'at least
> as long as the current thread's current autorelease pool' is 'at least
> until the start of the next round through the event loop', which is
> probably a very short time.
Event loops aren't just for processing user events. In WebObjects
(which used to be an Obj-C app), one pass through the event loop
involved parsing and responding to an entire HTTP request. Now that
Web Services are all the rage, handling an HTTP request inside of a
Cocoa application is gaining in popularity and the most common way to
do so is using a similar one-request-per-pass-through-run-loop model
(this is exactly what I'm doing in two of my apps now).
Handling an HTTP request is a considerably more complex process than
handling a keystroke. To further compound the complexity, there is
often some kind of a persistent backing store that may have to be read
from and/or written to.
Instead of validating just one field of user input, an HTTP request may
have to validate a couple of dozen hunks of user input all at once.
If this goes through to a backing store of some kind, there are
typically several layers of delegated validation occurring during the
whole process.
Because there is often some concept of a session -- roughly equivalent
to a Document -- there are often change notifications flying around the
system and listeners of change notifications.
Run loops are also used to do other kinds of 'event' processing;
calculation engines, data gathering, timed event, background processes,
distributed objects, etc...
> BTW2. I really don't like 'invisible' objects like the
> MyChangeListener example. How am I as a code maintainer/reviewer to
> know that this is the intended behaviour and not an attempt to work
> around another bug (at least this doesn't leak memory).
The 'invisible' object is a design pattern found throughout the AppKit,
the Foundation, EOF, and a number of other object oriented kits.
NSNotificationCenter is at the heart of it. As something -- an
application, a thread, a whatever -- is initialized, various random
objects have the opportunity to act as notification listeners. The
objects posting the notifications don't have any clue what, if
anything, is listening for the notifications.
To put this into context; say, I create a bundleized app-- Project
Builder's PBX Bundles and Interface Builder's Palettes are both good
examples. Within that app, during a single pass through the run loop, I
post a notification of 'prepare to change object graph', followed by a
series of 'changing object graph' notifications and -- if everything went
well -- a 'done changing the object graph' notification. Now, during
one of the 'changing object graph' notifications, an exception is
raised and the 'done changing' notification never gets posted.
A bundle author may create a bundle that listens for the 'prepare'
notifications. When received, that bundle then creates a listener for
'changing' and 'done' notifications that is highly customized to the
userInfo dictionary contents passed in the 'prepare' notification.
In this case, the only way the object will be deallocated is if it is
in the autorelease pool!
(I ran into this situation when writing an IB palette recently)
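A toy model of that lifetime, in plain Python, may help. It is a sketch under stated assumptions, not Cocoa or PyObjC code: ToyCenter and ToyListener are hypothetical classes, the retain count is tracked by hand, and `pool` is a list standing in for the autorelease pool of the current pass through the run loop.

```python
class ToyCenter:
    """Toy notification center: it records observers but does not own them."""

    def __init__(self):
        self.observers = []

    def add(self, observer):
        self.observers.append(observer)

    def remove(self, observer):
        self.observers.remove(observer)

    def post(self, name):
        for observer in list(self.observers):
            observer.handle(name)


class ToyListener:
    """Listener that unregisters itself when its retain count reaches zero."""

    def __init__(self, center):
        self.center = center
        self.retain_count = 1      # as if freshly alloc/init'd
        self.seen = []
        center.add(self)

    def handle(self, name):
        self.seen.append(name)

    def release(self):
        self.retain_count -= 1
        if self.retain_count == 0:
            self.center.remove(self)   # what -dealloc would do


center = ToyCenter()
pool = []                              # toy autorelease pool


def on_prepare():
    # Nobody keeps a reference to the listener; the pool holds the only claim.
    pool.append(ToyListener(center))


on_prepare()
center.post("changing")                # the 'done' notification never arrives
observers_before_drain = len(center.observers)

for obj in pool:                       # end of this pass through the run loop
    obj.release()
pool.clear()
observers_after_drain = len(center.observers)

print(observers_before_drain, observers_after_drain)
```

In this model nothing but the pool ever releases the listener, so removing the autorelease step would leave it registered indefinitely.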
>> As well, there is no way we can ever quantify all possible methods
>> that a developer might use to produce a freshly allocated object that
>> the current code would need to -release to preserve the pattern as it
>> exists now.
> Yes we can. Apple clearly documents in the Objective-C reference
> manual that only a small number of methods should return objects that
> you don't have to 'retain' (basically alloc and copy). If you get an
> object in any other way you should call 'retain' yourself. As much as
> I dislike the idea of 'autorelease' it does solve the problem of
> object ownership (why couldn't Objective-C use a real garbage
> collector and do away with manual memory management?).
As indicated above, the documentation is wrong. As well, any
developer can come along and create new methods (the +new...: pattern
is actually quite useful) that the API will not be aware of.
The difference with the retain/release implied by assignment and set
membership is that it is ubiquitous to messaging through the bridge. Modifying the behavior
of a handful of methods is not ubiquitous and, as such, will never
achieve 100% coverage-- this will lead to confusion and bugs.
(Because C is a pointer based language, an Obj-C garbage collector--
there are a few available-- can never achieve 100% coverage....)
> If a developer writes classes that don't follow this convention that
> is a bug, plain and simple. Such code is even a problem for plain
> Objective-C users: Given the text in the Objective-C manual users will
> expect they have to retain the result of method calls, a method that
> does not follow this convention is confusing and is bound to introduce
> memory leaks.
>
> As should be obvious by now I am not convinced at all that the current
> policy of PyObjC is wrong. I'd like to see a real-live example of why
> the current policy is bad before I change my position.
Examples -- all culled from code that I have encountered recently
(names changed to protect the guilty):
foo = [NSObject new];
bar = [someObject deepCopy];
baz = [myDictionary deepMutableCopy];
bob = [aCollection copyWithConvertedValues];
Scanning my source tree-- a tree containing piles of my own code, code
of my clients, and code from numerous third parties-- using...
find . -name '*.m' -exec grep -H -e '+.*new.*' {} \;
I find about 20 instances where I or others have implemented a +new...
method that returns a retained object. That is just looking for
+new... and does not include the other cases where a developer decided
that they-- for whatever reason, some may be invalid-- wanted to return
a retained instance of something.
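A somewhat tighter version of that search can be sketched in Python. It is a heuristic, not the command actually used above: NEW_FACTORY, find_new_factories and scan_tree are hypothetical names, the pattern only looks at lines that begin with a class-method "+", and it will still report false positives such as a continuation line starting with "+ newOffset".

```python
import re
from pathlib import Path

# A line that starts a class method whose selector begins with "new",
# with or without an explicit return type: "+ (id)newWithName:..." or "+ new".
NEW_FACTORY = re.compile(r"^\s*\+\s*(?:\([^)]*\)\s*)?(new\w*)")


def find_new_factories(source):
    """Return the first selector keyword of each +new... method in source."""
    names = []
    for line in source.splitlines():
        match = NEW_FACTORY.match(line)
        if match:
            names.append(match.group(1))
    return names


def scan_tree(root):
    """Map each .m file under root to the +new... selectors found in it."""
    hits = {}
    for path in Path(root).rglob("*.m"):
        names = find_new_factories(path.read_text(errors="replace"))
        if names:
            hits[str(path)] = names
    return hits


# Hypothetical Objective-C fragment used only to exercise the pattern.
sample = """\
+ (id)newWithName:(NSString *)name
{
    return [[self alloc] initWithName:name];
}
- (void)renew
+ (id)sharedInstance
+ new
"""

print(find_new_factories(sample))
```

The scan only finds candidates; whether each one really returns a retained object still has to be checked by reading the method.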
Searching through /Developer/Examples/ turns up a couple of examples
where +new... is used in a fashion that does not return a -retain'd
instance. So much for consistency...
b.bum