Re: [Openinteract-dev] has_many progress


On Nov 11, 2004, at 8:40 PM, Vsevolod (Simon) Ilyushchenko wrote:
> Okay, this is my last stumbling block. I just don't get why you want 
> to sometimes refer to the object and sometimes to the id. If it's 
> implemented this way, I'll have to say $author->book->id instead of 
> $author->book_id for auto/lazy, if I want to just get the id value. 
> Also, if I switch from auto/lazy to manual fetch, I'll have to change 
> my code.

But you'll have to change code anyway if you switch from auto/lazy to 
manual if you have any code that accesses the book object (since it 
will no longer be auto/lazy-fetched). I view the specification of the 
auto/lazy vs manual as part of specifying the 'type' of the field. If 
you change the 'type' you should expect to have to change code in order 
to get at the same data (e.g. the id).

If I understand your proposal you would always have the field hold the 
id and have it store the auto/lazy-fetched object into some other key 
in the parent object? What key? This approach requires you to specify 
for each auto/lazy fetched field another name by which you access the 
object. You have to worry about inconsistency between the id and the 
object. When you go to save, if the id in the field doesn't match the 
id of the object in the corresponding key, which do you save?

I just think it is simpler and more straightforward to use the single 
field and think of the auto/lazy vs manual as part of the type 
definition for that field.
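
To make the consistency concern concrete, here is a minimal sketch of the 
two-key layout. It is written in Python rather than Perl and is not the 
SPOPS API; the class names, the 'book_id' and 'book' keys, and 
is_consistent() are all hypothetical, chosen only for illustration.

```python
# Hypothetical two-key layout: 'book_id' is the persistent field and
# 'book' is a separate, non-persistent slot for the fetched object.
class Book:
    def __init__(self, id, title):
        self.id = id
        self.title = title


class Author:
    def __init__(self, name, book):
        self.name = name
        self.book_id = book.id   # persistent id field
        self.book = book         # fetched object stored under a second key

    def is_consistent(self):
        # True only while both keys still refer to the same book.
        return self.book_id == self.book.id


author = Author("Example Author", Book(1, "First Title"))
author.book = Book(2, "Second Title")   # caller forgets to update book_id
```

After the last line, book_id is still 1 while book.id is 2, so a save 
has no obvious answer to which of the two should be persisted.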

> Can you give me an example of manual fetch that cannot be implemented 
> otherwise or that makes life more convenient? Manual remove, OTOH, is 
> clearly different from the other two methods of removal.

Well, I think the example that I gave in a previous e-mail ... wait, I 
now see that that example made no sense. Sorry. For some reason I was 
thinking that the remove spec appeared inside the fetch, not at the 
same level. Never mind.

OK, I think I understand your point. IF you use separate field names 
for the id (persistent field) and the auto/lazy-fetched object (temp 
non-persistent field), I suppose there is no need for manual fetch.

But if you use the same field for the id and the auto-fetched object, 
then you DO need the manual option. I still think this is a cleaner 
approach since you're not cluttering up your object with extra fields 
unnecessarily and it allows you to treat some fields as ids (manual) and 
others as objects (auto/lazy), which I find useful conceptually.
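
A rough sketch of the single-field idea, again in Python and not the 
SPOPS API: the fetch mode is treated as part of the field's declared 
type, so the one 'book' field holds either an id or an object, never 
both. The names load_author, FETCH_MODES and BOOKS are assumptions made 
up for this example, and lazy fetching is left out for brevity.

```python
# Hypothetical: the fetch mode is part of the field's declared type.
FETCH_MODES = ("manual", "auto")


class Book:
    def __init__(self, id, title):
        self.id = id
        self.title = title


# Stand-in for the datastore, keyed by book id.
BOOKS = {7: Book(7, "Example Title")}


def load_author(row, book_fetch):
    """Build an author record whose 'book' field holds an id or an object."""
    if book_fetch not in FETCH_MODES:
        raise ValueError("unknown fetch mode: %r" % (book_fetch,))
    author = dict(row)
    if book_fetch == "auto":
        # Replace the stored id with the fetched object in the same field.
        author["book"] = BOOKS[row["book"]]
    return author


manual_author = load_author({"name": "Example Author", "book": 7}, "manual")
auto_author = load_author({"name": "Example Author", "book": 7}, "auto")
```

With manual fetch the id is read as manual_author["book"]; with auto 
fetch it is auto_author["book"].id. Changing the mode changes the 
field's type, and the calling code changes with it.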

>> I think the only thing you need to do is ensure that you don't have 
>> any auto-fetching loops (might even want to include lazy-fetching) 
>> when you do the configuration. In other words, you don't want to have 
>> a book auto-fetch its list of authors AND have the author set to 
>> auto-fetch its book, creating an infinite loop. Even if you use a 
>> cache this would cause circular references which cause a problem for 
>> garbage collection unless you use weak references.
>
> Hmmm... circular references will actually come up even without 
> circular configuration. If an author refers to a list of books, and 
> the books refer back to the author, there you have it...

Right, I agree. After thinking about this a bit more, I think there are 
two separate issues here. They are related to one another and both 
affect, or are affected by, the presence or absence of caching. One is 
the circular configuration which could result in auto-fetching loops 
and the other is circular references in the objects which prevent 
automatic garbage collection.

I think the first one should be handled by SPOPS and it should be at 
the earliest possible stage (configuration of the class if possible, 
run-time if not).

I think the second one is outside the scope of SPOPS. The developer 
needs to be aware of which objects hold references to which other objects. 
It is certainly possible for an object A to reference B which 
references C which in turn references A causing a circular reference 
involving more than just two classes. Another thing to keep in mind is 
that sometimes circular references are needed/desired, but the 
developer needs to handle the breaking of the circular ref manually 
before destruction. I don't think it is the job of SPOPS to prevent 
circular references.
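
For reference, one common way a developer can avoid the cycle in the 
first place is to make the back-reference weak. The sketch below uses 
Python's weakref module purely to show the pattern; the Author and Book 
classes are hypothetical and nothing here is SPOPS code. In Perl the 
analogous tool would be weaken() from Scalar::Util. Note that Python 
also has a cycle collector, whereas Perl relies on reference counting 
alone, so the problem is more pressing in Perl than this sketch suggests.

```python
import weakref


class Author:
    def __init__(self, name):
        self.name = name
        self.books = []   # strong references: the author owns its books

    def add_book(self, book):
        self.books.append(book)
        # Weak back-reference, so author and book do not form a cycle.
        book._author = weakref.ref(self)


class Book:
    def __init__(self, title):
        self.title = title
        self._author = None

    @property
    def author(self):
        # The Author, or None if none was set or it has been destroyed.
        return self._author() if self._author is not None else None


author = Author("Example Author")
book = Book("Example Title")
author.add_book(book)
```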

When implementing an SPOPS cache it may be possible to handle circular 
references in an intelligent way during flushing of the cache for 
example, but I view this as a separate issue that should not be mixed 
in with the new has-a semantics.

> I'm currently doing this checking during saving/removing objects. The 
> configuration allows loops - I'm afraid that if this is prohibited at 
> the configuration stage, the potential for error is too great.

Are you talking about circular references or auto-fetch loops?  If the 
latter, why is it more error prone at the configuration stage? I think 
the circular reference issue is fundamentally a run-time issue since it 
deals with instances of objects referring to one another, and for the 
reasons stated above, I don't think SPOPS should try to tackle this 
except maybe w.r.t. caching. On the other hand I think the auto-fetch 
loop issue is about class (as opposed to instance) behavior and is 
therefore fundamentally a configuration issue. No class should be able 
to auto-fetch a field which through further auto-fetch configuration 
ends up auto-fetching an object of the original class.

It may be easier to implement this checking at run-time, but it is 
still a configuration level issue and I think it should be handled 
during configuration if possible. In other words, if a developer 
creates an auto-fetch loop, he has a bug. If the checking is done at 
the config stage, the bug will be exposed the first time the classes 
are created. If the checking is done during fetching, the class might 
be used successfully for a while without the bug being exposed if, for 
example, the particular field in question isn't populated. It's always 
best to detect errors at the earliest possible stage.
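
As a sketch of what a configuration-time check could look like: treat 
each class as a node and each auto-fetched field as a directed edge, 
then look for a cycle with a depth-first search. This is Python, not 
Perl, and the config layout (class name mapped to fields, each a 
(target class, fetch mode) pair) is an assumption invented for the 
example, not the SPOPS configuration format.

```python
def find_auto_fetch_loop(config):
    """Return a list of class names forming an auto-fetch loop, or None.

    config maps class name -> {field name: (target class, fetch mode)}.
    Only fields whose fetch mode is 'auto' are followed.
    """
    def auto_targets(cls):
        return [target for target, mode in config.get(cls, {}).values()
                if mode == "auto"]

    done = set()   # classes fully explored without finding a loop

    def visit(cls, path):
        if cls in path:
            # Reached a class already on the current chain: a loop.
            return path[path.index(cls):] + [cls]
        if cls in done:
            return None
        for target in auto_targets(cls):
            loop = visit(target, path + [cls])
            if loop:
                return loop
        done.add(cls)
        return None

    for cls in config:
        loop = visit(cls, [])
        if loop:
            return loop
    return None


loopy = {
    "Book": {"authors": ("Author", "auto")},
    "Author": {"book": ("Book", "auto")},
}
safe = {
    "Book": {"authors": ("Author", "auto")},
    "Author": {"book": ("Book", "manual")},
}
```

Here find_auto_fetch_loop(loopy) reports the chain Book, Author, Book, 
while find_auto_fetch_loop(safe) returns None because the Author's book 
field is manual. Run when the classes are created, a check like this 
would expose the bug before any object is fetched.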

>> But since you can put a 'has_many' in Book and a 'has_a' in Author, 
>> for example, where Author has a 'book' field, I think they can be 
>> inconsistent. In my proposal, for the 'has_a' in Author, you 
>> either specify a forward or reverse direction with no way to specify 
>> something conflicting in the Book class.
>
> Who's to stop the developer from saying that Author has_a Book and 
> Book has_a Author?

True, true. But I think of a 'has_many' and a reverse 'has_a' as two 
ways of defining the behavior of a single uni-directional link between 
the two classes (the Author's book field). In that sense, allowing both 
makes it possible to define contradicting behavior for that single 
link. I view your example, on the other hand, as defining the behavior 
of two different uni-directional (and obviously circular) links. I 
don't think this is a contradictory configuration, but it does require 
SPOPS to ensure that it doesn't create an infinite auto-fetch loop, and 
the developer needs to be aware of the implications for circular 
references (which will depend on manual vs lazy and cache 
implementation).

Ray Zimmerman
Director, Laboratory for Experimental Economics and Decision Research
428-B Phillips Hall, Cornell University, Ithaca, NY 14853
phone:  (607) 255-9645