Re: [Openinteract-dev] has_many progress

Ray,

I've mulled over your comments, and I think I'm starting to see the 
light. :) I've looked at other OOP frameworks (Alzabo, Tangram, 
Hibernate, XORM), and they all seem to define the relationship in the 
class that has the ID field which refers to the other class.

I continue the discussion of the cache and the 'manual' keyword below, but 
once we agree on those, I'll redo the code your way. 
It's less work than it seems.

Chris has not replied yet, though, so I'll ask him again about the 
compatibility issue. Still, if I simply extend the has_a syntax the 
way you propose and detect whether the old or the new syntax is being 
used (I do that already), it should not be a problem.

> I assume that's why you changed the key values in the 'has_a' spec to 
> the class name instead of the field name? But isn't this a problem if 
> you have multiple has_a fields of the same class, with different 
> behavior for fetching/saving/removing? Or do you just use an arrayref of 
> hashrefs instead of the single hashref in that case?  Blech.

I agree, it's detestable.

> And you say this full cache is necessary for the consistency of circular 
> references. You mean to avoid infinite loops when you have A set to 
> auto-fetch B which is set to auto-fetch A? Seems to me that you should 
> be able to detect this type of thing when the classes are configured. 
> While I think a cache is a nice option that I may very well use, I don't 
> think it should be mandatory unless it's absolutely necessary. Can you 
> give me an example where consistency makes it absolutely necessary?

For example, you have a Book object with many Authors. If the 
application loads a book with the list of authors, adds another author 
to this book and asks the new author about its parent book, the current 
SPOPS implementation will re-fetch the book object, potentially ignoring 
the changes that were made to the original book object.

However, I only now realized that you suggest saving both the 
parent-to-child reference and the reverse reference in the object 
fields. (I distinguish between $author->{book_id}, a number, and 
$author->{book}, an object. Let me know if I understood this correctly.)

Thus, once $book->{list_of_authors} is populated, adding a new author to 
the book should add the new object to this list, plus it should set the 
field $author->{book} to the original book object. This will make the 
above situation impossible.
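
To make sure we mean the same thing, here is a minimal sketch of that 
bookkeeping. The My::Book and My::Author packages and the add_author 
method are hypothetical stand-ins, not the real SPOPS-generated code, and 
I assume the id field is called book_id on both sides.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for SPOPS-generated classes; not the real SPOPS API.
package My::Book;
sub new { my ($class, %args) = @_; return bless { list_of_authors => [], %args }, $class }

# Add an author and keep both directions of the relationship in sync.
sub add_author {
    my ($self, $author) = @_;
    push @{ $self->{list_of_authors} }, $author;  # parent-to-child reference
    $author->{book_id} = $self->{book_id};        # the plain id field
    $author->{book}    = $self;                   # reverse reference to this very object
    return $author;
}

package My::Author;
sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub book { return $_[0]->{book} }

package main;

my $book   = My::Book->new(book_id => 1, title => 'Draft');
my $author = My::Author->new(name => 'New Author');
$book->add_author($author);
$book->{title} = 'Final';   # change the in-memory book after linking
```

Since $author->book hands back the same reference, the later title change 
is visible through the author. Note that this creates a circular 
reference, which Perl's reference counting will not free on its own.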

Inconsistencies may still occur, though, if:

1) I create a new author-book relationship by setting the field 
$author->{book_id} instead of saying $book->add_author($author). This 
can be discouraged as an incorrect way of altering data, of course, but 
logically both make sense, and I'd like to be able to use them both.

Or, 2) if I am working with a second relationship, say books-to-artists 
(illustrators). In this case, in one place in my code, I could retrieve 
a book object by saying $artist->book, and then in another place I'll 
call $author->book, and even though they may refer to the same book, 
they will always be two different objects.

So it looks like the cache is still necessary.
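
Roughly, what I have in mind is a cache keyed by class and id, so that 
every path to the same row yields the same object. This is only a sketch: 
My::IdentityCache, its fetch/flush methods and the loader callback are 
names I made up for illustration, not the existing SPOPS cache interface.

```perl
use strict;
use warnings;

# Hypothetical per-request identity cache; the names are illustrative, not SPOPS's.
package My::IdentityCache;

my %CACHE;   # "$class|$id" => object

# Return the cached object if present, otherwise load and remember it.
sub fetch {
    my ($pkg, $class, $id, $loader) = @_;
    my $key = "$class|$id";
    $CACHE{$key} = $loader->($id) unless exists $CACHE{$key};
    return $CACHE{$key};
}

# Drop everything, e.g. at the end of a mod_perl request.
sub flush { %CACHE = () }

package main;

my $loads = 0;
# Stand-in for a database fetch: builds a fresh hashref on every call.
my $load_book = sub { $loads++; return { book_id => $_[0], title => 'Untitled' } };

my $via_author = My::IdentityCache->fetch('My::Book', 7, $load_book);
my $via_artist = My::IdentityCache->fetch('My::Book', 7, $load_book);
$via_author->{title} = 'Changed';
```

Here $via_author and $via_artist are the same reference, so a change made 
through one is seen through the other, and the loader runs only once 
until the cache is flushed.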

> I've used only application level caching, never SPOPS-level caching so 
> I'm not clear on how this works. Does this mean that SPOPS objects no 
> longer go out of scope and get destroyed until the program ends or the 
> cache is manually flushed? In a mod_perl environment then, do you flush 
> the cache at the end of every request or what? Seems like you could get 
> some huge apache children pretty quickly if you're not careful.

It has to be flushed, of course.

> It was for completeness and to offer a mode that is equivalent to 
> current has_a behavior, that is, the field normally just gives you an 
> id, but you also have a convenience method for fetching the object as 
> well. My idea was that any 'has_a' spec, including 'manual', would 
> create convenience methods for fetching the related objects. The 'auto' 
> and 'lazy' options would simply call these methods automatically at the 
> appropriate time and stash the return values in the object. So in the 
> way I was picturing things, implementing 'manual' would simply be the 
> first step in implementing 'auto' and 'lazy'.

I feel dumb - I still don't quite get it. However, in your original 
examples the method X->myA returns the id of A in the case of manual 
fetch and A itself in the case of lazy/auto fetch, right? In my view, 
X->myA always returns the id and X->fetch_myA always returns the object 
(I tend to use them like $author->book_id and $author->book in my 
applications). So there is no need for manual fetching.

I think that having X->myA return inconsistent values may be confusing.
Let me know what you think. Perhaps I am still missing the utility of 
manual fetching.

> Without the manual option, you can't specify a relationship at all 
> without having it define auto-fetching behavior. You can't, for example, 
> auto-remove an object without having it also auto-fetched (which I can 
> imagine you might want if you typically only need to deal with the ID of 
> the secondary object).

But in this case you still have to fetch the dependent object, because 
it may define its own rules of auto-removal of even more objects.

> Just curious, does your implementation of 'auto' generate a public 
> 'fetch_myA' method, for example?

See above - even if the fetch method name ('alias' in the current 
terminology) is not specified in the configuration, it'll be 
auto-created from the name given to the target class. (I mean the 
name of the config hash key for the target class, not its Perl name. In the 
example I sent you, X_alias is such a name.)

>> Autosaving is always off by default, to preserve compatibility.
> 
> 
> I'm not sure I follow. Auto-fetching is new so there is no previous 
> corresponding save behavior to be backward compatible with. Classes 
> defined without any auto-fetch/auto-remove behavior, could behave as 
> always. Classes defining new auto behavior could have whatever default 
> 'save' behavior we think makes sense. So I'm not sure there is a 
> backward compatibility issue here.

> And the save behavior described in my updated proposal posted 4 Jan 2002 
> still seems to be the most consistent and make the most sense to me.

Yes, if we change the syntax, we will be free to follow your rules.

>> OTOH, there are three types of removes - 'auto', 'manual' and 
>> 'forget'. 'Auto' means complete removal of dependent objects, 'forget' 
>> - nullifying id fields pointing to the removed objects, and 'manual' - 
>> no action. The default should logically be 'forget', but it may 
>> conflict with no autosaving, so I'll have to set it to 'manual'.
> 
> 
> OK, but what is the 'reverse_remove'? Is specifying 'reverse_remove' => 
> 'forget' in a 'has_a' the same as specifying 'remove' => 'forget' in the 
> corresponding 'has_many'? If so, which one takes precedence if they are 
> inconsistent? It looks like 'reverse_remove' => 'forget' is equivalent to 
> what I called 'null_by', right? I personally think that having multiple 
> (and possibly conflicting) ways/places of defining the behavior for a 
> single relationship is asking for trouble. I think it will make it 
> difficult to write correct and clear documentation and it will create 
> some debugging nightmares. (More on this below)

This should not be a problem, because in my current proposal the 
programmer specifies either has_a or has_many (which implies the reverse 
has_a), so no conflicts should be possible. However, if we change the 
syntax, this issue will go away.

> Why do you include both 'link_class' and 'link_class_alias'? Aren't they 
> redundant? (see [1] below).

'link_class' refers to the Perl class name, while 'link_class_alias' is the 
method name used to retrieve its instances (this is your 'list_field' in 
the 'link' hash).

> And I suppose the 'table' is only necessary if you don't specify the 
> 'link_class' and vice versa, right?

Yup. I am a little unhappy that in your proposal one has to have a Perl 
class for the linking table even if one is never going to use it, but I 
guess this is necessary for the sake of the uniform syntax.

> * didn't see any mention of the 'name' option for explicitly specifying 
> the name of generated methods

It's called 'alias'.

> * not clear to me what auto/lazy fetching, auto removing, etc options 
> are implemented for links_to

If we use your syntax, they will be the same as in the fetch_by case.

> [1]  I confess I never really did understand the purpose of the alias. 
> What is the difference between the alias and the class? Isn't one of 
> them redundant?

The alias is used to generate access methods in other classes referring 
to this one. In your configuration examples you always give a value to 
the 'name' key, but if it's omitted, methods are given names like 
'fetch_X_alias'.
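
As an illustration of that naming rule, the generation step could look 
something like the sketch below. The install_fetch_method helper and the 
id_field and fetcher keys are hypothetical and exist only for this 
example; only 'name' and 'alias' come from our configuration discussion.

```perl
use strict;
use warnings;

# Hypothetical helper: install a fetch method in $class for a related class.
# 'name' and 'alias' follow the terms used in this thread; the rest is made up.
sub install_fetch_method {
    my ($class, %spec) = @_;
    # An explicit 'name' wins; otherwise fall back to fetch_<alias>.
    my $method = defined $spec{name} ? $spec{name} : "fetch_$spec{alias}";
    my ($id_field, $fetcher) = @spec{qw(id_field fetcher)};
    no strict 'refs';   # needed for the symbolic typeglob assignment below
    *{"${class}::${method}"} = sub {
        my ($self) = @_;
        return $fetcher->($self->{$id_field});
    };
    return $method;
}

package My::Author;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

my $fetch_book = sub { return { book_id => $_[0] } };   # stand-in for a real fetch

my $default = install_fetch_method('My::Author',
    alias => 'book', id_field => 'book_id', fetcher => $fetch_book);
my $named = install_fetch_method('My::Author',
    alias => 'book', name => 'my_book', id_field => 'book_id', fetcher => $fetch_book);

my $author = My::Author->new(book_id => 42);
```

With no 'name' given the method ends up as fetch_book; with 'name' it is 
my_book. Both return the related object rather than the id.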

Simon

-- 

Simon (Vsevolod ILyushchenko)   si...@cs...
				http://www.simonf.com

Terrorism is a tactic and so to declare war on terrorism
is equivalent to Roosevelt's declaring war on blitzkrieg.

Zbigniew Brzezinski, U.S. national security advisor, 1977-81