From: Salve J N. <sal...@me...> - 2004-12-02 14:59:21
Some thoughts...

Chris Winters wrote:
> Since I didn't get much of a response from my previous inquiry about a
> possible change to the server configuration files I've created an example:
>
> http://www.openinteract.org/new_config_example/
>
> The example uses the live configuration from my website (with passwords
> and such changed of course) and shows the full configuration vs. what it
> would look like broken down into files. It's all the same data. The main
> configuration contains the data users are most likely to change (database,
> session, email server + addresses, deployment context) and that's it.
>
> See what you think.

Given the possibility that the number of configuration entries (sections, key/value pairs) becomes large enough, you might want to consider complexity of "depth" (lots of info in few places, a "deep" configuration tree structure) vs. complexity of breadth (lots of places with just a little info everywhere, a "shallow" or "flat" config tree).

Proximity of configuration entries (whether two unrelated entries can be found in the same file, directory or even filesystem) should be part of your consideration. Obviously, keeping all configuration close together has some benefits (searchability, easy access), but too much in one place can increase the learning curve and/or make comprehension more difficult. This is roughly the same problem as "having too much documentation" vs. "having too little documentation easily available".

I think the sweet spot most likely lies somewhere in between the two, and that the key to making this information accessible (easy to find/scan, read, understand) lies with good (sensible, intuitive, natural, conventions/standards-abiding) abstractions and categorization of the information, and with keeping the "depth" at a minimum while still staying within a "width" that is comprehensible for the casual observer.

Given this, I'd make the following suggestions:

* Reduce the number of configuration files, if possible.
* Group related config sections more tightly together (e.g. put the "main" datasource together with the other datasources).
* KISS. Is it necessary to differentiate between Controllers and Content Generators? (Perhaps a different issue, but maybe worthwhile to look into if you're trying to clean up configuration anyway. Reducing the number of concepts to learn helps both with file size and understandability. :)
* Package-related config should be placed somewhere very close to, but separately from, the server config (e.g. the "fulltext" package config has IMHO no business in the conf/server.ini file, and the base_user action.ini files are too far away in pkg/base_user/conf). I'd like to suggest that all package config files be accessible from a suitable "package.d" directory (e.g. "conf/packages.d/fulltext.ini"), and that the main config file @INCLUDEs these en masse. The conf/packages.d/*.ini files need not be more than symlinks to the packages in order to get a positive effect.
* When upgrading some software (OI2 or .zip packages), I'd like to keep my old config but still get the new features. Splitting up files (as you described) may help with this, so do place commonly changed config entries in one or more separate files.
* BTW, I'd like to keep my timezone when I upgrade the server. :)
* s/server_action_info.ini/server_actions.ini/g (Naming consistency is good. :)
* The ini file parser should warn about inconsistencies and duplicate entries if they're not allowed.
* Configuration entries that have related entries that for some greater reason can't be placed "close by" should at least reference the related entries.

Just my USD 0.02. Feel free to criticise! :)

- Salve

--
Salve J. Nilsen <salvejn at met dot no> / Systems Developer
Norwegian Meteorological Institute http://met.no/
Information Technology Department / Section for Development
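The "packages.d" suggestion above might look something like the following sketch. All file names, section names, and keys here are illustrative inventions, not actual OpenInteract2 configuration; the sketch only assumes the @INCLUDE directive discussed in this thread.

```ini
; conf/server.ini -- hypothetical main file pulling in per-package config
[Global]
timezone = Europe/Oslo

; one line per installed package; each included file could be no more
; than a symlink into the package's own conf/ directory
@INCLUDE packages.d/fulltext.ini
@INCLUDE packages.d/base_user.ini
```

```ini
; conf/packages.d/fulltext.ini -- hypothetical per-package section,
; kept close to (but separate from) the server config
[fulltext]
column_group    = listing
min_word_length = 3
```

The point of the layout is that upgrading a package replaces only its own file under packages.d, leaving server.ini untouched.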
From: Chris W. <ch...@cw...> - 2004-12-01 22:19:44
> On Dec 1, 2004, at 2:58 PM, Vsevolod (Simon) Ilyushchenko wrote:
>>> my $book = Book->fetch(42);
>>> $book->publisher == object (whether manual/lazy fetch)
>>> $book->publisher_id == ID
>>> $book->author == list of objects (whether manual/lazy fetch)
>>>
>>> Is that right? Because it sounds from the discussion as if:
>>>
>>> $book->publisher == sometimes ID, sometimes object,
>>> depending on the configuration
>>>
>>> which IMO is very confusing.
>>
>> Yes, yes, yes! My point exactly!
>
> Aw, c'mon Simon ... I thought I'd convinced you on this one too. Now I
> have to convince Chris too? :-)
>
> Chris, have you gotten through our whole previous discussion? I think
> I've explained why I think this is NOT confusing (at least to me :-) I
> can try to restate my reasons if you think it would help.

I believe I read through everything. I just think you're overestimating how much time and effort people are willing to spend to sort these out. We who have been using SPOPS for a while see the relationship between the configuration and the code as blindingly obvious. But for new people -- or even people who only use SPOPS occasionally -- it's not.

And the idea that the behavior of an existing method could change based on configuration is very disconcerting. People like to think of their objects as somewhat stable -- even Perl people! -- and when we change the meaning under the hood like that it's untrustworthy. OTOH, the idea that something *new* happens -- like adding new methods -- as a result of configuration changes isn't so bad because you have the choice if you want to use the new feature or not.

Chris

--
Chris Winters (ch...@cw...)
Building enterprise-capable snack solutions since 1988.
From: Ray Z. <rz...@co...> - 2004-12-01 22:05:51
On Dec 1, 2004, at 2:58 PM, Vsevolod (Simon) Ilyushchenko wrote:
>> my $book = Book->fetch(42);
>> $book->publisher == object (whether manual/lazy fetch)
>> $book->publisher_id == ID
>> $book->author == list of objects (whether manual/lazy fetch)
>>
>> Is that right? Because it sounds from the discussion as if:
>>
>> $book->publisher == sometimes ID, sometimes object,
>> depending on the configuration
>>
>> which IMO is very confusing.
>
> Yes, yes, yes! My point exactly!

Aw, c'mon Simon ... I thought I'd convinced you on this one too. Now I have to convince Chris too? :-)

Chris, have you gotten through our whole previous discussion? I think I've explained why I think this is NOT confusing (at least to me :-) I can try to restate my reasons if you think it would help.

Ray
From: Vsevolod (S. I. <si...@cs...> - 2004-12-01 19:58:44
> my $book = Book->fetch(42);
> $book->publisher == object (whether manual/lazy fetch)
> $book->publisher_id == ID
> $book->author == list of objects (whether manual/lazy fetch)
>
> Is that right? Because it sounds from the discussion as if:
>
> $book->publisher == sometimes ID, sometimes object,
> depending on the configuration
>
> which IMO is very confusing.

Yes, yes, yes! My point exactly!

Simon

--
Simon (Vsevolod ILyushchenko) si...@cs... http://www.simonf.com

Terrorism is a tactic and so to declare war on terrorism is equivalent to Roosevelt's declaring war on blitzkrieg.
Zbigniew Brzezinski, U.S. national security advisor, 1977-81
From: Chris W. <ch...@cw...> - 2004-12-01 19:35:09
> Given the possibility that the number of configuration entries
> (sections, key/value pairs) becomes large enough, you might want to
> consider complexity of "depth" (lots of info in few places, a "deep"
> configuration tree structure) vs. complexity of breadth (lots of places
> with just a little info everywhere, a "shallow" or "flat" config tree).

The INI format ensures we don't get too deep. I find configurations that get too deep very confusing, which makes for confusing software.

> Proximity of configuration entries (whether two unrelated entries can be
> found in the same file, directory or even filesystem) should be part of
> your consideration. Obviously, keeping all configuration close together
> has some benefits (searchability, easy access), but too much in one place
> can increase the learning curve and/or make comprehension more difficult.

Yes, that's what I'm responding to. It hit me when looking over the Quick Start guide that to get OI started you only need to edit a few items. But they're spread all over the configuration file! So why not have the stuff you absolutely need to edit up front, and the rest of it easily available but not right in front of your eyes?

> This is roughly the same problem as "having too much documentation" vs.
> "having too little documentation easily available".
>
> I think the sweet spot most likely lies somewhere in between the two,
> and that the key to making this information accessible (easy to
> find/scan, read, understand) lies with good (sensible, intuitive,
> natural, conventions/standards-abiding) abstractions and categorization
> of the information, and with keeping the "depth" at a minimum while still
> staying within a "width" that is comprehensible for the casual observer.

Right, that's what I'm shooting for and why I used topic-oriented include files.

> Given this, I'd make the following suggestions:
>
> * Reduce the number of configuration files, if possible.

A good goal, but we also don't want to make them encompass too much.

> * Group related config sections more tightly together (e.g. put the
> "main" datasource together with the other datasources).

You're right that this is a bit of a disconnect, but I did it so that a new user would only have to edit 'server.ini' to get started quickly.

> * KISS. Is it necessary to differentiate between Controllers and
> Content Generators? (Perhaps a different issue, but maybe worthwhile to
> look into if you're trying to clean up configuration anyway. Reducing
> the number of concepts to learn helps both with file size and
> understandability. :)

Very, very true.

> * Package-related config should be placed somewhere very close to, but
> separately from, the server config (e.g. the "fulltext" package config has
> IMHO no business in the conf/server.ini file, and the base_user
> action.ini files are too far away in pkg/base_user/conf). I'd like to
> suggest that all package config files be accessible from a suitable
> "package.d" directory (e.g. "conf/packages.d/fulltext.ini"), and that
> the main config file @INCLUDEs these en masse. The conf/packages.d/*.ini
> files need not be more than symlinks to the packages in order to get a
> positive effect.

That's a really interesting idea. With the global package configuration override file I've tried to keep everything in one place, but the global override stuff is probably too confusing for most people to use. (Then again, most people probably don't need to modify their package/SPOPS configurations either.)

The other part we have to balance is package upgrades -- how do we allow the user to keep any changes she's made along with the potentially new package configuration? Maybe with a package upgrade we read in the existing configurations and compare them to the new ones, writing out any differences to a separate file?

I don't think we can do symlinks (not cross-platform), but I like the idea of keeping all the SPOPS + Action data in one location....

> * When upgrading some software (OI2 or .zip packages), I'd like to
> keep my old config but still get the new features. Splitting up files
> (as you described) may help with this, so do place commonly changed
> config entries in one or more separate files.

OI2 should never, never overwrite your server.ini file with an upgrade. Once we write it at server creation we don't touch it. (In fact we don't touch anything in conf/ except for repository.ini.)

> * BTW, I'd like to keep my timezone when I upgrade the server. :)

This shouldn't get overwritten. Does it?

> * s/server_action_info.ini/server_actions.ini/g (Naming consistency is
> good. :)

Actually 'server_actions.ini' was my first name, but I thought it implied that you'd actually find actions there, which isn't true.

> * The ini file parser should warn about inconsistencies and duplicate
> entries if they're not allowed.

Dupes we can do. Inconsistencies maybe not. But I'd imagine duplicated configuration would be the main problem.

> * Configuration entries that have related entries that for some
> greater reason can't be placed "close by" should at least reference the
> related entries.

Absolutely.

> Just my USD 0.02. Feel free to criticise! :)

Thanks!

Chris

--
Chris Winters (ch...@cw...)
Building enterprise-capable snack solutions since 1988.
From: Chris W. <ch...@cw...> - 2004-12-01 19:22:17
> ...
> But you'll have to change code anyway if you switch from auto/lazy to
> manual if you have any code that accesses the book object (since it
> will no longer be auto/lazy-fetched). I view the specification of
> auto/lazy vs. manual as part of specifying the 'type' of the field. If
> you change the 'type' you should expect to have to change code in order
> to get at the same data (e.g. the id).
>
> If I understand your proposal, you would always have the field hold the
> id and have it store the auto/lazy-fetched object in some other key
> in the parent object? What key? This approach requires you to specify
> for each auto/lazy-fetched field another name by which you access the
> object. You have to worry about inconsistency between the id and the
> object. When you go to save, if the id in the field doesn't match the
> id of the object in the corresponding key, which do you save?
>
> I just think it is simpler and more straightforward to use the single
> field and think of auto/lazy vs. manual as part of the type
> definition for that field.
> ...
>
> OK, I think I understand your point. IF you use separate field names
> for the id (persistent field) and the auto/lazy-fetched object (temp
> non-persistent field), I suppose there is no need for manual fetch.
>
> But if you use the same field for the id and the auto-fetched object,
> then you DO need the manual option. I still think this is a cleaner
> approach since you're not cluttering up your object with extra fields
> unnecessarily, and it allows you to treat some fields as ids (manual)
> and others as objects (auto/lazy), which I find useful conceptually.

Just to be clear what we're talking about here:

    book:        book_id, title, publisher_id
    publisher:   publisher_id, name
    book_author: book_id, author_id
    author:      author_id, name

    Book 1 ---> 1 Publisher
    Book 1 ---> * Author

    my $book = Book->fetch(42);
    $book->publisher    == object (whether manual/lazy fetch)
    $book->publisher_id == ID
    $book->author       == list of objects (whether manual/lazy fetch)

Is that right? Because it sounds from the discussion as if:

    $book->publisher == sometimes ID, sometimes object,
                        depending on the configuration

which IMO is very confusing.

Chris

--
Chris Winters (ch...@cw...)
Building enterprise-capable snack solutions since 1988.
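The separate-accessor contract being argued for can be sketched in plain Perl. This is hand-written for illustration only; SPOPS actually generates such accessors from configuration, and the Book/Publisher class names and the internal '_publisher' cache key are assumptions for the sketch, not SPOPS internals.

```perl
package Book;

# Sketch: separate accessors keep the ID and the object distinct,
# so neither changes meaning based on configuration.

sub publisher_id {
    my ($self) = @_;
    return $self->{publisher_id};      # always the raw foreign-key ID
}

sub publisher {
    my ($self) = @_;
    # lazy fetch: always returns an object, fetched once on first access
    # ('_publisher' is a hypothetical cache slot for this sketch)
    $self->{_publisher} ||= Publisher->fetch( $self->{publisher_id} );
    return $self->{_publisher};
}

1;
```

With this shape, `$book->publisher` is always an object and `$book->publisher_id` is always an ID, regardless of whether the fetch is eager or lazy, which is exactly the predictability Chris is asking for.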
From: Vsevolod (S. I. <si...@cs...> - 2004-12-01 18:49:08
Chris,

Please review our subsequent discussion with Ray titled "has_many progress". He has pretty much converted me to his point of view. He proposes 1) keeping the cache optional (so I'll have to test two configurations), 2) storing dependent objects in hashes in objects instead of retrieving them every time (thus, with the cache enabled, fetch_many can skip trips to the database), and most importantly, 3) using his syntax for defining relationships.

I can still support the current syntax, of course, but forcing the new configs to follow the old syntax results in awkward rules. I looked at some other frameworks, and what Ray proposes turns out to be fairly standard. I am sorry it's not as straightforward. So even though I already have the code for the currently-out-of-favor syntax, I'll have to change it to use Ray's syntax.

Please let us know what you think.

Simon

Chris Winters wrote:
> Terribly sorry it took so long to respond.
>
>> Now that Chris has some time (hopefully) to consider the changes, I'd
>> like to submit a patch that would add has_many, auto-save/auto-delete
>> (optional) and full cache support (which is necessary for the
>> consistency of circular references). Before I do that, I'd like to get
>> Chris's approval on the following architectural decisions:
>>
>> 1. There will be a basic cache which will always return the cached
>> object, not its copy.
>
> Ok.
>
>> 2. fetch_many() will now also use the cache. It will still make a call
>> to the database, though, because it won't know which ids to request
>> from the cache.
>
> Ok -- although I'd like to have the current behavior available if the
> cache is not enabled. (Where it fetches the actual data from the
> database, not just the IDs.)
>
>> 3. Auto-save and auto-delete are off by default. This is not what Ray
>> requested, but it's the only way to stay backwards compatible. With
>> auto-save on, the graph of objects in memory will be traversed to
>> determine what has been changed. (Thus, I have deep_changed() instead
>> of changed().)
>
> I'm for backward compatibility.
>
>> 4. It will be possible to use a separate class that corresponds to the
>> linking table in links_to (think ClubMembership). The syntax I came up
>> with is:
>>     Club => {
>>         class => 'Club',
>>         ...
>>         links_to => {
>>             'Person' => {
>>                 table            => 'ClubMembership',
>>                 link_class       => 'ClubMembership',
>>                 link_class_alias => 'memberships',
>>                 alias            => 'members',
>>                 to_id_field      => 'member_id',
>>                 from_id_field    => 'club_id' },
>
> This looks good.
>
>> The ClubMembership class can have extra attributes (like date), but
>> they will be specified in the regular way in the config file. The new
>> attribute 'link_class_alias' will return an arrayref of the
>> ClubMembership instances. Again, this is not as elegant as Ray
>> proposed, but backwards compatible.
>
> Ok.
>
>> An open question is what to do with has_a. Currently it's implemented
>> in ClassFactory, not ClassFactory/DBI, but adding auto-save and
>> auto-delete functionality requires the has_a code to be DBI-aware. I
>> can add extra functionality into ClassFactory/DBI, but I am not sure
>> what effect it would have on LDAP storage, for example.
>
> We can tell ClassFactory::DBI to override the has_a from ClassFactory.
> (I don't remember how to do it off the top of my head, but I'm 98% sure
> it's already built-in.)
>
>> Oh, and a style question. I have marked all of my changes with "tags":
>> #Simon
>> #/Simon
>> and I commented out, not deleted, all replaced code, for ease of
>> reference. Is this necessary?
>
> No -- that's why we have CVS :-) Just use good comments when you commit.
>
> Chris

--
Simon (Vsevolod ILyushchenko) si...@cs... http://www.simonf.com

Terrorism is a tactic and so to declare war on terrorism is equivalent to Roosevelt's declaring war on blitzkrieg.
Zbigniew Brzezinski, U.S. national security advisor, 1977-81
From: Chris W. <ch...@cw...> - 2004-12-01 18:48:35
> ...
> I think that both links_to and has_many share a fundamental flaw. They
> both allow relationships to be defined from the "other" end, making it
> possible to define conflicting behavior for a single relationship. This
> unnecessarily complicates the error checking or precedence rules and
> the documentation. I think always forcing the relationship to be
> defined by the class with the linking field, and having a single syntax
> (has_a) for defining it, makes for a much simpler, cleaner, more
> consistent, easier to understand/document/implement, less error-prone
> design. Including both approaches (e.g. has_many and reverse_auto or
> auto_by), in my opinion, only complicates and confuses things further.

Good point.

> For this reason, I think we should forget trying to make the new syntax
> backward compatible. I think we should leave the old syntax in place
> (but deprecated) as long as necessary for backward compatibility and
> add a completely redesigned new syntax that folks can migrate to
> gradually or as they need the new features. If it were up to me I
> wouldn't touch links_to at all, and I wouldn't touch the old has_a
> either. I'd just add a new configuration handler for the case where the
> key of a 'has_a' spec matches one of the field names, in which case you
> assume it is the new syntax I proposed; otherwise you assume it's a
> class and use the old has_a semantics.

My main concern is that we don't break existing code. Everyone seems to also be avoiding this, so: great. Next is ensuring that new features are easily implemented, testable and maintainable. I suspect Ray's right in that overloading existing syntax will make all of these more difficult. I'm not against adding new syntax/configuration keys as necessary, and if it takes care of both of these items, it's probably the best option.

> [1] I confess I never really did understand the purpose of the alias.
> What is the difference between the alias and the class? Isn't one of
> them redundant?

Yes, but IME most humans prefer the alias (e.g., 'news') vs. the class ('MyApp::Persist::News'). Plus it's less typing :-)

Chris

--
Chris Winters (ch...@cw...)
Building enterprise-capable snack solutions since 1988.
From: Chris W. <ch...@cw...> - 2004-12-01 18:39:26
Terribly sorry it took so long to respond.

> Now that Chris has some time (hopefully) to consider the changes, I'd
> like to submit a patch that would add has_many, auto-save/auto-delete
> (optional) and full cache support (which is necessary for the
> consistency of circular references). Before I do that, I'd like to get
> Chris's approval on the following architectural decisions:
>
> 1. There will be a basic cache which will always return the cached
> object, not its copy.

Ok.

> 2. fetch_many() will now also use the cache. It will still make a call
> to the database, though, because it won't know which ids to request
> from the cache.

Ok -- although I'd like to have the current behavior available if the cache is not enabled. (Where it fetches the actual data from the database, not just the IDs.)

> 3. Auto-save and auto-delete are off by default. This is not what Ray
> requested, but it's the only way to stay backwards compatible. With
> auto-save on, the graph of objects in memory will be traversed to
> determine what has been changed. (Thus, I have deep_changed() instead
> of changed().)

I'm for backward compatibility.

> 4. It will be possible to use a separate class that corresponds to the
> linking table in links_to (think ClubMembership). The syntax I came up
> with is:
>     Club => {
>         class => 'Club',
>         ...
>         links_to => {
>             'Person' => {
>                 table            => 'ClubMembership',
>                 link_class       => 'ClubMembership',
>                 link_class_alias => 'memberships',
>                 alias            => 'members',
>                 to_id_field      => 'member_id',
>                 from_id_field    => 'club_id' },

This looks good.

> The ClubMembership class can have extra attributes (like date), but
> they will be specified in the regular way in the config file. The new
> attribute 'link_class_alias' will return an arrayref of the
> ClubMembership instances. Again, this is not as elegant as Ray
> proposed, but backwards compatible.

Ok.

> An open question is what to do with has_a. Currently it's implemented
> in ClassFactory, not ClassFactory/DBI, but adding auto-save and
> auto-delete functionality requires the has_a code to be DBI-aware. I
> can add extra functionality into ClassFactory/DBI, but I am not sure
> what effect it would have on LDAP storage, for example.

We can tell ClassFactory::DBI to override the has_a from ClassFactory. (I don't remember how to do it off the top of my head, but I'm 98% sure it's already built-in.)

> Oh, and a style question. I have marked all of my changes with "tags":
> #Simon
> #/Simon
> and I commented out, not deleted, all replaced code, for ease of
> reference. Is this necessary?

No -- that's why we have CVS :-) Just use good comments when you commit.

Chris

--
Chris Winters (ch...@cw...)
Building enterprise-capable snack solutions since 1988.
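Under the proposed links_to syntax, usage might look roughly like the following. This is a sketch, not confirmed SPOPS behavior: it assumes the configured 'alias' and 'link_class_alias' entries become generated methods returning arrayrefs, as the thread implies, and the 'date' field is the hypothetical extra attribute mentioned above.

```perl
# Hypothetical usage of the proposed Club/ClubMembership configuration
my $club = Club->fetch(7);

my $members     = $club->members;       # arrayref of Person objects ('alias')
my $memberships = $club->memberships;   # arrayref of ClubMembership link
                                        # objects ('link_class_alias')

for my $membership ( @{ $memberships } ) {
    # extra attributes on the linking table (e.g. a join date) would be
    # ordinary fields on the link class
    printf "person %s joined on %s\n",
        $membership->{member_id}, $membership->{date};
}
```

The appeal of exposing the link class directly is that per-relationship data like the join date has an obvious home, instead of being unreachable behind the plain members list.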
From: Chris W. <ch...@cw...> - 2004-12-01 17:44:54
> I think the @INCLUDE syntax might not be a very good idea. Some quick
> points:
>
> -- FIND
> * Editors' find makes include quite useless.
> * All editors have find and all users can use it.
> * But includes break editors' find.
>
> -- OVERLAPPING ENTRIES
> * Overlapping entries are hard to spot with include.
> * The way include works with overlapping entries is not trivial.
> * Errors resulting from this might be hard to find.

These are all excellent points. I'm convinced: the multiple server.ini files won't be the default. (I'll keep the feature in the INI parser though.)

Thanks!

Chris

--
Chris Winters (ch...@cw...)
Building enterprise-capable snack solutions since 1988.
From: A. <ant...@he...> - 2004-12-01 17:27:57
Hi!

I think the @INCLUDE syntax might not be a very good idea. Some quick points:

-- FIND
* Editors' find makes include quite useless.
* All editors have find and all users can use it.
* But includes break editors' find.

-- OVERLAPPING ENTRIES
* Overlapping entries are hard to spot with include.
* The way include works with overlapping entries is not trivial.
* Errors resulting from this might be hard to find.

- Antti
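The "overlapping entries" hazard can be illustrated with a hypothetical pair of files (all names and keys invented for this sketch). Whether the second definition overrides, merges with, or conflicts with the first depends entirely on the parser's resolution rules, and a plain-text search in server.ini alone will not reveal the final value.

```ini
; server.ini -- hypothetical
[session]
expires = 3600

@INCLUDE server_session.ini
```

```ini
; server_session.ini -- hypothetical; redefines a key already set in the
; including file, which is exactly the kind of silent conflict Antti
; warns about
[session]
expires = 86400
```

This is why the thread's suggestion that the parser warn on duplicate entries matters more once @INCLUDE is in play.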
From: Chris W. <ch...@cw...> - 2004-12-01 15:07:19
Since I didn't get much of a response from my previous inquiry about a possible change to the server configuration files, I've created an example:

http://www.openinteract.org/new_config_example/

The example uses the live configuration from my website (with passwords and such changed, of course) and shows the full configuration vs. what it would look like broken down into files. It's all the same data. The main configuration contains the data users are most likely to change (database, session, email server + addresses, deployment context) and that's it.

See what you think.

Chris

--
Chris Winters (ch...@cw...)
Building enterprise-capable snack solutions since 1988.
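A rough sketch of the kind of split being proposed: a pared-down main file with only the settings a new user must touch, with everything else pulled in via @INCLUDE. All section names and keys below are invented for illustration; the real layout is on the linked example page.

```ini
; server.ini -- hypothetical: only the settings a new user must edit
; (database, email, deployment context)
[datasource main]
dsn      = DBI:mysql:database=mysite
username = oi_user
password = CHANGEME

[email]
smtp_host   = localhost
admin_email = admin@example.com

[context]
deployed_under = /oi

; everything else stays available, but out of the way,
; in topic-oriented include files
@INCLUDE server_session.ini
@INCLUDE server_log.ini
@INCLUDE server_action_info.ini
```

The trade-off debated later in the thread is that this layout defeats a simple editor search across one file, which is why the split ultimately did not become the default.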
From: David N. <dav...@gm...> - 2004-11-30 18:55:27
On Tue, 30 Nov 2004 00:14:56 -0800 (PST), Darren Duncan <da...@da...> wrote:
>
> I appreciate you being as up front with this as early as you are. Now I
> won't have to waste my time implementing all those specs and tests etc.

Me too. On yet another hand, having a standard problem to solve in yer examples would make picking and choosing from the multitude easier, when you have to select a persistence abstraction from looking at the documentation of them all.
From: Dave R. <au...@ur...> - 2004-11-30 17:13:35
On Tue, 30 Nov 2004, Simon Cozens wrote:
> Darren Duncan:
>> First of all, I would *definitely* be interested in implementing the
>> things you mention using my bleeding-edge Rosetta/SQL::Routine framework.
>
> Ah. I didn't ask for that.
>
> You see, there are about 30,000 new, cool and bleeding-edge object
> persistence / object representation frameworks on CPAN, and I don't want
> the chapter to be 30,000 pages long. To avoid this, I had to make a
> selection, and the problem with making selections is that it upsets
> people who wrote the things which didn't get selected. I'm sorry about
> that. I tried to select a list of frameworks which were either well
> known, widely used or well designed; you can amuse yourself by guessing
> which category I filed each one into. The final list was Alzabo, SPOPS,
> CDBI, Pixie, Class::Persist and Tangram, and I'm afraid that's not
> negotiable. I have to make a selection somehow.

However, I would be happy to include this stuff in another document at poop.sourceforge.net.

-dave

/*===========================
VegGuide.Org
Your guide to all that's veg.
===========================*/
From: Darren D. <da...@Da...> - 2004-11-30 08:28:34
On Tue, 30 Nov 2004, Simon Cozens wrote:
>> First of all, I would *definitely* be interested in implementing the
>> things you mention using my bleeding-edge Rosetta/SQL::Routine framework.
>
> Ah. I didn't ask for that.
> You see, there are about 30,000 new, cool and bleeding-edge object
> persistence / object representation frameworks on CPAN, and I don't want
> the chapter to be 30,000 pages long. To avoid this, I had to make a
> selection, and the problem with making selections is that it upsets
> people who wrote the things which didn't get selected. I'm sorry about
> that. I tried to select a list of frameworks which were either well
> known, widely used or well designed; you can amuse yourself by guessing
> which category I filed each one into. The final list was Alzabo, SPOPS,
> CDBI, Pixie, Class::Persist and Tangram, and I'm afraid that's not
> negotiable. I have to make a selection somehow.

Thanks for clearing that up now. The original posting gave no indication that there was a "final list". It simply said that all people who want to submit for a framework can do so. In fact, the closest thing to any list was giving a few examples and saying who was doing some and which needed help. No indication that that was it. If I am wrong about this, then please say where the list was first announced, as I missed that context.

I appreciate you being as up front with this as early as you are. Now I won't have to waste my time implementing all those specs and tests etc. On the other hand, I will continue to give feedback and suggestions on the given schema and other things that the approved frameworks need to implement.

> Implementing the stuff requested might be good for your framework anyway
> and to check that it can really do all those things, but I'm afraid it
> won't get it covered in the book. (Although I will be naming some
> notable alternative choices such as Rosetta in the last few paragraphs.)
> Again, sorry about that, but attempting to cover everything would be
> madness.

I quite understand. And good luck to you on the book project. Even a mention in the printed book is very much appreciated. Please let me know if there is anything about it you don't understand, to make sure that any mention of it is factual. I look forward to a 3rd edition of the advanced Perl book, by which time Rosetta et al. should be relatively mature and well known.

> I've had to do this with templating toolkits too, and doubtless I've
> already offended someone by my selection there as well.

Well, this is a limit of a printed book. Perhaps you could mention in the book that an expanded version of those chapters is available on the O'Reilly website, and anyone who missed out on getting into the book can have equivalent info there in the electronic appendices.

-- Darren Duncan
From: Simon C. <si...@si...> - 2004-11-30 07:52:09
Darren Duncan:
> First of all, I would *definitely* be interested in implementing the
> things you mention using my bleeding-edge Rosetta/SQL::Routine framework.

Ah. I didn't ask for that.

You see, there are about 30,000 new, cool and bleeding-edge object persistence / object representation frameworks on CPAN, and I don't want the chapter to be 30,000 pages long. To avoid this, I had to make a selection, and the problem with making selections is that it upsets people who wrote the things which didn't get selected. I'm sorry about that. I tried to select a list of frameworks which were either well known, widely used or well designed; you can amuse yourself by guessing which category I filed each one into. The final list was Alzabo, SPOPS, CDBI, Pixie, Class::Persist and Tangram, and I'm afraid that's not negotiable. I have to make a selection somehow.

Implementing the stuff requested might be good for your framework anyway, and to check that it can really do all those things, but I'm afraid it won't get it covered in the book. (Although I will be naming some notable alternative choices such as Rosetta in the last few paragraphs.) Again, sorry about that, but attempting to cover everything would be madness.

I've had to do this with templating toolkits too, and doubtless I've already offended someone by my selection there as well.

--
It's a testament to the versatility of the human mind that we're so able to compensate for our own incompetence. - Darrell Furhiman
From: Darren D. <da...@Da...> - 2004-11-30 06:40:23
|
On Tue, 30 Nov 2004, Sam Vilain wrote:
> Simon Cozens is including a chapter on POOP in _Advanced Programming Perl, 2nd edition_, an O'Reilly book.
> As he is busy with a new direction in his life, I have agreed to co-ordinate gathering material and code for possible inclusion in the chapter from authors and users of POOP frameworks.
> The challenge is:
> Implement classes to drive this schema (minimally remodelling as you feel appropriate): <snip>
> And produce a test script to do the following things: <snip>
> And a couple of maintenance scripts: <snip>

I will reply to this using 2-3 separate messages that are more targeted. All of *those* will go to poo...@li....

First of all, I would *definitely* be interested in implementing the things you mention using my bleeding-edge Rosetta/SQL::Routine framework. This sort of thing is just the kind of exposure they need, and will provide a good documentation opportunity to demonstrate how to do certain common tasks, both on an objective and a comparative basis.

Unfortunately, I have started having new bad sectors appearing on my hard disk, and will be having it replaced tomorrow under warranty. This will take a few days, during which I probably won't have the computer access necessary to do the work. I will also need a minimum of a week after those few days to get my implementation ready. Having 2-4 weeks is preferred. But suffice to say that working on these modules is more or less my full-time job at the moment, so I'm putting in long hours.

So, when do you / does Simon need these implementations by, either full or partial? Eg, when are early drafts and final drafts needed? Is the book on a rigorous publishing schedule? Also, do the implementations need to work as-is for a long period of time, or is it okay that details may require small changes later due to module API changes? While I'm mostly nailed down, there are still a few pending API changes to my modules. They *are* pre-alpha.

I'll send separate replies about task list details and suggested task additions; they will all go to poo...@li....

Also, if anyone is interested in my modules, I accept offers of assistance. (Aside from replies to some RFC emails over the last 2 years, I've been doing them entirely on my own.) I also have no list specific to developing or using my modules yet, but may start one later when they start to get a user base or multiple interested developers.

Thank you. -- Darren Duncan
|
From: Sam V. <sa...@vi...> - 2004-11-30 04:28:06
|
Hi all,

Simon Cozens is including a chapter on POOP in _Advanced Programming Perl, 2nd edition_, an O'Reilly book. As he is busy with a new direction in his life, I have agreed to co-ordinate gathering material and code for possible inclusion in the chapter from authors and users of POOP frameworks.

The challenge is:

Implement classes to drive this schema (minimally remodelling as you feel appropriate):

  http://www.class-dbi.com/cgi-bin/wiki/index.cgi?ERD

And produce a test script to do the following things:

  1. create a new database object of each type in the schema
  2. print IDs of the objects inserted
  3. fetch a record by ID
  4. fetch an artist record by name (exact match)
  5. fetch an artist record with a search term (globbing / LIKE / etc)
  6. fetch CD records by matching on a partial *artist's* name, using a cursor if possible.
  7. fetch unique CD records by matching on a partial artist's *or* partial CD name, using a cursor if possible.
  8. update a record or two
  9. delete some records

And a couple of maintenance scripts:

  1. re-org the database (clean up IDs and/or garbage collect) - may load the entire database into core and write the result to a second database if required. May make arbitrary decisions about what constitutes sane starting points, for example leaving behind artists with no CDs.
  2. migrate the database to a second version of the schema, that supports two types of CDs - multi-artist and single-artist, allowing artist associations on each track or a single artist per CD. Again, may do a complete load and write the result to a second database if required. Ideally the test script should still work after migrating to the new version of the schema, but if not, submit a modified script.

If you feel there is some other important data management task or use case that your framework performs well or enables gracefully, that I have overlooked, then discuss it NOW (do NOT continue to cross-post; use the poop-group list!) so that we can add it to the list as early as possible.

Entries must be released under the Open Publication License (if necessary, in addition to any other license). You may use the following licence text:

  Copyright 2004, Alan B'Stard. All rights reserved.

  This code is free software; you may use it and/or redistribute it
  under the same terms as Perl itself, or under the terms of the Open
  Documentation License, available at http://www.opencontent.org/openpub/

Please feel free to post the entries to the most relevant SINGLE mailing list (in case it is still not clear, DO NOT CROSS-POST HUGE AMOUNTS OF CODE OR DISCUSSION OR YOU WILL BE PUBLICLY HUMILIATED!). I will be collecting the entries and getting them to work on a test server. Those keen are welcome to login accounts on that server, which will have access to test instances of mysql/MyISAM & InnoDB, and Pg; other databases will be made available according to interest and people to set them up. Please advise me if you are posting to a list not in the To: or Cc: of this message.

Solving all the problems will greatly enhance the reader's experience. While the material isn't just being copied straight into the book, it is excellent worked research. All code submitted by the publication deadline will be available via the oreilly.com page for the book. Unfortunately deadlines are tight, so get your entries in as soon as possible! The results will probably go up on the poop.sourceforge.net site as well (Dave?), so late submissions are better than never - though they won't end up entombed in the history of the book ;).

Simon has already started the entry for Class::DBI, although I'd appreciate it if someone with a bit more CDBI experience than myself could spare some time to work on the remaining scripts or review my solutions.

I will be happy to work on the entry for Tangram, though I'd like to hear something from you folk @state51, as you've dealt quite a bit in this problem space ;-). Maybe you could provide us with a nice large-ish sample subset of your record database that I can munge into shape and put on the public server?

Pixie and Class::Persist are getting entries whether they like it or not, but I couldn't find a mailing list for you Fotango folk. I'd also appreciate it if someone with Pixie experience could provide an elegant solution to the indexing and text search problems so their entries can be complete. I'm also very keen to get submissions from Alzabo and SPOPS/ESPOPS experts.

-- 
Sam Vilain, sam /\T vilain |><>T net, PGP key ID: 0x05B52F13 (include my PGP key ID in personal replies to avoid spam filtering)
|
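Simon's Class::DBI entry mentioned above is not shown in this thread; purely as an illustrative sketch, the first few test-script items might look something like the following with Class::DBI. The class names, DSN, and column lists here are assumptions (the tables are assumed to already exist in the SQLite file), not the actual book entry.

```perl
package Music::DBI;
use strict;
use warnings;
use base 'Class::DBI';
# Assumed DSN; any DBI-supported database would do.
Music::DBI->connection('dbi:SQLite:dbname=music.db', '', '');

package Music::Artist;
use base 'Music::DBI';
Music::Artist->table('artist');
Music::Artist->columns(All => qw/artistid name/);   # first column is the primary key
Music::Artist->has_many(cds => 'Music::CD');

package Music::CD;
use base 'Music::DBI';
Music::CD->table('cd');
Music::CD->columns(All => qw/cdid artist title year/);
Music::CD->has_a(artist => 'Music::Artist');

package main;

# Challenge items 1-5, roughly:
my $artist = Music::Artist->create({ artistid => 1, name => 'U2' });  # 1. create
print "inserted artist ", $artist->id, "\n";                          # 2. print ID
my $same  = Music::Artist->retrieve(1);                               # 3. fetch by ID
my @exact = Music::Artist->search(name => 'U2');                      # 4. exact match
my @fuzzy = Music::Artist->search_like(name => 'U%');                 # 5. LIKE search
```

The `has_a`/`has_many` declarations also give `$cd->artist` and `$artist->cds` accessors, which is the relationship style the SPOPS discussion later in this thread keeps circling around.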
From: Chris W. <ch...@cw...> - 2004-11-29 02:01:02
|
This weekend I introduced a feature to the INI parser to allow inlining external files:

  [Global]
  promote_oi = yes
  ConfigurationRevision = $Revision: 1.48 $
  timezone = America/New_York
  @INCLUDE = server_database.ini
  @INCLUDE = server_caching.ini
  ...

Question: do you think splitting up the server.ini file into separate files is more confusing or less confusing for most users? (For reference: server.ini currently clocks in at 574 lines with comments.) I'd plan on putting the options most-often modified in the main file and then separate files by subject after that.

I think Beta 5 will be along very soon (next few days-ish), and then I can get to this SPOPS stuff from Simon and Ray I've been neglecting. (Sorry about that.)

Thanks,

Chris

-- 
Chris Winters
Creating enterprise-capable snack systems since 1988
|
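The @INCLUDE directive above suggests a simple preprocessing pass over the INI text before parsing. Below is a hypothetical, self-contained sketch of how such a directive might be expanded; this is not the actual OpenInteract implementation, and the %files hash stands in for the filesystem.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# %files maps filename => raw INI content (a stand-in for reading from disk).
my %files = (
    'server.ini'          => "[Global]\npromote_oi = yes\n\@INCLUDE = server_database.ini\n",
    'server_database.ini' => "[Database]\ndriver = mysql\n",
);

# Recursively splice each @INCLUDE'd file in place of its directive line.
# The %$seen guard dies on include loops (and, in this simple sketch,
# also on legitimate double inclusion).
sub expand_includes {
    my ($name, $seen) = @_;
    $seen ||= {};
    die "include loop at $name\n" if $seen->{$name}++;
    my @out;
    for my $line (split /\n/, $files{$name}) {
        if ($line =~ /^\@INCLUDE\s*=\s*(\S+)/) {
            push @out, expand_includes($1, $seen);
        }
        else {
            push @out, $line;
        }
    }
    return @out;
}

my $merged = join "\n", expand_includes('server.ini');
print $merged, "\n";
```

After expansion the merged text can be handed to the ordinary INI parser, so the split-file layout stays invisible to the rest of the server code.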
From: Ray Z. <rz...@co...> - 2004-11-15 15:50:52
|
On Nov 11, 2004, at 8:40 PM, Vsevolod (Simon) Ilyushchenko wrote:
> Okay, this is my last stumbling block. I just don't get why you want to sometimes refer to the object and sometimes to the id. If it's implemented this way, I'll have to say $author->book->id instead of $author->book_id for auto/lazy, if I want to just get the id value. Also, if I switch from auto/lazy to manual fetch, I'll have to change my code.

But you'll have to change code anyway if you switch from auto/lazy to manual, if you have any code that accesses the book object (since it will no longer be auto/lazy-fetched). I view the specification of auto/lazy vs manual as part of specifying the 'type' of the field. If you change the 'type' you should expect to have to change code in order to get at the same data (e.g. the id).

If I understand your proposal, you would always have the field hold the id and have it store the auto/lazy-fetched object into some other key in the parent object? What key? This approach requires you to specify, for each auto/lazy-fetched field, another name by which you access the object. You have to worry about inconsistency between the id and the object. When you go to save, if the id in the field doesn't match the id of the object in the corresponding key, which do you save? I just think it is simpler and more straightforward to use the single field and think of auto/lazy vs manual as part of the type definition for that field.

> Can you give me an example of manual fetch that cannot be implemented otherwise or that makes life more convenient? Manual remove, OTOH, is clearly different from the other two methods of removal.

Well, I think the example that I gave in a previous e-mail ... wait, I now see that that example made no sense. Sorry. For some reason I was thinking that the remove spec appeared inside the fetch, not at the same level. Nevermind.

OK, I think I understand your point. If you use separate field names for the id (persistent field) and the auto/lazy-fetched object (temp non-persistent field), I suppose there is no need for manual fetch. But if you use the same field for the id and the auto-fetched object, then you DO need the manual option. I still think this is a cleaner approach, since you're not cluttering up your object with extra fields unnecessarily, and it allows you to treat some fields as ids (manual) and others as objects (auto/lazy), which I find useful conceptually.

>> I think the only thing you need to do is ensure that you don't have any auto-fetching loops (might even want to include lazy-fetching) when you do the configuration. In other words, you don't want to have a book auto-fetch its list of authors AND have the author set to auto-fetch its book, creating an infinite loop. Even if you use a cache this would cause circular references which cause a problem for garbage collection unless you use weak references.
>
> Hmmm... circular references will actually come up even without circular configuration. If an author refers to a list of books, and the books refer back to the author, there you have it...

Right, I agree. After thinking about this a bit more, I think there are two separate issues here. They are related to one another, and both affect, or are affected by, the presence or absence of caching. One is the circular configuration which could result in auto-fetching loops, and the other is circular references in the objects which prevent automatic garbage collection.

I think the first one should be handled by SPOPS, and it should be at the earliest possible stage (configuration of the class if possible, run-time if not). I think the second one is outside the scope of SPOPS. The developer needs to be aware of which objects hold references to which other objects. It is certainly possible for an object A to reference B, which references C, which in turn references A, causing a circular reference involving more than just two classes. Another thing to keep in mind is that sometimes circular references are needed/desired, but the developer needs to handle the breaking of the circular ref manually before destruction. I don't think it is the job of SPOPS to prevent circular references. When implementing an SPOPS cache it may be possible to handle circular references in an intelligent way during flushing of the cache, for example, but I view this as a separate issue that should not be mixed in with the new has-a semantics.

> I'm currently doing this checking during saving/removing objects. The configuration allows loops - I'm afraid that if this is prohibited at the configuration stage, the potential for error is too great.

Are you talking about circular references or auto-fetch loops? If the latter, why is it more error-prone at the configuration stage? I think the circular reference issue is fundamentally a run-time issue, since it deals with instances of objects referring to one another, and for the reasons stated above, I don't think SPOPS should try to tackle this except maybe w.r.t. caching. On the other hand, I think the auto-fetch loop issue is about class (as opposed to instance) behavior and is therefore fundamentally a configuration issue. No class should be able to auto-fetch a field which, through further auto-fetch configuration, ends up auto-fetching an object of the original class.

It may be easier to implement this checking at run-time, but it is still a configuration-level issue, and I think it should be handled during configuration if possible. In other words, if a developer creates an auto-fetch loop, he has a bug. If the checking is done at the config stage, the bug will be exposed the first time the classes are created. If the checking is done during fetching, the class might be used successfully for a while without the bug being exposed, if the particular field in question isn't populated, for example. It's always best to detect errors at the earliest possible stage.

>> But since you can put a 'has_many' in Book and a 'has_a' in Author, for example, where Author has a 'book' field, I think they can be inconsistent. In my proposal, for the 'has_a' in Author, you either specify a forward or reverse direction with no way to specify something conflicting in the Book class.
>
> Who's to stop the developer from saying that Author has_a Book and Book has_a Author?

True, true. But I think of a 'has_many' and a reverse 'has_a' as two ways of defining the behavior of a single uni-directional link between the two classes (the Author's book field). In that sense, allowing both makes it possible to define contradicting behavior for that single link. I view your example, on the other hand, as defining the behavior of two different uni-directional (and obviously circular) links. I don't think this is a contradictory configuration, but it does require SPOPS to ensure that it doesn't create an infinite auto-fetch loop, and the developer needs to be aware of the implications for circular references (which will depend on manual vs lazy and the cache implementation).

Ray Zimmerman
Director, Laboratory for Experimental Economics and Decision Research
428-B Phillips Hall, Cornell University, Ithaca, NY 14853
phone: (607) 255-9645
|
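The configuration-time loop check argued for above can be sketched as a depth-first walk over the relationship graph. This is illustrative pseudocode in plain Perl, not SPOPS code; the class names and the shape of %config are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical has_a configuration: class => { field => spec }.
my %config = (
    'My::Author' => { book   => { class => 'My::Book',   fetch => 'auto' } },
    'My::Book'   => { editor => { class => 'My::Person', fetch => 'lazy' } },
    'My::Person' => { fave   => { class => 'My::Author', fetch => 'auto' } },
);

# Depth-first walk along auto/lazy edges; returns the loop path if a
# class can auto-fetch its way back to itself, undef otherwise.
sub find_fetch_loop {
    my ($class, $path) = @_;
    $path ||= [];
    return [ @$path, $class ] if grep { $_ eq $class } @$path;
    for my $spec (values %{ $config{$class} || {} }) {
        next unless $spec->{fetch} =~ /^(?:auto|lazy)$/;   # 'manual' breaks the chain
        my $loop = find_fetch_loop($spec->{class}, [ @$path, $class ]);
        return $loop if $loop;
    }
    return undef;
}

my $loop = find_fetch_loop('My::Author');
print $loop
    ? 'auto-fetch loop: ' . join(' -> ', @$loop) . "\n"
    : "configuration is loop-free\n";
```

Running the check once, when the classes are configured, surfaces the bug on the first startup rather than on the first unlucky fetch, which is exactly the argument made above.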
From: Vsevolod (S. I. <si...@cs...> - 2004-11-12 01:40:29
|
Ray,

> Right. Except that 'list_of_authors' in $book implies that you've specified a reverse fetch of the book field in author, so $author->{book} is just an id, not an object. More on object vs id below.

Okay, this is my last stumbling block. I just don't get why you want to sometimes refer to the object and sometimes to the id. If it's implemented this way, I'll have to say $author->book->id instead of $author->book_id for auto/lazy, if I want to just get the id value. Also, if I switch from auto/lazy to manual fetch, I'll have to change my code.

Can you give me an example of manual fetch that cannot be implemented otherwise or that makes life more convenient? Manual remove, OTOH, is clearly different from the other two methods of removal.

> But I don't think there is anything in the new has_a design that REQUIRES one to maintain consistency through an SPOPS-level cache, right?

Aww, who am I to argue with developers? I'll make sure both cases work.

> I think the only thing you need to do is ensure that you don't have any auto-fetching loops (might even want to include lazy-fetching) when you do the configuration. In other words, you don't want to have a book auto-fetch its list of authors AND have the author set to auto-fetch its book, creating an infinite loop. Even if you use a cache this would cause circular references which cause a problem for garbage collection unless you use weak references.

Hmmm... circular references will actually come up even without circular configuration. If an author refers to a list of books, and the books refer back to the author, there you have it...

> I think this checking is important, but I haven't honestly given any thought to how to implement it.

I'm currently doing this checking during saving/removing objects. The configuration allows loops - I'm afraid that if this is prohibited at the configuration stage, the potential for error is too great.

> I vote once again that we stick with my proposal to use 'fetch_' prepended to the name of the field, by default, and allow an option to explicitly specify a method name.

Okay.

>> This should not be a problem, because in my current proposal the programmer specifies either has_a or has_many (which implies the reverse has_a), so no conflicts should be possible. However, if we change the syntax, this issue will go away.
>
> But since you can put a 'has_many' in Book and a 'has_a' in Author, for example, where Author has a 'book' field, I think they can be inconsistent. In my proposal, for the 'has_a' in Author, you either specify a forward or reverse direction with no way to specify something conflicting in the Book class.

Who's to stop the developer from saying that Author has_a Book and Book has_a Author?

>> 'Link_class' refers to the Perl class name, 'link_class_alias' - to the method name used to retrieve its instances (this is your 'list_field' in the 'link' hash).
>
> But can't you always get the one, given the other?

No, the alias can be overridden by the user.

Simon

-- 
Simon (Vsevolod Ilyushchenko)   si...@cs...
http://www.simonf.com

Terrorism is a tactic and so to declare war on terrorism is equivalent to Roosevelt's declaring war on blitzkrieg.
Zbigniew Brzezinski, U.S. national security advisor, 1977-81
|
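The object-vs-id distinction debated above ($author->book_id as the raw id, $author->book as the fetched object) can be made concrete with a small sketch. This is illustrative code with invented class names, not the SPOPS API: the id lives in its own persistent field, and the lazily fetched object is cached in a separate non-persistent key, so the stored id is never overwritten.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::Author;

sub new { my ($class, %f) = @_; bless {%f}, $class }

sub book_id { $_[0]->{book_id} }    # always the raw id, no fetch triggered

sub book {                          # lazy fetch, cached in a separate key
    my $self = shift;
    $self->{_book_obj} ||= My::Book->fetch($self->{book_id});
    return $self->{_book_obj};
}

package My::Book;

my $fetch_count = 0;
# Stub fetcher; a real framework would run a database query here.
sub fetch { my ($class, $id) = @_; $fetch_count++; bless { id => $id }, $class }
sub fetch_count { $fetch_count }

package main;

my $author = My::Author->new(book_id => 42);
print $author->book_id, "\n";   # id only, no fetch
my $b1 = $author->book;         # first call fetches
my $b2 = $author->book;         # second call hits the cache
printf "fetches: %d, same object: %s\n",
    My::Book::fetch_count(), ($b1 == $b2 ? 'yes' : 'no');
```

This is the "separate field names" style; Ray's counter-proposal stores either the id or the object in the single field, with auto/lazy/manual deciding which.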
From: Ray Z. <rz...@co...> - 2004-11-08 23:13:58
|
Simon,

>> And you say this full cache is necessary for the consistency of circular references. You mean to avoid infinite loops when you have A set to auto-fetch B which is set to auto-fetch A? Seems to me that you should be able to detect this type of thing when the classes are configured. While I think a cache is a nice option that I may very well use, I don't think it should be mandatory unless it's absolutely necessary. Can you give me an example where consistency makes it absolutely necessary?
>
> For example, you have a Book object with many Authors. If the application loads a book with the list of authors, adds another author to this book and asks the new author about its parent book, the current SPOPS implementation will re-fetch the book object, potentially ignoring the changes that were made to the original book object.
>
> However, I only now realized that you suggest saving both the parent-to-child reference and the reverse reference in the object fields. (I distinguish between $author->{book_id}, a number, and $author->{book}, an object. Let me know if I understood this correctly.)

I think you've got it right, except I'm not sure what you mean about the reverse reference case. Maybe the myA vs fetch_myA stuff below clarifies what I'm suggesting.

> Thus, once $book->{list_of_authors} is populated, adding a new author to the book should add the new object to this list, plus it should set the field $author->{book} to the original book object. This will make the above situation impossible.

Right. Except that 'list_of_authors' in $book implies that you've specified a reverse fetch of the book field in author, so $author->{book} is just an id, not an object. More on object vs id below.

> However, inconsistencies still may occur if:
>
> 1) I create a new author-book relationship by setting the field $author->{book_id} instead of saying $book->add_author($author). This can be discouraged as an incorrect way of altering data, of course, but logically both make sense, and I'd like to be able to use them both.
>
> Or, 2) if I am working with a second relationship, say books-to-artists (illustrators). In this case, in one place in my code, I could retrieve a book object by saying $artist->book, and then in another place I'll call $author->book, and even though they may refer to the same book, they will always be two different objects.

Right. Without a cache these inconsistencies are always possible. But this is a "global" issue with OOP frameworks like SPOPS. I think Chris has taken the right approach in letting the developer decide at the application level whether s/he needs to always maintain that consistency, and if so, whether to use SPOPS-level caching or application-level caching/logic to ensure consistency. (And I understand there was a bug in SPOPS caching which prevented this from working correctly.)

But I don't think there is anything in the new has_a design that REQUIRES one to maintain consistency through an SPOPS-level cache, right? I think the only thing you need to do is ensure that you don't have any auto-fetching loops (might even want to include lazy-fetching) when you do the configuration. In other words, you don't want to have a book auto-fetch its list of authors AND have the author set to auto-fetch its book, creating an infinite loop. Even if you use a cache, this would cause circular references which cause a problem for garbage collection unless you use weak references. I think this checking is important, but I haven't honestly given any thought to how to implement it.

> So looks like cache is still necessary.

Only if you need to guarantee a single in-memory copy for a process. I argue that this is an arbitrary requirement. In a read-only environment, it really doesn't matter (except for resource usage) if you have multiple copies of the same object in memory. And in a web environment, even using a simple cache doesn't guarantee consistency across multiple processes (apache children), so you still need a higher-level synchronization mechanism to ensure consistency.

>> It was for completeness and to offer a mode that is equivalent to current has_a behavior, that is, the field normally just gives you an id, but you also have a convenience method for fetching the object as well. My idea was that any 'has_a' spec, including 'manual', would create convenience methods for fetching the related objects. The 'auto' and 'lazy' options would simply call these methods automatically at the appropriate time and stash the return values in the object. So in the way I was picturing things, implementing 'manual' would simply be the first step in implementing 'auto' and 'lazy'.
>
> I feel dumb - I still don't quite get it. However, in your original examples the method X->myA returns the id of A in the case of manual fetch and A itself in the case of lazy/auto fetch, right? In my view, X->myA always returns the id and X->fetch_myA always returns the object (I tend to use them like $author->book_id and $author->book in my applications). So there is no need for manual fetching.
>
> I think that having X->myA return inconsistent values may be confusing. Let me know what you think. Perhaps I am still missing the utility of manual fetching.

My thought was that specifying 'auto' or 'lazy' is equivalent to saying "this field is an object". Specifying 'manual' is equivalent to saying "this field is an object id". So X->myA always returns the value stored in the field, and X->fetch_myA always returns the object.

>> Without the manual option, you can't specify a relationship at all without having it define auto-fetching behavior. You can't, for example, auto-remove an object without having it also auto-fetched (which I can imagine you might want if you typically only need to deal with the ID of the secondary object).
>
> But in this case you still have to fetch the dependent object, because it may define its own rules of auto-removal of even more objects.

But the fetch only happens for the purpose of correctly doing the remove ... the 'manual' specifier still means that the field holds an object id, not an object.

>> Just curious, does your implementation of 'auto' generate a public 'fetch_myA' method, for example?
>
> See above - even if the fetch method name ('alias' in the current terminology) is not specified in the configuration, it'll be auto-created by using the name given to the target class. (I mean the name of the config hash key for the target class, not its Perl name. In the example I sent you, X_alias is such a name.)

The problem I see with this is that it generates clashes when you have multiple fields of the same class. We need a method name that is unique for the field we want to fetch, not just for the class we use to fetch it. I think using the class alias is left over from the old has_a config which used the class as the hash key (which you agreed is detestable :-). I vote once again that we stick with my proposal to use 'fetch_' prepended to the name of the field, by default, and allow an option to explicitly specify a method name.

>>> OTOH, there are three types of removes - 'auto', 'manual' and 'forget'. 'Auto' means complete removal of dependent objects, 'forget' - nullifying id fields pointing to the removed objects, and 'manual' - no action. The default should logically be 'forget', but it may conflict with no autosaving, so I'll have to set it to 'manual'.
>>
>> OK, but what is the 'reverse_remove'? Is specifying 'reverse_remove' => 'forget' in a 'has_a' the same as specifying 'remove' => 'forget' in the corresponding 'has_many'? If so, which one takes precedence if they are inconsistent? It looks like 'reverse_remove' => 'forget' is equivalent to what I called 'null_by', right? I personally think that having multiple (and possibly conflicting) ways/places of defining the behavior for a single relationship is asking for trouble. I think it will make it difficult to write correct and clear documentation and it will create some debugging nightmares. (More on this below)
>
> This should not be a problem, because in my current proposal the programmer specifies either has_a or has_many (which implies the reverse has_a), so no conflicts should be possible. However, if we change the syntax, this issue will go away.

But since you can put a 'has_many' in Book and a 'has_a' in Author, for example, where Author has a 'book' field, I think they can be inconsistent. In my proposal, for the 'has_a' in Author, you either specify a forward or reverse direction with no way to specify something conflicting in the Book class.

>> Why do you include both 'link_class' and 'link_class_alias'? Aren't they redundant? (see [1] below).
>
> 'Link_class' refers to the Perl class name, 'link_class_alias' - to the method name used to retrieve its instances (this is your 'list_field' in the 'link' hash).

But can't you always get the one, given the other?

>> And I suppose the 'table' is only necessary if you don't specify the 'link_class' and vice versa, right?
>
> Yup. I am a little unhappy that in your proposal one has to have a Perl class for the linking table even if one is never going to use it, but I guess this is necessary for the sake of the uniform syntax.

Which is why I wouldn't protest too much if we decided to leave the old 'links_to' syntax in untouched, at least for the time being. You would only need to define the linking class if you needed the auto-fetching/removing behavior.

>> [1] I confess I never really did understand the purpose of the alias. What is the difference between the alias and the class? Isn't one of them redundant?
>
> The alias is used to generate access methods in other classes referring to this one. In your configuration examples you always give a value to the 'name' key, but if it's omitted, methods are given names like 'fetch_X_alias'.

Ah ... right ... detestable :-) Let's use something tied to the field name, not the class, as I mentioned above.

Ray Zimmerman
Director, Laboratory for Experimental Economics and Decision Research
428-B Phillips Hall, Cornell University, Ithaca, NY 14853
phone: (607) 255-9645
|
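Ray's field-based naming scheme above ('fetch_' prepended to the field name, rather than a class-derived alias) can be sketched with closures and typeglobs. This is illustrative code with invented class names, not the SPOPS implementation; the fetcher is a stub standing in for a real database query. Note that two fields sharing a target class get distinct methods, which is exactly the clash the class-alias scheme cannot avoid.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::Author;

# Hypothetical has_a map: field name => target class.
my %has_a = (
    book      => 'My::Book',
    publisher => 'My::Publisher',   # same pattern, no method-name clash
);

sub new { my ($class, %f) = @_; bless {%f}, $class }

# Install one fetch_<field> method per has_a field via a closure.
for my $field (keys %has_a) {
    my $target = $has_a{$field};
    no strict 'refs';
    *{"fetch_$field"} = sub {
        my $self = shift;
        # Stand-in for $target->fetch($self->{$field}) hitting the database.
        return { class => $target, id => $self->{$field} };
    };
}

package main;

my $author = My::Author->new(book => 42, publisher => 7);
my $book = $author->fetch_book;
print "fetch_book -> $book->{class} id $book->{id}\n";
```

The same generation step could honor an optional per-field override of the method name, which is the escape hatch Ray proposes for awkward field names.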
From: Vsevolod (S. I. <si...@cs...> - 2004-11-07 19:15:31
|
Ray, I've mulled over your comments, and I think I'm starting to see the light. :) I've looked at other OOP frameworks (Alzabo, Tangram, Hibernate, XORM), and they all seem to define the relationship in the class that has the ID field which refers to the other class. I continue the discussion of cache and the 'manual' keyword below, but when we converge to an opinion on those, I'll redo the code your way. It's less work than it seems. However, Chris has not replied yet. I'll ask him again about the compatibility issue. However, if I simply extend the has_a syntax the way you propose and smartly determine whether the old or the new syntax is used (I do it anyway now), it should not be a problem. > I assume that's why you changed the key values in the 'has_a' spec to > the class name instead of the field name? But isn't this a problem if > you have multiple has_a fields of the same class, with different > behavior for fetching/saving/removing? Or do you just use an arrayref of > hashrefs instead of the single hashref in that case? Blech. I agree, it's detestable. > And you say this full cache is necessary for the consistency circular > references. You mean to avoid infinite loops when you have A set to > auto-fetch B which is set to auto-fetch A? Seems to me that you should > be able to detect this type of thing when the classes are configured. > While I think a cache is a nice option that I may very well use, I don't > think it should be mandatory unless it's absolutely necessary. Can you > give me an example where consistency makes it absolutely necessary? For example, you have a Book object with many Authors. If the application loads a book with the list of authors, adds another author to this book and asks the new author about its parent book, the current SPOPS implementation will re-fetch the book object, potentially ignoring the changes that were made to original book object. 
However, I only now realized that you suggest saving both the parent-to-child reference and the reverse reference in the object fields. (I distinguish between $author->{book_id}, a number, and $author->{book}, an object. Let me know if I understood this correctly.) Thus, once $book->{list_of_authors} is populated, adding a new author to the book should add the new object to this list, plus it should set the field $author->{book} to the original book object. This will make the above situation impossible. However, inconsistencies still may occur if: 1) I create a new author-book relationship by setting the field $author->{book_id} instead of saying $book->add_author($author). This can be discouraged as an incorrect way of altering data, of course, but logically both make sense, and I'd like to be able to use them both. Or, 2) if I am working with a second relationship, say books-to-artists (illustrators). In this case, in one place in my code, I could retrieve a book object by saying $artist->book, and then in another place I'll call $author->book, and even though they may refer to the same book, they will always be two different objects. So looks like cache is still necessary. > I've used only application level caching, never SPOPS-level caching so > I'm not clear on how this works. Does this mean that SPOPS objects no > longer go out of scope and get destroyed until the program ends or the > cache is manually flushed? In a mod_perl environment then, do you flush > the cache at the end of every request or what? Seems like you could get > some huge apache children pretty quickly if you're not careful. It has to be flushed, of course. > It was for completeness and to offer a mode that is equivalent to > current has_a behavior, that is, the field normally just gives you an > id, but you also have a convenience method for fetching the object as > well. 
> My idea was that any 'has_a' spec, including 'manual', would
> create convenience methods for fetching the related objects. The 'auto'
> and 'lazy' options would simply call these methods automatically at the
> appropriate time and stash the return values in the object. So in the
> way I was picturing things, implementing 'manual' would simply be the
> first step in implementing 'auto' and 'lazy'.

I feel dumb - I still don't quite get it. In your original examples, though, the method X->myA returns the id of A in the case of manual fetch and A itself in the case of lazy/auto fetch, right? In my view, X->myA always returns the id and X->fetch_myA always returns the object (I tend to use them like $author->book_id and $author->book in my applications). So there is no need for manual fetching. I think that having X->myA return inconsistent values may be confusing. Let me know what you think - perhaps I am still missing the utility of manual fetching.

> Without the manual option, you can't specify a relationship at all
> without having it define auto-fetching behavior. You can't, for example,
> auto-remove an object without having it also auto-fetched (which I can
> imagine you might want if you typically only need to deal with the ID of
> the secondary object).

But in this case you still have to fetch the dependent object, because it may define its own rules for auto-removal of even more objects.

> Just curious, does your implementation of 'auto' generate a public
> 'fetch_myA' method, for example?

See above - even if the fetch method name ('alias' in the current terminology) is not specified in the configuration, it will be auto-created from the name given to the target class. (I mean the name of the config hash key for the target class, not its Perl name. In the example I sent you, X_alias is such a name.)

>> Autosaving is always off by default, to preserve compatibility.
>
> I'm not sure I follow.
> Auto-fetching is new, so there is no previous
> corresponding save behavior to be backward compatible with. Classes
> defined without any auto-fetch/auto-remove behavior could behave as
> always. Classes defining new auto behavior could have whatever default
> 'save' behavior we think makes sense. So I'm not sure there is a
> backward compatibility issue here.
> And the save behavior described in my updated proposal posted 4 Jan 2002
> still seems to be the most consistent and make the most sense to me.

Yes, if we change the syntax, we will be free to follow your rules.

>> OTOH, there are three types of removes - 'auto', 'manual' and
>> 'forget'. 'Auto' means complete removal of dependent objects, 'forget'
>> - nullifying id fields pointing to the removed objects, and 'manual' -
>> no action. The default should logically be 'forget', but it may
>> conflict with no autosaving, so I'll have to set it to 'manual'.
>
> OK, but what is the 'reverse_remove'? Is specifying 'reverse_remove' =>
> 'forget' in a 'has_a' the same as specifying 'remove' => 'forget' in the
> corresponding 'has_many'? If so, which one takes precedence if they are
> inconsistent? It looks like 'reverse_remove' => 'forget' is equivalent to
> what I called 'null_by', right? I personally think that having multiple
> (and possibly conflicting) ways/places of defining the behavior for a
> single relationship is asking for trouble. I think it will make it
> difficult to write correct and clear documentation and it will create
> some debugging nightmares. (More on this below)

This should not be a problem, because in my current proposal the programmer specifies either has_a or has_many (which implies the reverse has_a), so no conflicts should be possible. However, if we change the syntax, this issue will go away.

> Why do you include both 'link_class' and 'link_class_alias'? Aren't they
> redundant? (see [1] below).
'Link_class' refers to the Perl class name; 'link_class_alias' refers to the method name used to retrieve its instances (this is your 'list_field' in the 'link' hash).

> And I suppose the 'table' is only necessary if you don't specify the
> 'link_class' and vice versa, right?

Yup. I am a little unhappy that in your proposal one has to have a Perl class for the linking table even if one is never going to use it, but I guess this is necessary for the sake of a uniform syntax.

> * didn't see any mention of the 'name' option for explicitly specifying
> the name of generated methods

It's called 'alias'.

> * not clear to me what auto/lazy fetching, auto removing, etc options
> are implemented for links_to

If we use your syntax, they will be the same as in the fetch_by case.

> [1] I confess I never really did understand the purpose of the alias.
> What is the difference between the alias and the class? Isn't one of
> them redundant?

The alias is used to generate access methods in other classes referring to this one. In your configuration examples you always give a value to the 'name' key, but if it's omitted, methods are given names like 'fetch_X_alias'.

Simon

--
Simon (Vsevolod ILyushchenko)          si...@cs...
http://www.simonf.com

Terrorism is a tactic and so to declare war on terrorism
is equivalent to Roosevelt's declaring war on blitzkrieg.

Zbigniew Brzezinski, U.S. national security advisor, 1977-81
From: Vsevolod (S. I. <si...@cs...> - 2004-11-02 14:46:39
> I've been trying to keep an open mind toward the differences between
> your implementation and my proposed design, adopting a wait and see
> approach. I hesitate to state my opinions too strongly since you're
> contributing useful code and I'm just talking about design. On the other
> hand, I suppose you probably want honest input.

Ray,

Honest input is, of course, appreciated. I'll look over your questions over the weekend, but it seems that compatibility is the major question, as it drives the syntax that you don't like. I had to stay compatible by default, and I think only Chris can authorize a new syntax, so I would also like to hear what he has to say. My implementation is not written in stone - I personally don't really care what the syntax is as long as it provides the full functionality.

Simon

--
Simon (Vsevolod ILyushchenko)          si...@cs...
http://www.simonf.com

Terrorism is a tactic and so to declare war on terrorism
is equivalent to Roosevelt's declaring war on blitzkrieg.

Zbigniew Brzezinski, U.S. national security advisor, 1977-81
From: Ray Z. <rz...@co...> - 2004-11-01 23:18:17
Hi Simon,

On Oct 31, 2004, at 8:32 PM, Vsevolod (Simon) Ilyushchenko wrote:

> My current progress - code pretty much finalized, and a .t file with
> 64 tests (I'll have to add a few more tests to cover all combinations
> of features). I'd like to do some more internal testing before I send
> the official patch, but if anyone has time to play with the new
> features (yeah, right :), I'd be happy to share it.

Cool. I'd like to check it out.

> It's completely backwards compatible.

Is this a good thing? :-) (more on this below)

I assume that's why you changed the key values in the 'has_a' spec to the class name instead of the field name? But isn't this a problem if you have multiple has_a fields of the same class, with different behavior for fetching/saving/removing? Or do you just use an arrayref of hashrefs instead of the single hashref in that case? Blech.

For example, I'm thinking of something like a Transfer object that has a 'from_account' and a 'to_account' field, both of class 'Account', where you want to auto-fetch the 'from_account' and lazy-fetch (or manually fetch) the 'to_account'.

This is something I did not like about the original has_a spec. It seems so backwards to me. I want to specify relationship behavior for each individual field. Whether or not two has_a fields belong to the same class is irrelevant, and therefore (to me) it makes no sense to organize them by class.

> > Just curious, at what level are you implementing this cache? Is it
> > *always* used for all SPOPS objects? Only for DBI? Only for DBI when
> > you are using the new functionality?
>
> It's only for DBI.

And you say this full cache is necessary for the consistency of circular references. You mean to avoid infinite loops when you have A set to auto-fetch B which is set to auto-fetch A? Seems to me that you should be able to detect this type of thing when the classes are configured.
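The Transfer example above might be written, under the field-keyed spec Ray proposes, roughly like this. The keys, class names, and option values are illustrative assumptions about the proposed syntax, not actual SPOPS configuration vocabulary:

```perl
# Hypothetical field-keyed has_a spec for the Transfer example:
# two fields of the same class, each with its own fetch behavior.
has_a => {
    from_account => { class => 'My::Account', fetch => 'auto' },
    to_account   => { class => 'My::Account', fetch => 'lazy' },
},
```

Keying on the field name rather than the class name is what makes this per-field behavior expressible at all; with a class-keyed spec, both Account fields would have to share one entry.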
While I think a cache is a nice option that I may very well use, I don't think it should be mandatory unless it's absolutely necessary. Can you give me an example where consistency makes it absolutely necessary?

I've used only application-level caching, never SPOPS-level caching, so I'm not clear on how this works. Does this mean that SPOPS objects no longer go out of scope and get destroyed until the program ends or the cache is manually flushed? In a mod_perl environment then, do you flush the cache at the end of every request or what? Seems like you could get some huge apache children pretty quickly if you're not careful.

> I actually did not implement manual loading - I just can't see where
> it can be used (and a bit fuzzy on how it would work). So there is
> only auto and lazy loading.

It was for completeness and to offer a mode that is equivalent to current has_a behavior; that is, the field normally just gives you an id, but you also have a convenience method for fetching the object as well. My idea was that any 'has_a' spec, including 'manual', would create convenience methods for fetching the related objects. The 'auto' and 'lazy' options would simply call these methods automatically at the appropriate time and stash the return values in the object. So in the way I was picturing things, implementing 'manual' would simply be the first step in implementing 'auto' and 'lazy'.

Without the manual option, you can't specify a relationship at all without having it define auto-fetching behavior. You can't, for example, auto-remove an object without having it also auto-fetched (which I can imagine you might want if you typically only need to deal with the ID of the secondary object).

Just curious, does your implementation of 'auto' generate a public 'fetch_myA' method, for example?

> Autosaving is always off by default, to preserve compatibility.

I'm not sure I follow.
Auto-fetching is new, so there is no previous corresponding save behavior to be backward compatible with. Classes defined without any auto-fetch/auto-remove behavior could behave as always. Classes defining new auto behavior could have whatever default 'save' behavior we think makes sense. So I'm not sure there is a backward compatibility issue here.

And the save behavior described in my updated proposal posted 4 Jan 2002 still seems to me to be the most consistent and to make the most sense.

> OTOH, there are three types of removes - 'auto', 'manual' and
> 'forget'. 'Auto' means complete removal of dependent objects, 'forget'
> - nullifying id fields pointing to the removed objects, and 'manual' -
> no action. The default should logically be 'forget', but it may
> conflict with no autosaving, so I'll have to set it to 'manual'.

OK, but what is the 'reverse_remove'? Is specifying 'reverse_remove' => 'forget' in a 'has_a' the same as specifying 'remove' => 'forget' in the corresponding 'has_many'? If so, which one takes precedence if they are inconsistent? It looks like 'reverse_remove' => 'forget' is equivalent to what I called 'null_by', right? I personally think that having multiple (and possibly conflicting) ways/places of defining the behavior for a single relationship is asking for trouble. I think it will make it difficult to write correct and clear documentation and it will create some debugging nightmares. (More on this below.)

> As I mentioned, I am flexible on the matter of has_many. If you want,
> I can create an additional config option (you call it 'auto_by' etc,
> I think) and put it into the other class.

Borrowing from some of your configuration terminology, we could call it 'reverse_auto', 'reverse_lazy' and 'reverse_manual'. That might be more clear. And for removes we could use 'manual_forget' and 'auto_forget' in place of my 'manual_null' and 'auto_null' if you prefer. Again, having more than one way to do it seems like a bad idea to me.
(More on this below.)

> Links_to is harder, because to preserve backward compatibility it has
> to stay in one of the edge classes. But if you want to add more
> variables to the linking class, you'll have to define it for SPOPS
> anyway, and there I can probably also implement your suggestion as an
> option.

I suppose if you want to enhance the existing links_to syntax you are correct. (More below on why I wouldn't do this.)

Why do you include both 'link_class' and 'link_class_alias'? Aren't they redundant? (see [1] below). And I suppose the 'table' is only necessary if you don't specify the 'link_class' and vice versa, right?

After looking over your configuration syntax I started trying to convert my examples to your syntax to see if all the bases are covered. Here are some random comments, based on my understanding of what you're doing:

* for both forward and reverse direction, auto-remove is not possible without auto-fetch (missing 'manual' mode)
* for both forward and reverse direction, auto/lazy-fetch looks fine (except for the default behavior of save)
* forward direction manual/auto remove looks fine
* reverse direction auto/forget remove looks fine, but I don't see an equivalent to manual_null (manual_forget)
* is the use of 'alias' and 'reverse_alias' consistent with the way alias is normally used in SPOPS? (see [1] below)
* didn't see any mention of the 'name' option for explicitly specifying the name of generated methods
* not clear to me what auto/lazy fetching, auto removing, etc. options are implemented for links_to

In conclusion, here are a few of my perspectives. I've been trying to keep an open mind toward the differences between your implementation and my proposed design, adopting a wait-and-see approach. I hesitate to state my opinions too strongly since you're contributing useful code and I'm just talking about design. On the other hand, I suppose you probably want honest input.

I think that both links_to and has_many share a fundamental flaw.
They both allow relationships to be defined from the "other" end, making it possible to define conflicting behavior for a single relationship. This unnecessarily complicates the error checking or precedence rules and the documentation. I think always forcing the relationship to be defined by the class with the linking field, and having a single syntax (has_a) for defining it, makes for a much simpler, cleaner, more consistent, easier to understand/document/implement, less error-prone design. Including both approaches (e.g. has_many and reverse_auto or auto_by), in my opinion, only complicates and confuses things further.

For this reason, I think we should forget trying to make the new syntax backward compatible. I think we should leave the old syntax in place (but deprecated) as long as necessary for backward compatibility and add a completely redesigned new syntax that folks can migrate to gradually or as they need the new features. If it were up to me I wouldn't touch links_to at all, and I wouldn't touch the old has_a either. I'd just add a new configuration handler for the case where the key of a 'has_a' spec matches one of the field names, in which case you assume it is the new syntax I proposed; otherwise you assume it's a class and use the old has_a semantics. I don't know if Chris has enough spare cycles to give an opinion on this at the moment, but I'd be interested in his perspective.

While I appreciate all your work, I have to say that the above issues are leading me to a growing preference for my original proposed design.

Ray Zimmerman
Director, Laboratory for Experimental Economics and Decision Research
428-B Phillips Hall, Cornell University, Ithaca, NY 14853
phone: (607) 255-9645

[1] I confess I never really did understand the purpose of the alias. What is the difference between the alias and the class? Isn't one of them redundant?
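The disambiguation rule Ray proposes (a has_a key matching a field name signals the new syntax; anything else is treated as a class name under the old semantics) could be sketched side by side as follows. Every key, class name, and option value here is a hypothetical illustration, not the actual SPOPS configuration vocabulary:

```perl
# Illustrative contrast between the old and proposed has_a specs.
# All names are hypothetical.

# Old syntax: keyed by the related class name. The configuration
# handler sees a key that is not a field of this class and applies
# the old semantics.
has_a => {
    'My::Account' => 'account_id',
},

# Proposed syntax: keyed by the local field name. The handler sees
# a key matching one of this class's fields and applies the new
# per-field semantics instead.
has_a => {
    account => { class => 'My::Account', fetch => 'lazy' },
},
```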