Re: [Modeling-users] Pythonic, and non-XML, Model description
From: Mario R. <ma...@ru...> - 2003-03-03 00:40:33
Hi,

Thanks for the ModelSet clarification. Further comments below...

> About: Models' being mutable at runtime
>> In general this is not what anyone would want to do, and I was trying
>> to think of times when this would be useful. Can't really, but I
>> *suspect* it might come in handy. But, in fact, imagine some
>> application that is stocking lots of incoming unknown data into a
>> db... let's say the data is XML, but each XML data-schema is unknown a
>> priori. You can create a py model to correspond, at runtime (but saved
>> for subsequent accessing), create the tables, and stock the unknown
>> XML in the db directly as real db tables....
>
> Okay, I now can be more precise here: adding an entity at runtime is
> definitely not a problem. But mutating an entity (changing its
> attributes for example, or its name), or removing an entity should
> *not* be done after an object of the corresponding class has been
> registered within an EditingContext.
>
> (this should be a FAQ)

OK, understood, and probably a good idea to mention it as an FAQ.

...

>> What is wrong with a python list?
>> multiplicity = [0,1]
>> multiplicity = [1,-1]
>
> Well, I understand your point quite well and I'm ok with a list. [1,-1]
> seems however obscure, if we go for a python list what do you think of
> [0,'*'] or maybe better: [0,] ?

Absolutely fine with me (used -1 to keep it as an integer.) I think I
prefer the None version of this, which could also be explicit, i.e.
[1,None]

>>> - Same for external type's width, precision and scale, such as in:
>>>   'NUMERIC(12,2)' or 'VARCHAR(200)'
>>
>> Same thing here. This would also make it difficult to provide defaults
>> separately for dbtype, and for width or precision.
>
> I disagree: default would be treated exactly the same way: if the width
> or precision/scale cannot be found in the parsed string then the
> defaults apply.

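To make the combined-string idea concrete, here is a minimal sketch of how a spec such as 'NUMERIC(12,2)' could be split into a type name plus qualifiers, with defaults applying when a qualifier is absent. The helper name parse_db_type and the DEFAULTS table are hypothetical, for illustration only; nothing like this exists in the framework as far as this thread shows.

```python
import re

# Hypothetical per-type defaults, for illustration only.
DEFAULTS = {
    'VARCHAR': {'width': 20},
    'NUMERIC': {'precision': 10, 'scale': 0},
}

# Matches NAME, NAME(n) or NAME(n,m), tolerating surrounding whitespace.
_TYPE_RE = re.compile(r'^\s*(\w+)\s*(?:\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\))?\s*$')


def parse_db_type(spec):
    """Split a type spec into (type_name, params); defaults fill the gaps."""
    match = _TYPE_RE.match(spec)
    if match is None:
        raise ValueError('cannot parse db type: %r' % (spec,))
    name = match.group(1).upper()
    first, second = match.group(2), match.group(3)
    params = dict(DEFAULTS.get(name, {}))  # start from the defaults
    if name in ('NUMERIC', 'DECIMAL'):
        if first is not None:
            params['precision'] = int(first)
        if second is not None:
            params['scale'] = int(second)
    else:
        if second is not None:
            raise ValueError('%s takes a single width qualifier' % name)
        if first is not None:
            params['width'] = int(first)
    return name, params
```

With this shape the stored type name and the qualifiers are kept separately after parsing, which is the behaviour the reply that follows is asking about.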
Ah, so you mean that assignment will just map such an initial value
onto the concerned properties, and then work with the separated values
from then on. Well, OK. But it may lead to confusion if you ask for the
value of type, and get something other than what you specified (e.g.
only 'NUMERIC' without the qualifiers).

>> So, I would vote to keep them separate (and to change the name of
>> externalType to dbType). I think it will be easier to work with,
>> and clearer for everyone.
>
> I'd really like to keep the former possibility, though. It can co-exist
> with yours, I've no problem with that.
>
> I won't however go crazy if we do not support it ! I suggest we forget
> about my proposal for 'VARCHAR(20)' and that we only consider explicit
> parameters width/precision/scale. Since it's not a hard thing to add,
> we can safely forget it for the moment and add it later if this reveals
> to be a users' request.

I would agree with proceeding this way.

> Now that I think of it, there's an underlying point that should be
> discussed for clarification. For example a string/VARCHAR won't get
> any defaults in the conversion process, unless explicitly stated in
> the model itself (yours AString.default['width']) --same for NUMERIC,
> etc. This way it makes it clear that a String requires a width, that
> a float requires precision and scale, etc., since we would get an
> error if neither a default nor an explicit parameter is provided.

Yes, this syntax will make that association clearer. But the validation
should also point out if these are missing.

> What do you think about this? In fact, I do not even know what a good
> default for width, scale or precision can be. Different DB have
> different defaults, and I don't think it's worth registering/tracking
> them.

Have no idea what reasonable defaults should be... I feel their main
purpose is to make things less error-prone, especially in the beginning.
Programs should probably specify them anyway, when they know what is
best for the program (and should probably not rely on default values
set by the framework).

>>> - Again, same for an entity's parent which could be specified
>>>   along with the entity's name: 'Executive(Employee)'
>>
>> Same. I vote to keep separate, and to rename parentEntity to isAlso,
>> e.g. Entity('Executive', isAlso='Employee', ...
>
> I know this is a detail, but I would have spontaneously shortened it to
> 'parent'. 'isAlso' vaguely recalls me something, but I don't know
> what... It sounds familiar. Is this UML jargon?

Not UML -- it is just what I think of (isAlso) when I think of
inheritance... But parent is just fine.

>>> - What about the possibility to add a '*' to an attribute's name
>>>   when it's required? (may be same for a relationship, equivalent
>>>   to lower bound==1)
>>
>> Same for adding '*' to att name
>
> Ok, again, let's forget it, I added it for completeness but didn't
> really like it either. The explicit parameter 'required' is enough.
>
> Speaking of this term, maybe it's not clear enough: I observed people
> being surprised that '' (empty string) is a valid value for a field
> that is required because they misunderstood it, while it only tells
> whether the attribute can be None (python) / NULL (SQL). In the
> Attribute API there's also the counterpart 'allowsNone', what do you
> think?

Hmmn, I feel isRequired is more logical, while allowsNone is more
implementation-oriented -- but quite clear. The problem with it is that
it is a double negative (required => allowsNone = No) which I always
find irritating. Plus, should a formal expression of what values are
allowed for an attribute one day also be specified in the model, then
isRequired will lose the confusion potential you mention, e.g. if the
model supports a regexp for an attribute (which all values must
match) then isRequired will clearly mean that a value respecting this
constraint must be present.
Which is a good opportunity to propose the feature request of being
able to specify a regexp for an attribute, against which the framework
will validate all values to be assigned to the attribute (and which
does not exclude providing other declarations to use for data
validation for any attribute, such as type).

...

>> See the defaults for sourceAttribute and destinationAttribute in
>> RToOne and RToMany... Also, see the explicit declaration of the
>> foreign keys (with the possibility of automating the name for it).
>
> I'm afraid the naming scheme you propose won't make it. In fact, I was
> thinking of automating the simple case, where only one association
> exists between two entities. In this case, the foreign key only needs
> to be named 'fk<destinationEntityName>' ; this makes it easy for one
> relationship and its possible inverse to actually refer to the same
> source/destination attributes in their joins.
>
> Now a more complex example: suppose you want to model something like
> this:
>
>   A <------->> B
>     <------->>
>
> (this comes from a "real-life" example, two relationships from A to B
> having a different semantics)
>
> Suppose we first process A's relationships: this will create two
> different FK in B pointing to A's PK. Now we process entity B: there is
> no way to identify which relationship coming from A is the inverse of a
> toOne from B. Hence, this can be automated *if and only if* we also
> have the keyword 'inverse' in the declaration of a relationship.
>
> BTW, this is where I think automation will not be a real gain and
> will not be worth the effort, because that sort of model is really not
> showing up frequently, and because when it shows up you usually want
> to have names for foreign keys more explicit than
> fk<destinationEntityName><count>
>
> (You stated:
> fk<destinationEntityName><destinationAttributeName><count> but since
> the destAttr. will always be the same for a given dest.
> entity (the PK), I removed it here)
>
>>> This could be as handy as the feature you submitted in your proposal:
>>> automatic definition of parent properties which are not overridden
>>> in sub-entities.
>>
>> Yes. Auto generation of foreign keys from a relationship definition
>> does make sense in fact... but the possibility to declare them
>> explicitly (with the desired values for their attributes) should
>> always be there. Since such explicit declaration can be very simple,
>> I prefer to leave it as must be explicit.
>
> My opinion here is:
>
> - make it possible to automate the simple case (PK and FK), as
>   explained above,
>
> - always keep the possibility to explicitly write every single
>   properties in models, entities, etc.

Here I think the second point is more important than the first, and
will not limit unnecessarily what the framework can handle. The first
point is nice, maybe, but since these would be so simple to declare,
there is not so much to gain.

> You might think that I'm definitely reluctant to impose explicit
> declarations (such as for the FK). I think I am but it's not a
> religion ;)

Well, I can understand that in an ideal world the "db details" are all
handled automatically, and indeed you cannot get to them. But unless
full automation can be guaranteed, then going this route may cause more
problems than it solves. And it will not handle existing databases.
"Optional automation" is OK, but I feel one should always be able to
override it.

On the same line, I also feel that an object id should optionally be
manually set by the client code. In 90% (or more) of the time you just
want it automatic, but sometimes you may want to control it (for db
optimization or whatever other reasons).

> In fact, it's really a matter of usual practices in E-R modeling.
> This is not only /my/ practice, but the way most people seem to deal
> with entity-relationship modeling --they do not care about the
> db-schema PK/FK details unless they need to, for example when defining
> a model for an existing model, or when some complex design is required
> (just like the one I described above).
>
> Obviously this observation was not made on that framework (there are
> still not enough people using it for such an observation), but it is
> what I observed from the community using the original Apple's
> Enterprise Object Framework(tm).
>
> I'm possibly wrong and maybe I just think that people do not care
> because most of the times *I* usually don't care. In the ZModeler I
> added the ability to make the two relationships necessary for a
> one-to-many association at a single-click reach for that very reason:
> it's a valuable feature to my own eyes, and I think it may be for
> other's.

I agree that *most* of the time you want all these things to be managed
automatically. But, again, as long as full automation cannot be
guaranteed, one should always allow the possibility to do whatever the
db allows you to, as long as one takes responsibility for it. This is
also similar to python's way of doing things (that of not "forcing"
that variables are private for example) and has turned out to be one of
the strong points of the language (that whatever unforeseen situation
may arise, you are not locked in).

> Last comment on this: I do not really like the idea of imposing the
> FK to be declared while having a implicit default for the PK, it
> sounds inconsistent to me.

Yes, here I agree. I am fine with not requiring that FKs are declared
explicitly, and the framework handles their definition automatically.
But the framework should also recognize if the FKs are declared and
create them accordingly.

...

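As a rough illustration of "optional automation with explicit override", the sketch below derives foreign key names for the relations of one entity using the 'fk<destinationEntityName>' scheme from the discussion, appends a count only when several relations point at the same destination, and lets an explicitly declared name win. The function foreign_key_names, its argument shapes, and the choice to start the count at 1 are assumptions made for this example, not framework API.

```python
def foreign_key_names(relations, explicit=None):
    """Return {relation_name: fk_name} for one source entity.

    relations: list of (relation_name, destination_entity_name) pairs.
    explicit:  optional {relation_name: fk_name} declared by hand;
               these always take precedence over generated names.
    """
    explicit = explicit or {}

    # How many relations target each destination entity?
    totals = {}
    for _, dest in relations:
        totals[dest] = totals.get(dest, 0) + 1

    names = {}
    seen = {}  # running count per destination, for generated names only
    for rel, dest in relations:
        if rel in explicit:
            names[rel] = explicit[rel]          # hand-written name wins
        elif totals[dest] == 1:
            names[rel] = 'fk' + dest            # the simple, common case
        else:
            seen[dest] = seen.get(dest, 0) + 1  # assumed: count starts at 1
            names[rel] = 'fk%s%d' % (dest, seen[dest])
    return names
```

For a single relation such as ('toStore', 'Store') this yields 'fkStore'; for two relations to the same entity it yields counted names, which is exactly the case where a hand-written name (or an 'inverse' keyword) is likely to be preferable.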
>> Oh, and some other changes I have not mentioned here are listed in
>> the top of PyModel_3_mr.py, but mostly name changes, and having
>> entities have only one list, that mixes attributes and relations...
>> (your opinion?).
>
> * deleteRule -> delete: ok
>
> * multiplicity lower/upper bound -> multiplicity as a python list: ok
>
> * entity only has attributes that may also be relations: I'm ok for the
>   principle, however I would prefer a distinct term for this,
>   relations being attributes could be a source of confusion. Maybe
>   'properties' ?

OK for me.

>   More comment on this: having attributes and relations mixed in a
>   common list will require extra-checks when analysing the model to
>   perform the possibly needed automated steps. This is because we
>   will need to have all the entities and their attributes loaded and
>   processed before we can make a decision on whether a declared
>   relation needs automatic generation of FK. We will need to separate
>   attributes from relations anyway.

When processing atts and rels, the logic would be to assign them to 2
different arrays, and work with those arrays.

>   I do not mean this is complex: it's not. However I'm not sure this
>   enhance readability. In the sample model your clearly first declared
>   attributes, then relationships, and I bet everyone will do it that
>   way, because it would be a real mess to have some attrs, then rels,
>   then attrs, etc.

Well, some may prefer to group in some other logical way than by
type... But my main reason for this is that having to deal with extra
list brackets inside the Entity constructor may be avoided, making the
code a little more readable.

> * misc.: at some point in the discussion the FK were made class
>   properties in the py-model, but they should not!

Ooops, mistake.

> Regarding PyModel_3_mr.py:
>
> - I'm not sure we want a default for entity.adaptorName

You mean model? OK, no problem.

> - I'd add a APrimaryKey subclass of Attribute

OK.

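The "two arrays" idea for a mixed 'properties' list can be sketched as follows. The Attribute and Relation classes here are bare stand-ins written for this example (the real PyModel classes and their hierarchy may differ), and split_properties is a hypothetical helper name.

```python
class Attribute(object):
    """Stand-in for a plain attribute declaration (illustration only)."""
    def __init__(self, name):
        self.name = name


class Relation(object):
    """Stand-in for a relationship declaration (illustration only)."""
    def __init__(self, name, destination):
        self.name = name
        self.destination = destination


def split_properties(properties):
    """Partition a mixed list into (attributes, relations), keeping order."""
    attributes, relations = [], []
    for prop in properties:
        if isinstance(prop, Relation):
            relations.append(prop)
        elif isinstance(prop, Attribute):
            attributes.append(prop)
        else:
            raise TypeError('not a property: %r' % (prop,))
    return attributes, relations


# Declarations may be interleaved; the split restores the two groups.
mixed = [Attribute('lastName'), Relation('toStore', 'Store'),
         Attribute('firstName')]
atts, rels = split_properties(mixed)
```

Once split, any step that needs all relations (such as deciding whether a foreign key must be generated) can work on the second list alone.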
The updated sample (4) is below, but you can also get to it, along with
the new PyModel version (4) from http://ruggier.dyndns.org:8080/PyModel/

Cheers, mario


'''
A sample Pythonic OO-RDB Model
(re-expressing testPackages/StoreEmployees/model_StoreEmployees.xml) --
A PyModel must define a global variable called 'model', that is of type Model
'''
from PyModel import *

##
# Set preferred defaults for this model (when different from
# standard defaults, or if we want to make things explicit)

AInteger.defaults['precision'] = 10
AString.defaults['width'] = 20

RToOne.defaults['delete'] = 'cascade'
RToOne.defaults['multiplicity'] = [0,1]
RToOne.defaults['sourceAttribute'] = RToOne.attNameFromRel # 'fk'+destEnt+destAtt+count
RToOne.defaults['destinationAttribute'] = 'id'

RToMany.defaults['delete'] = 'deny'
RToMany.defaults['multiplicity'] = [0,None]
RToMany.defaults['sourceAttribute'] = 'id'
RToMany.defaults['destinationAttribute'] = RToMany.attNameFromRel # fk+destEnt+sourceAtt+count

# Note that Relation.attNameFromRel is a callable, that calculates the att name
# from the indicated pieces (where count is there to distinguish between
# multiple relations between the same source and target)

Entity.defaults['properties'] = [
    APrimaryKey('id', isClassProperty=0, isRequired=1)
]

##

_connDict = {}
model = Model('StoreEmployees',connDict=_connDict)
model.entities = [
    #
    Entity('Store',
        properties=[
            AString('corporateName', isRequired=1),
            RToMany('employees', 'Employee')
        ]
    ),
    # Employee and its subclasses SalesClerk and Executive
    Entity('Employee',
        properties=[
            AString('lastName', isRequired=1, usedForLocking=1),
            AString('firstName', isRequired=1, width=50, usedForLocking=1),
            AForeignKey(Attribute.fk('Store'), isClassProperty=0),
            RToMany('toAddresses', 'Address', delete='cascade'),
            RToOne('toStore', 'Store')
        ]
    ),
    Entity('SalesClerk', parent='Employee',
        properties=[
            AString('storeArea')
        ]
    ),
    Entity('Executive', parent='Employee',
        properties=[
            AString('officeLocation', width=5),
            RToMany('marks', 'Mark', delete='cascade')
        ]
    ),
    #
    Entity('Address',
        properties=[
            AString('street', width=80),
            AString('zipCode'),
            AString('town', width=80),
            AForeignKey(Attribute.fk('Employee'), isClassProperty=0),
            RToMany('toEmployee', 'Employee', delete='deny')
        ]
    ),
    #
    Entity('Mark',
        properties=[
            AInteger('month', isRequired=1),
            AInteger('mark', isRequired=1),
            AForeignKey(Attribute.fk('Executive'), isClassProperty=0),
            RToOne('executive', 'Executive')
        ]
    )
]

if __name__ == '__main__':
    print model.validate()
    print model.toXML()
    # plus whatever ...

##