Re: [Modeling-users] Pythonic, and non-XML, Model description
From: Sebastien B. <sbi...@us...> - 2003-03-01 16:00:48
Hi,

> > (same for generating an xml-model, thus mapping a py-model to an
> > xml-model will be straightforward --cf.
> > ModelSet.getXMLDOMForModelNamed() and getXMLStreamForModelNamed())
>
> Yes, should be the same idea, except that there'd be no need to have
> methods with "forModelNamed" in their name, as this would be implied
> by the object on which the method is called:
>
>   modelset.getXML()
>   modelset.getXMLDOM()
>   model.getXML()
>   entity.getXML()
>   ...

Yes, we mean the same thing here. I just wanted to point out where it
was implemented. And it was stupid of me: I should have pointed to the
existing methods Model.getXMLDOM() and .saveModelAsXMLFile() instead.

> The ModelSet concept is not yet included in the PyModel description
> (mostly because it is not fully clear to me how it is used).

I'm afraid there's no real documentation on that. Let's go for a short
one: the framework can only deal with one ModelSet at a time, the one
returned by defaultModelSet() in module ModelSet. The main
responsibilities of the module and class are:

  1. making sure that all the entities defined in a ModelSet instance
     have distinct names: a ModelSet identifies its models and entities
     by name. Hence, the object returned by defaultModelSet() can be
     asked at runtime for models and entities, given their names

  2. registering the object returned by defaultModelSet() as the
     default receiver for the notification
     ClassDescriptionNeededForEntityNameNotification posted by
     classDescriptionForName() in module ClassDescription.

(I added this to the module's docstring)

> How would you extend this description to include it?

I won't ;! Since the framework only deals with one ModelSet at a time,
I can't see any reason to add it to the model (there's no model set in
the xml model either).

About: key & locking attrs defined on atts and rels rather than entities

> [...] But, see how it makes things easier [and] consistent [...]

I saw ;) and I'm convinced.
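To make the first ModelSet responsibility described above concrete (distinct entity names, lookup by name), here is a minimal sketch. It is not the framework's code: the classes and the snake_case names (add_model, model_named, model_for_entity_named, default_model_set) are hypothetical stand-ins chosen for the illustration, and the notification registration of point 2 is left out.

```python
# Hypothetical stand-ins: these are NOT the Modeling framework's classes,
# only an illustration of the "distinct names, lookup by name" rule.
class Model:
    def __init__(self, name, entity_names):
        self.name = name
        self.entity_names = list(entity_names)


class ModelSet:
    def __init__(self):
        self._models = {}         # model name -> Model
        self._entity_owner = {}   # entity name -> owning Model

    def add_model(self, model):
        if model.name in self._models:
            raise ValueError("duplicate model name: %r" % model.name)
        # Check every entity name before registering anything, so that a
        # rejected model leaves the set untouched.
        seen = set()
        for entity_name in model.entity_names:
            if entity_name in self._entity_owner or entity_name in seen:
                raise ValueError("duplicate entity name: %r" % entity_name)
            seen.add(entity_name)
        self._models[model.name] = model
        for entity_name in model.entity_names:
            self._entity_owner[entity_name] = model

    def model_named(self, name):
        return self._models.get(name)

    def model_for_entity_named(self, entity_name):
        return self._entity_owner.get(entity_name)


_default_model_set = ModelSet()


def default_model_set():
    # One shared instance, mirroring "one ModelSet at a time".
    return _default_model_set
```

With something like this, a second model that declares an entity named Employee is refused as a whole, which is what lets entities be looked up by name alone.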
About: Models being mutable at runtime

> In general this is not what anyone would want to do, and I was trying
> to think of times when this would be useful. Can't really, but i
> *suspect* it might come in handy. But, in fact, imagine some
> application that is storing lots of incoming unknown data into a
> db... let's say the data is XML, but each XML data-schema is unknown a
> priori. You can create a py model to correspond, at runtime (but saved
> for subsequent accessing), create the tables, and store the unknown
> XML in the db directly as real db tables....

Okay, I can now be more precise here: adding an entity at runtime is
definitely not a problem. But mutating an entity (changing its
attributes or its name, for example), or removing an entity, should
*not* be done after an object of the corresponding class has been
registered within an EditingContext. (this should be a FAQ)

> >> And, speed of course, as i see no reason why this model object
> >> would also not serve as the model object in memory at runtime (but
> >> that's for Sebastien to confirm) -- thus loading the model is
> >> equivalent to "from MyModel import model".
> >
> > Well, it will need a few more lines of code to actually load the
> > model into the so-called defaultModelSet (cf. Modeling.ModelSet and
> > the generated __init__.py), but that's it.
>
> Yes, but i was leaving all that to you ;)

No problem!

> But, details of how such a py model would integrate with the rest of
> the framework are all to be decided, and please point out any problems
> that only you can foresee!

Integrating a py-model is just a matter of transforming a pymodel into
a Modeling.Model. Hence, all possible problems are really conversion
problems. And I'm keeping my eyes open!

> in general i do not like 'special format strings' as values to
> things. This adds possible errors, requires special documentation,
> makes checking more difficult, and in general is less pythonic...
> I think keeping the info units separate would be easier to work with,
> and clearer. The overhead of having to specify the same parameters
> many times over can be reduced by standard defaults and the
> possibility to set defaults only for this model (see new sample...)
>
> > - maybe we'll gain more readability if relationships' multiplicity
> >   lower and upper bounds were encoded in strings, such as: '0-1',
> >   '1-1', '0-*' or '2-16'.
>
> Yes, but not as strings. What is wrong with a python list?
>   multiplicity = [0,1]
>   multiplicity = [1,-1]

Well, I understand your point quite well and I'm ok with a list.
[1,-1] seems obscure, however; if we go for a python list, what do you
think of [0,'*'] or maybe better: [0,] ?

> > - Same for external type's width, precision and scale, such as in:
> >   'NUMERIC(12,2)' or 'VARCHAR(200)'
>
> Same thing here. This would also make it difficult to provide defaults
> separately for dbtype, and for width or precision.

I disagree: defaults would be treated exactly the same way: if the
width or precision/scale cannot be found in the parsed string then the
defaults apply.

> So, I would vote to keep them separate (and to change the name of
> externalType to dbType). I think it will be easier to work with,
> and clearer for everyone.

I'd really like to keep the former possibility, though. It can
co-exist with yours, I've no problem with that. I won't go crazy if we
do not support it, however!

I suggest we forget about my proposal for 'VARCHAR(20)' and that we
only consider explicit parameters width/precision/scale. Since it's
not a hard thing to add, we can safely forget it for the moment and
add it later if it turns out to be a users' request.

Now that I think of it, there's an underlying point that should be
discussed for clarification. For example a string/VARCHAR won't get
any defaults in the conversion process, unless explicitly stated in
the model itself (your AString.default['width']) --same for NUMERIC,
etc.
This way it makes it clear that a String requires a width, that a
float requires precision and scale, etc., since we would get an error
if neither a default nor an explicit parameter is provided. What do
you think about this? In fact, I do not even know what a good default
for width, scale or precision could be. Different DBs have different
defaults, and I don't think it's worth registering/tracking them.

> > - Again, same for an entity's parent which could be specified
> >   along with the entity's name: 'Executive(Employee)'
>
> Same. I vote to keep separate, and to rename parentEntity to isAlso,
> e.g. Entity('Executive', isAlso='Employee', ...

I know this is a detail, but I would have spontaneously shortened it
to 'parent'. 'isAlso' vaguely reminds me of something, but I don't
know what... It sounds familiar. Is this UML jargon?

> > - What about the possibility to add a '*' to an attribute's name
> >   when it's required? (may be same for a relationship, equivalent
> >   to lower bound==1)
>
> Same for adding '*' to att name

Ok, again, let's forget it, I added it for completeness but didn't
really like it either. The explicit parameter 'required' is enough.

Speaking of this term, maybe it's not clear enough: I have seen people
surprised that '' (empty string) is a valid value for a field that is
required. They misunderstood the term: it only tells whether the
attribute can be None (python) / NULL (SQL). In the Attribute API
there's also the counterpart 'allowsNone', what do you think?

> but definitely yes to have constraints forced down as a result from
> relations, e.g. if a one-to-one relation is required, then the related
> attributes must also be required -- but this is automatic and taken
> into account by the validation.

Currently designing one-to-one is not possible, but we can
automatically make a choice on which entity the foreign key should be
dropped in.
For one-to-many this is straightforward (almost, see the comments on
FK names below).

> > - Allow literals instead of the equivalent integers in the xml:
> >   'CASCADE' for the delete rule is more explicit than int(2)!
>
> Definitely. Also, in lowercase! (Hate being screamed at, which is what
> uppercase seems to be always doing \-)

This is an old habit for declaring constants that I inherited from the
years when I was a C programmer! So lowercase, ok.

> > Now that I get used to the idea of a python-model, I'm also thinking
> > of some extensions to your proposal:
> >
> > - We could have subclasses for your Attribute: PrimaryKey, ForeignKey
> >   (defaults for both would be: int, not class property, etc.), String
> >   (with a default external size/width), Integer, Numeric, ... You
> >   get the idea.
>
> Great! There could be some standard sub-classes, but a user is
> of course allowed to make his own. What about the naming scheme
> proposed by the example below, that Att subs start with the letter 'A'
> and rel subs start with the letter 'R' ? (to avoid unnecessarily long
> names)

I easily got used to it; if there are no other objections let's keep
it.

> > Last, I'm thinking of some automatic processing which is already
> > coded in the zmodeler and that could be done at model-time to reduce
> > verbosity in a significant manner:
> >
> > - have a primary key 'id' automatically declared if not set,
>
> In general "magical" behaviour is more trouble than gain... how about
> if we have the possibility to define a default attribute on an Entity
> description class?

Fine, we then have standard defaults along with the ability to specify
specific defaults for a model, all in one. +1 on this, definitely.

> > - have foreign keys automatically set for relationships, using the
> >   same defaults the zmodeler already uses (e.g. FKEmployeeId for a
> >   to-many relationship pointing to the entity Employee).
> >   It would need some additional checks but it's definitely possible.
>
> Again, I like simplifying the management of all this, but i do not
> like being forced to accept decisions imposed by the framework, such
> as the names of my columns (what if I want to provide a model for an
> existing db?)
> So, providing a default scheme, that the user may redefine (or not
> use at all), seems more reasonable to me.

Ok -- what I wrote was not clear enough, I meant that too.

> See the defaults for sourceAttribute and destinationAttribute in
> RToOne and RToMany... Also, see the explicit declaration of the
> foreign keys (with the possibility of automating the name for it).

I'm afraid the naming scheme you propose won't make it. In fact, I was
thinking of automating the simple case, where only one association
exists between two entities. In this case, the foreign key only needs
to be named 'fk<destinationEntityName>'; this makes it easy for one
relationship and its possible inverse to actually refer to the same
source/destination attributes in their joins.

Now a more complex example: suppose you want to model something like
this:

  A <------->> B
    <------->>

(this comes from a "real-life" example, two relationships from A to B
having different semantics)

Suppose we first process A's relationships: this will create two
different FKs in B pointing to A's PK. Now we process entity B: there
is no way to identify which relationship coming from A is the inverse
of a toOne from B. Hence, this can be automated *if and only if* we
also have the keyword 'inverse' in the declaration of a relationship.

BTW, this is where I think automation will not be a real gain and will
not be worth the effort, because that sort of model does not show up
frequently, and because when it shows up you usually want to have
names for foreign keys more explicit than
fk<destinationEntityName><count>

(You stated: fk<destinationEntityName><destinationAttributeName><count>
but since the destAttr.
will always be the same for a given dest. entity (the PK), I removed
it here)

> > This could be as handy as the feature you submitted in your proposal:
> > automatic definition of parent properties which are not overridden in
> > sub-entities.
>
> Yes. Auto generation of foreign keys from a relationship definition
> does make sense in fact... but the possibility to declare them
> explicitly (with the desired values for their attributes) should
> always be there. Since such explicit declaration can be very simple,
> I prefer to leave it as must be explicit.

My opinion here is:

  - make it possible to automate the simple case (PK and FK), as
    explained above,

  - always keep the possibility to explicitly write every single
    property in models, entities, etc.

You might think that I'm definitely reluctant to impose explicit
declarations (such as for the FK). I think I am but it's not a
religion ;)

In fact, it's really a matter of usual practices in E-R modeling. This
is not only /my/ practice, but the way most people seem to deal with
entity-relationship modeling --they do not care about the db-schema
PK/FK details unless they need to, for example when defining a model
for an existing db, or when some complex design is required (just like
the one I described above).

Obviously this observation was not made on this framework (there are
still not enough people using it for such an observation), but it is
what I observed in the community using Apple's original Enterprise
Objects Framework(tm). I'm possibly wrong and maybe I just think that
people do not care because most of the time *I* don't care.

In the ZModeler I added the ability to create the two relationships
necessary for a one-to-many association in a single click for that
very reason: it's a valuable feature in my eyes, and I think it may be
in others'.
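For what it's worth, the "automate only the simple case" position above can be sketched in a few lines. This is an illustration under assumptions, not what the framework or the ZModeler does: the function name default_fk_names, the tuple layout and the entity name Store are hypothetical. The only rules taken from the discussion are the 'fk<destinationEntityName>' default when a single association exists between two entities, and the fact that an explicit name always wins.

```python
from collections import Counter


def default_fk_names(to_one_rels):
    """Pick a foreign-key name for each to-one relationship.

    to_one_rels: list of (source_entity, rel_name, dest_entity, explicit_fk)
    tuples, where explicit_fk is None when the user gave no name.
    Returns a dict {(source_entity, rel_name): fk_name}.
    """
    # How many relationships link each (source, destination) pair?
    pair_counts = Counter((src, dest) for src, _rel, dest, _fk in to_one_rels)
    names = {}
    for src, rel, dest, explicit_fk in to_one_rels:
        if explicit_fk is not None:
            # An explicit declaration is always honoured.
            names[(src, rel)] = explicit_fk
        elif pair_counts[(src, dest)] == 1:
            # Simple case: a single association between the two entities.
            names[(src, rel)] = "fk" + dest
        else:
            # Several associations between the same two entities: refuse
            # to guess and ask for explicit names instead.
            raise ValueError(
                "%s.%s: several relationships to %s, "
                "the foreign key must be named explicitly" % (src, rel, dest)
            )
    return names
```

In the A/B example with two relationships, both foreign keys then have to be named by hand, which is also the case where one usually wants meaningful names anyway.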
Last comment on this: I do not really like the idea of requiring the
FK to be declared while having an implicit default for the PK, it
sounds inconsistent to me.

Please do not misunderstand what I'm saying here, my English is not
good enough to be subtle. I'm just trying to make my opinions and
positions clearer by being more explicit :)

> I have taken your simplified model, and evolved it to correspond to my
> comments above. Correspondingly, i have also updated the PyModel
> module to indicate how it would change to support these changes.
> [...]
> Oh, and some other changes I have not mentioned here are listed in
> the top of PyModel_3_mr.py, but mostly name changes, and having
> entities have only one list, that mixes attributes and relations...
> (your opinion?).

* deleteRule -> delete: ok

* multiplicity lower/upper bound -> multiplicity as a python list: ok

* entity only has attributes that may also be relations: I'm ok with
  the principle, however i would prefer a distinct term for this,
  relations being attributes could be a source of confusion. Maybe
  'properties' ?

  More comments on this: having attributes and relations mixed in a
  common list will require extra checks when analysing the model to
  perform the possibly needed automated steps. This is because we will
  need to have all the entities and their attributes loaded and
  processed before we can make a decision on whether a declared
  relation needs automatic generation of FK. We will need to separate
  attributes from relations anyway.

  I do not mean this is complex: it's not. However I'm not sure this
  enhances readability. In the sample model you clearly first declared
  attributes, then relationships, and I bet everyone will do it that
  way, because it would be a real mess to have some attrs, then rels,
  then attrs, etc.

* misc.: at some point in the discussion the FKs were made class
  properties in the py-model, but they should not be!
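To illustrate the points in the list above, here is a rough sketch of what a mixed 'properties' list could look like and how it would be split again before any automatic step. Nothing here is PyModel's actual API: the classes Attribute and Relationship, their parameters and the helper split_properties are hypothetical, and [0, '*'] is just one of the multiplicity spellings floated in this thread.

```python
# Hypothetical classes for illustration only, not the PyModel module.
class Attribute:
    def __init__(self, name, required=False):
        self.name = name
        self.required = required


class Relationship:
    def __init__(self, name, destination, multiplicity, delete=None):
        # multiplicity is a two-item list: [lower, upper], where upper
        # may be '*' for "no upper bound".
        self.name = name
        self.destination = destination
        self.lower, self.upper = multiplicity
        self.delete = delete  # lowercase literal such as 'cascade'

    def is_to_many(self):
        return self.upper == "*" or self.upper > 1


def split_properties(properties):
    """Separate a mixed list into (attributes, relationships),
    keeping declaration order within each group."""
    attributes = [p for p in properties if isinstance(p, Attribute)]
    relationships = [p for p in properties if isinstance(p, Relationship)]
    return attributes, relationships


# Attributes and relations declared in one list, in any order.
properties = [
    Attribute("name", required=True),
    Relationship("employees", "Employee", [0, "*"], delete="cascade"),
    Attribute("budget"),
]
attributes, relationships = split_properties(properties)
```

The split itself is trivial, as said above; the cost is only that every entity has to be loaded before deciding whether a relation needs a generated FK.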
Regarding PyModel_3_mr.py:

  - I'm not sure we want a default for entity.adaptorName

  - I'd add an APrimaryKey subclass of Attribute

> > I really have the feeling that we will succeed in designing a very
> > nice python model. Let's go for it now that Mario brought some light
> > on the path!
>
> That would not be XPath, would it ;-?

Ooooh no, it wouldn't ;!

Cheers,

-- Sébastien.