Re: [Modeling-users] Pythonic, and non-XML, Model description

        Hi,

> > (same for generating an xml-model, thus mapping a py-model to an
> > xml-model will be straightforward --cf.
> > ModelSet.getXMLDOMForModelNamed() and getXMLStreamForModelNamed())
>
> Yes, should be the same idea, except that there'd be no need to have
> methods with "forModelNamed" in their name, as this would be implied
> by the object on which the method is called:
>
> modelset.getXML()
> modelset.getXMLDOM()
> model.getXML()
> entity.getXML()
> ...

Yes, we mean the same thing here. I just wanted to point out where
it was implemented. And that was silly of me: I should have pointed to
the existing methods Model.getXMLDOM() and .saveModelAsXMLFile() instead.

> The ModelSet concept is not yet included in the PyModel description
> (mostly because it is not fully clear to me how it is used).

I'm afraid there's no real documentation on that. Let's go for a short
one: the framework can only deal with one ModelSet at a time, the one
returned by defaultModelSet() in module ModelSet. The module & class
main responsibilities are:

  1. making sure that all the entities defined in a ModelSet instance have
     distinct names: a ModelSet identifies its models and entities by name.
     Hence, the object returned by defaultModelSet() can be asked at
     runtime for models and entities, given their names

  2. registering the object returned by defaultModelSet() as the default
     receiver for the notification
     ClassDescriptionNeededForEntityNameNotification posted by
     classDescriptionForName() in module ClassDescription.

(I added this to the module's docstring)
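
The first responsibility above (distinct names, and lookup by name) can
be sketched as follows. This is only an illustration under assumptions:
ModelSetSketch, add_model() and model_for_entity() are hypothetical names
and do not reflect the framework's actual API.

```python
class DuplicateNameError(ValueError):
    """Raised when a model or entity name is already registered."""


class ModelSetSketch:
    """Hypothetical stand-in for a model set; not the framework's class."""

    def __init__(self):
        self._models = {}    # model name -> list of its entity names
        self._entities = {}  # entity name -> name of the owning model

    def add_model(self, model_name, entity_names):
        """Register a model, refusing any name clash across the whole set."""
        if model_name in self._models:
            raise DuplicateNameError("model %r already registered" % model_name)
        entity_names = list(entity_names)
        seen = set()
        # Check every name before changing anything, so that a failed
        # registration leaves the set untouched.
        for name in entity_names:
            if name in self._entities or name in seen:
                raise DuplicateNameError("entity %r already registered" % name)
            seen.add(name)
        self._models[model_name] = entity_names
        for name in entity_names:
            self._entities[name] = model_name

    def model_for_entity(self, entity_name):
        """Return the name of the model owning the entity, or None."""
        return self._entities.get(entity_name)
```

Because names are unique across the whole set, an entity name alone is
enough to find its model at runtime.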

> How would you extend this description to include it?

I won't ;! Since the framework only deals with one ModelSet at a time, I
can't see any reason to add it in the model (there's no model set in the
xml model either).

About: key&locking attrs defined on atts and rels rather than entities
> [...] But, see how it makes things easier [and] consistent [...]

I saw ;) and I'm convinced.

About: Models' being mutable at runtime
> In general this is not what anyone would want to do, and I was trying
> to think of times when this would be useful. Can't really, but i
> *suspect* it might come in handy. But, in fact, imagine some
> application that is stocking lots of incoming unknown data into a
> db... let's say the data is XML, but each XML data-schema is unknown a
> priori. You can create a py model to correspond, at runtime (but saved
> for subsequent accessing), create the tables, and stock the unknown
> XML in the db directly as real db tables....

Okay, I now can be more precise here: adding an entity at runtime is
definitely not a problem. But mutating an entity (changing its
attributes for example, or its name), or removing an entity should *not*
be done after an object of the corresponding class has been registered
within an EditingContext.

(this should be a FAQ)

> >> And, speed of course, as i see no reason why this model object
> >> would also not serve as the model object in memory at runtime (but
> >> that's for Sebastien to confirm) -- thus loading the model is
> >> equivalent to "from MyModel import model".
> >
> > Well, it will need a few more lines of code to actually load the
> > model into the so-called defaultModelSet (cf. Modeling.ModelSet and
> > the generated __init__.py), but that's it.
>
> Yes, but i was leaving all that to you ;)

No problem!

> But, details of how such a py model would integrate with the rest of
> the framework are all to be decided, and please point out any problems
> that only you can foresee!

Integrating a py-model is just a matter of transforming a pymodel into a
Modeling.Model. Hence, all possible problems are really conversion
problems.  And I'll keep my eyes open!

> in general i do not like 'special format strings' as values to
> things. This adds possible errors, requires special documentation,
> makes checking more difficult, and in general is less pythonic... I
> think keeping the info units separate would be easier to work with,
> and clearer. The overhead of having to specify the same parameters
> many times over can be reduced by standard defaults and the
> possibility to set defaults only for this model (see new sample...)
>
> >   - maybe we'll gain more readability if relationships' multiplicity
> >     lower- and upper bounds were encoded in strings, such as: '0-1',
> >     '1-1', '0-*' or '2-16'.
>
> Yes, but not as strings. What is wrong with a python list?
> multiplicity = [0,1]
> multiplicity = [1,-1]

Well, I understand your point quite well and I'm ok with a list. However,
[1,-1] seems obscure; if we go for a python list, what do you think of
[0,'*'] or maybe better: [0,]  ?
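
To make the candidate spellings concrete, here is a minimal sketch of how
they could all be reduced to one internal form. The function name
normalize_multiplicity() and the choice of None for "no upper bound" are
assumptions made for the illustration, not something decided in this
thread.

```python
def normalize_multiplicity(value):
    """Return (lower, upper), where upper is None when unbounded.

    Accepts the spellings discussed above: [0, 1], [1, -1], [0, '*']
    and the one-element list [0] (which is what [0,] evaluates to).
    """
    items = list(value)
    if len(items) == 1:
        # Only a lower bound was given: no upper bound.
        lower, upper = items[0], None
    elif len(items) == 2:
        lower, upper = items
        if upper == -1 or upper == "*":
            upper = None
    else:
        raise ValueError("expected one or two items, got %r" % (value,))
    if not isinstance(lower, int) or lower < 0:
        raise ValueError("invalid lower bound: %r" % (lower,))
    if upper is not None and (not isinstance(upper, int) or upper < lower):
        raise ValueError("invalid upper bound: %r" % (upper,))
    return lower, upper
```

With this, [1,-1], [1,'*'] and [1,] all mean the same thing, so the
choice between them is purely a matter of readability in the model.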

> >   - Same for external type's width, precision and scale, such as in:
> >     'NUMERIC(12,2)' or 'VARCHAR(200)'
>
> Same thing here. This would also make it difficult to provide defaults
> separately for dbtype, and for width or precision.

I disagree: defaults would be treated exactly the same way: if the width
or precision/scale cannot be found in the parsed string then the defaults
apply.

> So, I would vote to keep them separate (and to change the name of
> externalType to dbType). I think it will be easier to work with,
> and clearer for everyone.

I'd really like to keep the former possibility, though. It can co-exist
with yours, I've no problem with that.

I won't go crazy if we do not support it, however! I suggest we forget
about my proposal for 'VARCHAR(20)' and only consider the explicit
parameters width/precision/scale. Since it's not hard to add, we can
safely forget it for the moment and add it later if users turn out to
ask for it.

  Now that I think of it, there's an underlying point that should be
  discussed for clarification. For example a string/VARCHAR won't get
  any defaults in the conversion process, unless explicitly stated in
  the model itself (your AString.default['width']) --same for NUMERIC,
  etc. This way it makes it clear that a String requires a width, that a
  float requires precision and scale, etc., since we would get an
  error if neither a default nor an explicit parameter is provided.

  What do you think about this? In fact, I do not even know what a good
  default for width, scale or precision can be. Different DBs have
  different defaults, and I don't think it's worth registering/tracking
  them.
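
  A minimal sketch of that rule: an explicit parameter wins, then a
  default declared in the model, and otherwise the conversion fails. The
  names resolve_params(), REQUIRED_PARAMS and MissingParameterError are
  hypothetical and only serve the illustration.

```python
class MissingParameterError(ValueError):
    """Neither an explicit value nor a model-level default was given."""


# Parameters each db type needs, as stated above: a string needs a
# width, a numeric needs precision and scale. (Illustrative table only.)
REQUIRED_PARAMS = {
    "VARCHAR": ("width",),
    "NUMERIC": ("precision", "scale"),
}


def resolve_params(db_type, explicit=None, model_defaults=None):
    """Resolve the required parameters for db_type.

    Lookup order: explicit value, then model-level default. There is
    deliberately no framework-wide fallback, so an omission is an error.
    """
    explicit = explicit or {}
    model_defaults = model_defaults or {}
    resolved = {}
    for name in REQUIRED_PARAMS.get(db_type, ()):
        if name in explicit:
            resolved[name] = explicit[name]
        elif name in model_defaults:
            resolved[name] = model_defaults[name]
        else:
            raise MissingParameterError(
                "%s requires %r: give it explicitly or set a default "
                "in the model" % (db_type, name))
    return resolved
```

  The error message itself then documents what each type requires.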

> >   - Again, same for an entity's parent which could be specified
> >     along with the entity's name: 'Executive(Employee)'
>
> Same. I vote to keep separate, and to rename parentEntity to isAlso,
> e.g.  Entity('Executive', isAlso='Employee', ...

I know this is a detail, but I would have spontaneously shortened it to
'parent'. 'isAlso' vaguely reminds me of something, but I don't know
what... It sounds familiar. Is this UML jargon?

> >   - What about the possibility to add a '*' to an attribute's name
> >     when it's required? (may be same for a relationship, equivalent
> >     to lower bound==1)
>
> Same for adding '*' to att name

Ok, again, let's forget it, I added it for completeness but didn't
really like it either. The explicit parameter 'required' is enough.

Speaking of this term, maybe it's not clear enough: I have seen people
surprised that '' (empty string) is a valid value for a required field,
because they misunderstood the term: it only tells whether the attribute
can be None (python) / NULL (SQL). In the Attribute API there's also the
counterpart 'allowsNone', what do you think?
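
The distinction fits in a few lines. This is a sketch only:
check_required() is a hypothetical helper, not the framework's
validation code.

```python
def check_required(value, required):
    """Return True when value satisfies the 'required' constraint.

    'required' only forbids None (NULL in SQL); it says nothing about
    empty strings, zero, or any other "empty-looking" value.
    """
    return not (required and value is None)


# An empty string is a value, so it passes even when required.
empty_string_ok = check_required("", required=True)
# None is rejected only when the attribute is required.
none_rejected = not check_required(None, required=True)
none_allowed = check_required(None, required=False)
```

Rejecting '' would be a separate constraint on the value itself, which
is why a name like 'allowsNone' may be harder to misread.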

> but definitely yes to have constraints forced down as a result from
> relations, e.g. if a one-to-one relation is required, then the related
> attributes must also be required -- but this is automatic and taken
> into account by the validation.

Currently designing one-to-one is not possible, but we can automatically
make a choice on which entity the foreign key should be dropped in.  For
one-to-many this is straightforward (almost, see below comments on FK
names)

> >   - Allow literals instead of the equivalent integers in the xml:
> >     'CASCADE' for the delete rule is more explicit than int(2)!
>
> Definitely. Also, in lowercase! (Hate being screamed at, which is what
> uppercase seems to be always doing  \-)

This is an old habit for declaring constants that I inherited from the
years when I was a C programmer! So lowercase, ok.

> > Now that I get used to the idea of a python-model, I'm also thinking
> > of some extents to your proposal:
> >
> >   - We could have subclasses for your Attribute: PrimaryKey, ForeignKey
> >     (defaults for both would be: int, not class property, etc.), String
> >     (with a default external size/width), Integer, Numeric, ... You
> >     get the idea.
> >
>
> Great! There could be some standard sub-classes, but a user is
> of course allowed to make his own. What about the naming scheme proposed
> by the example below, that Att subs start with the letter 'A' and rel
> subs start with the letter 'R' ? (to avoid unnecessarily long names)

I easily got used to it; if there are no other objections let's keep it.

> > Last, I'm thinking of some automatic processing which is already
> > coded in the zmodeler and that could be done at model-time to reduce
> > verbosity in a significant manner:
> >
> >   - have a primary key 'id' automatically declared if not set,
>
> In general "magical" behaviour is more trouble than gain... how about
> if we have the possibility to define a default attribute on an Entity
> description class?

Fine, we then have standard defaults along with the ability to set
specific defaults for a model, all in one. +1 on this, definitely.

> >   - have foreign keys automatically set for relationships, using the
> >   same defaults the zmodeler already uses (e.g. FKEmployeeId for a
> >   to-many relationship pointing to the entity Employee). It would
> >   need some additional checks but it's definitely possible.
>
> Again, I like simplifying the management of all this, but i do not
> like being forced to accept decisions imposed by the framework, such
> as the names of my columns (what if I want to provide a model for an
> existing db?)
> So, providing a default scheme, that the user may redefine (or not
> use at all), seems more reasonable to me.

Ok -- what I wrote was not clear enough, I meant that too.

> See the defaults for sourceAttribute and destinationAttribute in
> RToOne and RToMany... Also, see the explicit declaration of the
> foreign keys (with the possibility of automating the name for it).

I'm afraid the naming scheme you propose won't make it. In fact, I was
thinking of automating the simple case, where only one association
exists between two entities. In this case, the foreign key only needs to
be named 'fk<destinationEntityName>' ; this makes it easy for one
relationship and its possible inverse to actually refer to the same
source/destination attributes in their joins.

Now a more complex example: suppose you want to model something like
this:

      A <------->> B
        <------->>

  (this comes from a "real-life" example, two relationships from A to B
   having a different semantics)

Suppose we first process A's relationships: this will create two
different FK in B pointing to A's PK. Now we process entity B: there is
no way to identify which relationship coming from A is the inverse of a
toOne from B. Hence, this can be automated *if and only if* we also have
the keyword 'inverse' in the declaration of a relationship.

  BTW, this is where I think automation will not be a real gain and will
  not be worth the effort, because that sort of model is really not
  showing up frequently, and because when it shows up you usually want
  to have names for foreign keys more explicit than
  fk<destinationEntityName><count>

  (You stated:
   fk<destinationEntityName><destinationAttributeName><count> but since
   the destAttr. will always be the same for a given dest. entity (the
   PK), I removed it here)
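
  To make the two naming cases concrete, here is a small sketch. It is an
  illustration under assumptions: default_fk_names() is a hypothetical
  helper, and starting the count at 1 is my choice, nothing in the
  proposal fixes it.

```python
from collections import Counter


def default_fk_names(destination_entities):
    """Return one default FK name per relationship, in the given order.

    destination_entities lists, for a single entity holding the foreign
    keys, the name of the entity each FK points to. A destination that
    appears once gets 'fk<destinationEntityName>'; one that appears
    several times gets 'fk<destinationEntityName><count>'.
    """
    totals = Counter(destination_entities)
    seen = Counter()
    names = []
    for dest in destination_entities:
        if totals[dest] == 1:
            names.append("fk" + dest)
        else:
            seen[dest] += 1
            names.append("fk%s%d" % (dest, seen[dest]))
    return names
```

  For the A/B example above, B holds two FKs pointing to A and the names
  come out as fkA1 and fkA2. These names say nothing about which
  relationship each one serves, which is exactly why the 'inverse'
  keyword, or explicit names, would be needed there.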

> > This could be as handy as the feature you submitted in your proposal:
> > automatic definition of parent properties which are not overridden in
> > sub-entities.
>
> Yes. Auto generation of foreign keys from a relationship definition does
> make sense in fact... but the possibility to declare them explicitly
> (with the desired values for their attributes) should always be there.
> Since such explicit declaration can be very simple, I prefer to leave it
> as must be explicit.

My opinion here is:

  - make it possible to automate the simple case (PK and FK), as
    explained above,

  - always keep the possibility to explicitly write every single
    property in models, entities, etc.

You might think that I'm definitely reluctant to impose explicit
declarations (such as for the FK). I think I am but it's not a
religion ;)

In fact, it's really a matter of usual practices in E-R modeling. This
is not only /my/ practice, but the way most people seem to deal with
entity-relationship modeling --they do not care about the db-schema
PK/FK details unless they need to, for example when defining a model for
an existing database, or when some complex design is required (just like
the one I described above).

  Obviously this observation was not made on that framework (there are
still not enough people using it for such an observation), but it is
what I observed from the community using the original Apple's Enterprise
Object Framework(tm).

I'm possibly wrong and maybe I just think that people do not care
because most of the time *I* don't care. In the ZModeler I put the
creation of the two relationships needed for a one-to-many association
within a single click's reach for that very reason: it's a valuable
feature in my own eyes, and I think it may be for others.

  Last comment on this: I do not really like the idea of imposing the FK
  to be declared while having an implicit default for the PK, it sounds
  inconsistent to me.

Please do not misunderstand what I'm saying here, my English is not good
enough to be subtle. I'm just trying to make my opinions and positions
clearer by being more explicit :)

> I have taken your simplified model, and evolved it to correspond to my
> comments above. Correspondingly, i have also updated the PyModel
> module to indicate how it would change to support these changes.
> [...]
> Oh, and some other changes I have not mentioned here are listed in
> the top of PyModel_3_mr.py, but mostly name changes, and having
> entities have only one list, that mixes attributes and relations...
> (your opinion?).

* deleteRule -> delete: ok

* multiplicity lower/upper bound -> multiplicity as a python list: ok

* entity only has attributes that may also be relations: I'm ok with the
  principle, however I would prefer a distinct term for this; relations
  being attributes could be a source of confusion. Maybe 'properties' ?

  More comments on this: having attributes and relations mixed in a
    common list will require extra checks when analysing the model to
    perform the possibly needed automated steps. This is because we will
    need to have all the entities and their attributes loaded and
    processed before we can make a decision on whether a declared
    relation needs automatic generation of FK. We will need to separate
    attributes from relations anyway.

  I do not mean this is complex: it's not. However I'm not sure this
  enhances readability. In the sample model you clearly declared
  attributes first, then relationships, and I bet everyone will do it
  that way, because it would be a real mess to have some attrs, then
  rels, then attrs, etc.

* misc.: at some point in the discussion the FK were made class
         properties in the py-model, but they should not!

Regarding PyModel_3_mr.py:

  - I'm not sure we want a default for entity.adaptorName

  - I'd add an APrimaryKey subclass of Attribute

> >   I really have the feeling that we will succeed in designing a very
> > nice python model. Let's go for it now that Mario brought some light
> > on the path!
>
> That would not be XPath, would it ;-?

Ooooh no, it wouldn't ;!

        Cheers,

-- Sébastien.