Re: [Modeling-users] Pythonic, and non-XML, Model description

Hello,

thanks for the detailed feedback, and I like most of it very much.
I comment on each point (needing comment) below...

...

>> The behaviour in general would stay the same as the XML way of
>> doing things, namely from this model you would generate the classes
>> and the db schemas.
>
> These features are available for any Modeling.Model object, so the
> "only" thing to do is build it from the py-model, just like it goes
> for the xml.
>
> (same for generating an xml-model, thus mapping a py-model to an
>  xml-model will be straightforward --cf.
>  ModelSet.getXMLDOMForModelNamed() and getXMLStreamForModelNamed())

Yes, should be the same idea, except that there'd be no need to have
methods with "forModelNamed" in their name, as this would be implied by
the object on which the method is called:

modelset.getXML()
modelset.getXMLDOM()
model.getXML()
entity.getXML()
...

The ModelSet concept is not yet included in the PyModel description
(mostly because it is not fully clear to me how it is used). How would
you extend this description to include it?

>> Some differences are choice of default values, and that key and
>> locking attributes are defined on attributes and relations rather
>> than entities ;)
>
> Ok, I guess we won't argue any further on this by now ;)

But, see how it makes things easier! You declare a key on an att, then
you use the att, which also defines the key on the entity using it --
this keeps things consistent, and less wordy. If you want an id to be a
key in one entity but not in another, then you are forced to define 2
different id 'types', which helps make the difference obvious to anyone
looking in.

>> As an exercise, I have taken the sample model in the distribution:
>> testPackages/StoreEmployees/model_StoreEmployees.xml and re-expressed
>> it in this pythonic way.
>
> Thanks, this made things even clearer.
>>
>> Apart from readability and losing of some verbosity, other gains are
>> "executability" and dynamicity -- it could be very easy to modify a
>> model at runtime, if you would ever want to do that.
>
> Oh oh, I can't believe you said that! Models' mutability at runtime is
> another subject we'd better discuss apart from this thread. It *is*
> possible, under some conditions I certainly need to gather before
> answering. If you need this sort of feature in short-term, you'd
> better picked me on wildly!!

In general this is not what anyone would want to do, and I was trying
to think of times when this would be useful. Can't really, but i
*suspect* it might come in handy. But, in fact, imagine some
application that is stocking lots of incoming unknown data into a db...
let's say the data is XML, but each XML data-schema is unknown a
priori. You can create a py model to correspond, at runtime (but saved
for subsequent accessing), create the tables, and stock the unknown XML
in the db directly as real db tables....
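
As a rough illustration of that idea, here is a minimal sketch that
derives an entity description from one unknown XML record, using only
the standard library. The function name describe_entity and the dict
layout it returns are hypothetical and not part of Modeling or PyModel;
turning the result into real Entity objects and db tables is left out.

```python
import xml.etree.ElementTree as ET

def describe_entity(xml_text):
    """Guess an entity name and its attribute names from one XML record."""
    root = ET.fromstring(xml_text)
    names = []
    for child in root:
        # keep one attribute per distinct child tag, in first-seen order
        if child.tag not in names:
            names.append(child.tag)
    return {'entity': root.tag, 'attributes': names}

record = '<Book><title>Dune</title><author>F. Herbert</author></Book>'
description = describe_entity(record)
```

The resulting description could then be saved, so that the same py model
is found again the next time data of that shape comes in.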

>>  Also, given that a model now becomes a module in and of itself, it
>> gains from what modules have to offer, such as self-tests.
>
> Ok.
>
>> And, speed of course, as i see no reason why this model object would
>> also not serve as the model object in memory at runtime (but that's
>> for Sebastien to confirm) -- thus loading the model is equivalent to
>> "from MyModel import model".
>
> Well, it will need a few more lines of code to actually load the model
> into the so-called defaultModelSet (cf. Modeling.ModelSet and the
> generated __init__.py), but that's it.

Yes, but i was leaving all that to you ;)
But, details of how such a py model would integrate with the rest of
the framework are all to be decided, and please point out any problems
that only you can foresee!

>> Can you please take a look at the 2 linked files below (as temporary
>> URLs, as they are too big to attach);
>> - PyModel.py -- defines (the signatures of) the classes for, and
>> documents the rules for, a PyModel instance
>> - sample_PyModel.py -- re-expresses the StoreEmployees model
>
> Ok, now let's take a look at the big stuff ;)
>
> Your python code is sound and clear. My first comments are:

A general comment to the comments below -- i do not like 'special
format strings' as values. They add possible errors, require special
documentation, make checking more difficult, and are in general less
pythonic... I think keeping the info units separate would be easier to
work with, and clearer. The overhead of having to specify the same
parameters many times over can be reduced by standard defaults and the
possibility to set defaults only for this model (see new sample...)
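
A minimal sketch of how such layered defaults might work, assuming each
description class carries a 'defaults' dict that can be overridden for
one model. The constructor logic and the default values shown
('varchar', 255, ...) are assumptions made for illustration, not the
actual PyModel code.

```python
class Attribute(object):
    # standard defaults, shared by every attribute unless overridden
    defaults = {'isRequired': 0, 'isClassProperty': 1}

    def __init__(self, name, **kw):
        self.name = name
        settings = {}
        # walk the class hierarchy from base to most derived, so that
        # subclass defaults win over base class defaults
        for klass in reversed(type(self).__mro__):
            settings.update(vars(klass).get('defaults', {}))
        settings.update(kw)  # explicit keyword arguments win over defaults
        for key, value in settings.items():
            setattr(self, key, value)

class AString(Attribute):
    defaults = {'dbType': 'varchar', 'width': 255}  # assumed values

# model-specific override, set once at the top of the model module
AString.defaults['width'] = 20

title = AString('title')
long_title = AString('longTitle', width=80)
```

With this, a value is only ever written where it differs from the
default, and each info unit stays a separate, checkable parameter.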

>   - maybe we'll gain more readability if relationships' multiplicity
>     lower- and upper bounds were encoded in strings, such as: '0-1',
>     '1-1', '0-*' or '2-16'.

Yes, but not as strings. What is wrong with a python list?
multiplicity = [0,1]
multiplicity = [1,-1]
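
To make the list form concrete, here is a minimal sketch of how such a
pair might be checked, assuming that -1 in the upper slot stands for
"no upper bound" (the '*' of the string form). The names UNBOUNDED,
check_multiplicity and is_to_many are hypothetical.

```python
UNBOUNDED = -1  # assumption: -1 as upper bound means "no upper bound"

def check_multiplicity(multiplicity):
    """Return (lower, upper) if the pair is well formed, else raise ValueError."""
    if len(multiplicity) != 2:
        raise ValueError('multiplicity needs exactly two bounds')
    lower, upper = multiplicity
    if lower < 0:
        raise ValueError('lower bound must be >= 0')
    if upper != UNBOUNDED and upper < max(lower, 1):
        raise ValueError('upper bound must be >= lower bound and >= 1')
    return lower, upper

def is_to_many(multiplicity):
    """True when more than one related object is allowed."""
    lower, upper = check_multiplicity(multiplicity)
    return upper == UNBOUNDED or upper > 1
```

No string parsing is needed, and a malformed value fails with a plain
ValueError at model definition time.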

>   - Same for external type's width, precision and scale, such as in:
>     'NUMERIC(12,2)' or 'VARCHAR(200)'

Same thing here. This would also make it difficult to provide defaults
separately for dbtype, and for width or precision.
So, I would vote to keep them separate (and to change the name of
externalType to dbType). I think it will be easier to work with,
and clearer for everyone.
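
A minimal sketch of the other direction, assuming the pieces are kept
separate in the model: the combined string is then trivial to build
when the db schema is generated. The helper name column_type is
hypothetical.

```python
def column_type(dbType, width=None, precision=None, scale=None):
    """Build a column type such as 'VARCHAR(200)' from its separate parts."""
    name = dbType.upper()
    if width is not None:
        return '%s(%d)' % (name, width)
    if precision is not None and scale is not None:
        return '%s(%d,%d)' % (name, precision, scale)
    if precision is not None:
        return '%s(%d)' % (name, precision)
    return name
```

For example, column_type('numeric', precision=12, scale=2) gives
'NUMERIC(12,2)', while each part can still receive its own default.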

>   - Again, same for an entity's parent which could be specified along
>     with the entity's name: 'Executive(Employee)'

Same. I vote to keep separate, and to rename parentEntity to isAlso,
e.g.
Entity('Executive', isAlso='Employee', ...

>   - What about the possibility to add a '*' to an attribute's name
>     when it's required? (may be same for a relationship, equivalent to
>     lower bound==1)

Same objection for adding '*' to the att name, but definitely yes to
having constraints forced down as a result of relations, e.g. if a
one-to-one relation is required, then the related attributes must also
be required -- but this is automatic and taken into account by the
validation.
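
A minimal sketch of that propagation step, assuming attributes and
relations are held as plain dicts for the sake of the example. The
function name propagate_required and the data layout are hypothetical;
the real validation would work on the model objects.

```python
def propagate_required(attributes, relations):
    """Mark the source attribute of every required relation as required."""
    for rel in relations:
        lower_bound = rel['multiplicity'][0]
        if lower_bound >= 1:
            attributes[rel['sourceAttribute']]['isRequired'] = 1
    return attributes

attributes = {'fkStoreId': {'isRequired': 0}, 'fkBossId': {'isRequired': 0}}
relations = [
    {'name': 'toStore', 'multiplicity': [1, 1], 'sourceAttribute': 'fkStoreId'},
    {'name': 'toBoss', 'multiplicity': [0, 1], 'sourceAttribute': 'fkBossId'},
]
propagate_required(attributes, relations)
```

Only the attribute behind the required relation ends up required; the
optional one is left untouched.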

>   - Allow literals instead of the equivalent integers in the xml:
>     'CASCADE' for the delete rule is more explicit than int(2)!

Definitely. Also, in lowercase! (Hate being screamed at, which is what
uppercase seems to be always doing :-)
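
A minimal sketch of such a mapping, kept in one place so that the
integers never show up in a model. Only the value 2 for 'cascade' comes
from this thread; the value given for 'deny' is a placeholder
assumption and would have to be checked against the codes used in the
xml. The name delete_rule_code is hypothetical.

```python
DELETE_RULES = {
    'cascade': 2,  # from the thread: 'CASCADE' corresponds to int(2)
    'deny': 1,     # assumption: placeholder value, check the real xml codes
}

def delete_rule_code(rule):
    """Translate a lowercase delete rule literal into its integer code."""
    try:
        return DELETE_RULES[rule]
    except KeyError:
        raise ValueError('unknown delete rule: %r' % (rule,))
```

An unknown or uppercase literal is rejected right away instead of being
silently stored.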

>   - I think I would have made 'name' instead of 'columnName' the
>     default for displayLabel :)

Yes, it is probably better.

> Now that I get used to the idea of a python-model, I'm also thinking
> of some extensions to your proposal:
>
>   - We could have subclasses for your Attribute: PrimaryKey, ForeignKey
>     (defaults for both would be: int, not class property, etc.), String
>     (with a default external size/width), Integer, Numeric, ... You
>     get the idea.

Great! There could be some standard sub-classes, but a user is
of course allowed to make his own. What about the naming scheme proposed
by the example below, that Att subs start with the letter 'A' and rel
subs start with the letter 'R'? (to avoid unnecessarily long names)

>   - Relation can also be sub-classed to 'ToOne' and 'ToMany' (default
>     would be '0-*' for the latter)

Yes. Also, all defaults may be set for a specific model as necessary...

> Last, I'm thinking of some automatic processing which is already coded
> in the zmodeler and that could be done at model-time to reduce
> verbosity in a significant manner:
>
>   - have a primary key 'id' automatically declared if not set,

In general "magical" behaviour is more trouble than gain... how about if
we have the possibility to define a default attribute on an Entity
description class?
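
A minimal sketch of what that could look like, assuming attributes are
plain dicts for the sake of the example: the default 'id' is visible
and editable in one place, and an entity that declares its own 'id'
simply overrides it. The constructor logic is an assumption, not the
PyModel code.

```python
class Entity(object):
    # attributes that every entity receives unless it declares its own
    defaults = {'attributes': [{'name': 'id', 'key': 1, 'isRequired': 1}]}

    def __init__(self, name, attributes=()):
        self.name = name
        declared = [dict(att) for att in attributes]
        declared_names = set(att['name'] for att in declared)
        # copy each default so entities never share one mutable dict
        inherited = [dict(att) for att in self.defaults['attributes']
                     if att['name'] not in declared_names]
        self.attributes = inherited + declared

store = Entity('Store', [{'name': 'corporateName'}])
legacy = Entity('Legacy', [{'name': 'id', 'key': 0}])
```

Here 'Store' picks up the default 'id', while 'Legacy' keeps only the
'id' it declared itself.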

>   - have foreign keys automatically set for relationships, using the
>     same defaults the zmodeler already uses (e.g. FKEmployeeId for a
>     to-many relationship pointing to the entity Employee). It would
>     need some additional checks but it's definitely possible.

Again, I like simplifying the management of all this, but i do not like
being forced to accept decisions imposed by the framework, such as the
names of my columns (what if I want to provide a model for an existing
db?) So, providing a default scheme, that the user may redefine (or
not use at all), seems more reasonable to me. See the defaults for
sourceAttribute and destinationAttribute in RToOne and RToMany... Also,
see the explicit declaration of the foreign keys (with the possibility
of automatically generating the name for it).
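
A minimal sketch of such a redefinable naming scheme, assuming a
default may be either a fixed string or a callable that computes the
name from the pieces of the relation. The names att_name_from_rel and
resolve_default are hypothetical, and so is the rule that the count is
only appended when it is non-zero.

```python
def att_name_from_rel(dest_entity, dest_att='id', count=0):
    """Build a default foreign key name such as 'fkStoreId'."""
    name = 'fk' + dest_entity + dest_att[:1].upper() + dest_att[1:]
    # assumption: the first relation gets no suffix, later ones get 1, 2, ...
    if count:
        name += str(count)
    return name

def resolve_default(default, *pieces):
    """A default is either a fixed value or a callable that computes one."""
    if callable(default):
        return default(*pieces)
    return default
```

A user modelling an existing db would replace the callable with his own
function, or with a plain string, without touching the framework.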

> This could be as handy as the feature you submitted in your proposal:
> automatic definition of parent properties which are not overridden in
> sub-entities.

Yes. Auto generation of foreign keys from a relationship definition does
make sense in fact... but the possibility to declare them explicitly
(with the desired values for their attributes) should always be there.
Since such explicit declaration can be very simple, I prefer to keep it
as something that must be explicit.

> I wrote all these items with something in mind, actually. You know, I'm
> basically lazy, most of the time I do not care about DB-Schemas
> details, and I will be delighted if it was possible to write the same
> StoreEmployee model really simply. This is a good illustration of the
> automatic processing I described above because this model (like every
> test model) was designed using the zmodeler and its functionalities.
>
>   In fact, I'm thinking that something simple like that could be
> written:
> (this would imply some changes to your proposed API, such as making
>  'destinationEntity' the second argument for the Relationship's
>  initializer).

Ah, very good -- destinationEntity must in fact be specified every time,
so this is better this way.

I have taken your simplified model, and evolved it to correspond to my
comments above. Correspondingly, i have also updated the PyModel
module to indicate how it would change to support these changes.
I have put the two files (PyModel_3_mr.py and sample_ PyModel_3_mr.py) at:
http://ruggier.dyndns.org:8080/PyModel/
But for convenience, I am also pasting below the sample model...

Oh, and some other changes I have not mentioned here are listed in
the top of PyModel_3_mr.py, but mostly name changes, and having
entities have only one list, that mixes attributes and relations...
(your opinion?).

> Of course this would need some additional cpu-time to load a model,
> but I bet it would be far quicker than parsing the xml!
>
>   This of course still needs to be discussed and refined. I did not
> have any time to try & implement my proposal, but I guess that at some
> point of the discussion we'll need to see the words take shape --at
> this point this could naturally be made in a dev-branch if several of
> us are on this.

OK.

>   I really have the feeling that we will succeed in designing a very
>   nice python model. Let's go for it now that Mario brought some light
>   on the path!

That would not be XPath, would it ;-?

Cheers, mario

ps: sample_ PyModel_3_mr.py
'''
A sample Pythonic OO-RDB Model
(re-expressing testPackages/StoreEmployees/model_StoreEmployees.xml) --
A PyModel must define a global variable called 'model', that is of type Model
'''

from PyModel import *

##
# Set preferred defaults for this model (when different from
# standard defaults, or if we want to make things explicit)

AInteger.defaults['precision'] = 10
AString.defaults['width'] = 20

RToOne.defaults['delete'] = 'cascade'
RToOne.defaults['multiplicity'] = [0,1]
RToOne.defaults['sourceAttribute'] = RToOne.attNameFromRel # 'fk'+destEnt+destAtt+count
RToOne.defaults['destinationAttribute'] = 'id'

RToMany.defaults['delete'] = 'deny'
RToMany.defaults['multiplicity'] = [0,-1] # -1 stands for '*' (assumed: no upper bound)
RToMany.defaults['sourceAttribute'] = 'id'
RToMany.defaults['destinationAttribute'] = RToMany.attNameFromRel # 'fk'+destEnt+sourceAtt+count

# Note that Relation.attNameFromRel is a callable, that calculates the
# att name from the indicated pieces (where count is used to distinguish
# between multiple relations between the same source and target)

Entity.defaults['attributes'] = [
     AInteger('id', key=1, isClassProperty=0, isRequired=1)
]

##

_connDict = {}
model = Model('StoreEmployees',connDict=_connDict)
model.entities = [

     #
     Entity('Store',
         attributes=[
             AString('corporateName', isRequired=1),
             RToMany('employees', 'Employee')
         ]
     ),

     # Employee and its subclasses SalesClerk and Executive
     Entity('Employee',
         attributes=[
             AString('lastName', isRequired=1, usedForLocking=1),
             AString('firstName', isRequired=1, width=50, usedForLocking=1),
             AForeignKey(Attribute.fk('Store'), isClassProperty=0),
             RToMany('toAddresses', 'Address', delete='cascade'),
             RToOne('toStore', 'Store')
         ]
     ),

     Entity('SalesClerk', isAlso='Employee',
         attributes=[
             AString('storeArea')
         ]
     ),

     Entity('Executive', isAlso='Employee',
         attributes=[
             AString('officeLocation', width=5),
             RToMany('marks', 'Mark', delete='cascade')
         ]
     ),

     #
     Entity('Address',
         attributes=[
             AString('street', width=80),
             AString('zipCode'),
             AString('town', width=80),
             AForeignKey(Attribute.fk('Employee'), isClassProperty=1),
             RToMany('toEmployee', 'Employee', delete='deny')
         ]
     ),

     #
     Entity('Mark',
         attributes=[
             AInteger('month', isRequired=1),
             AInteger('mark', isRequired=1),
             AForeignKey(Attribute.fk('Executive'), isClassProperty=1),
             RToOne('executive', 'Executive')
         ]
     )

]

if __name__ == '__main__':
     print(model.validate())
     print(model.toXML())
     # plus whatever ...

##