[exprla-devel] Re: [XPL] The structure of classes in XPL


--- In xpl-dev@y..., Jonathan Burns <saski@w...> wrote:
Some commentary on the thread so far:

Alexander Gutman (Sunday 11/06/00) leads off:

At this moment I am thinking about how classes
will be declared in XPL. And I encountered a problem.

AG (if I may) has a <me> element, with <human>,
<mathematician> and <programmer> elements nested within.
And he wants to know how this XPL classic would be
expressed as a class (or strictly, object) in standard
OOP languages.

The obvious way to start off, would be to define a class
for each of the inner elements, and then to define a class
<me>, whose objects contain one instance each of the three,
as private data.

So then the question is, how does he want class <me> to look
from the outside?

Again, the obvious way is to add methods to the <me>
interface: get_human, get_mathematician, get_programmer.
Each of these methods will return a copy of the private
instance of the element. What's nice is that methods
so named can return lists of the desired element; a null
list if the element isn't there in an instance of <me>,
or a list of several, if <me> is a programmer at two
different workplaces.

So it's obviously feasible. As a variation, one can
define class <occupation>, and subclass <mathematician>
and <programmer> from it. Then we could define method:
get_occupation, to return both kinds of element in a
list.
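A minimal Python sketch of that class layout, for anyone who wants to see it spelled out. The field names (`name`, `workplace`) and the underscore-prefixed private lists are my own assumptions for illustration; nothing in AG's post fixes them.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Human:
    name: str


@dataclass
class Occupation:
    workplace: str


class Mathematician(Occupation):
    pass


class Programmer(Occupation):
    pass


@dataclass
class Me:
    # Private data: zero or more instances of each inner element.
    _humans: list = field(default_factory=list)
    _occupations: list = field(default_factory=list)

    def get_human(self):
        # Return copies, so callers cannot alter the private instances.
        return copy.deepcopy(self._humans)

    def get_mathematician(self):
        return copy.deepcopy(
            [o for o in self._occupations if isinstance(o, Mathematician)]
        )

    def get_programmer(self):
        return copy.deepcopy(
            [o for o in self._occupations if isinstance(o, Programmer)]
        )

    def get_occupation(self):
        # Both kinds of occupation, in one list.
        return copy.deepcopy(self._occupations)
```

Each get_ method returns a list, so "absent" is just the empty list and "programmer at two workplaces" is a list of two.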

Richard Hein (11/06/00) replies:

So if you use overloading, you call <human>, [etc] as if
they were different functions in the class?

Answer: Yes. But there's no special need for overloading.
If what we want is to provide special interfaces for <me>,
then the interfaces of classes <human>, <mathematician>
etc are just what we want. OOP is working as advertised.

The "different functions" in the class will just be the
different methods, get_human etc, which extract the desired
instances from an instance of <me>.

Then Kurt Cagle comes in with the OTHER side of the question (11/06/00):

[...] an XML Schema definition is
more like a VB type statement than a Java class -- the schema defines
the type and positional characteristics of an XML structure, but not
any associated methods or events.

However, you could make a superclass
structure that might actually include a schema and an associated
stylesheet with named templates, with templates acting as methods.

The other side of the question is, can we implement some simple
methods using off-the-shelf XML tech?

Answers:

(1) A schema or a DTD for <me> will define the element types
and placement of its private data; e.g. can we have zero or
multiple private instances of <human>? But a schema or a DTD
will not provide us with methods for getting the private
elements out of <me> - or doing anything with them if we could.

(2) XPath will get the inner elements.

child::mathematician

will do that job directly. And other facilities of XPath will
extract elements of much more complicated documents; and
count and compare and filter them besides. The limitation of
XPath is that all these operations are read-only - they only
ever copy data from a document. There is no provision for
binding additional methods to elements on extraction.

But it is worth noting, that if we are content with public
nested state data in our objects, the get_ methods are
built-in with XPath. To that extent, all XML documents with
some defining schema or DTD are objects already.
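To make the "built-in get_ methods" point concrete, here is a small Python sketch using the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The sample <me> document and its <workplace> children are assumptions of mine, not AG's actual data.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance of <me>, made up for this sketch.
ME_XML = """
<me>
  <human><name>AG</name></human>
  <mathematician><workplace>An Institute</workplace></mathematician>
  <programmer><workplace>A Company</workplace></programmer>
  <programmer><workplace>Another Company</workplace></programmer>
</me>
"""

me = ET.fromstring(ME_XML)

# A bare element name is the abbreviated form of a child-axis step,
# so these select the direct children of <me> with that name.
mathematicians = me.findall("mathematician")
programmers = me.findall("programmer")

# Read-only extraction of nested data.
workplaces = [p.findtext("workplace") for p in programmers]

# An element that is not there yields an empty list, not an error.
missing = me.findall("astronaut")
```

Nothing here binds any behaviour to the extracted elements; we only get the data back.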

(3) XSLT will apply template operations to <me>, acting on
any pattern (i.e. XPath expression) which we provide in
an XSL Transform.

This is the means to provide class <me> with a proper
interface. But, it is not directly adapted to providing
the kind of interface with which we are familiar in OOP.

As far as I can make out, an XSL Transform defines one
method, together with its implementation.

The standard OOP interface consists of many methods,
with or without implementations.

This isn't much of a problem, as far as I can see.
We can certainly have a transform which calls a proxy
template - a stub, which can go on to call the real
method implementation. And we can define an XPL class
interface as a collection of transforms.
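A rough Python stand-in for "an interface is a collection of transforms". Python's standard library has no XSLT processor, so plain functions from tree to tree play the part of the transforms here; the method names and the `call` stub are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

# A "transform" takes one tree and returns a new tree.
Transform = Callable[[ET.Element], ET.Element]


def list_workplaces(me: ET.Element) -> ET.Element:
    """Build a new <workplaces> tree from every <workplace> under <me>."""
    out = ET.Element("workplaces")
    for wp in me.iter("workplace"):
        ET.SubElement(out, "workplace").text = wp.text
    return out


def count_occupations(me: ET.Element) -> ET.Element:
    """Build a <count> tree holding the number of occupation elements."""
    out = ET.Element("count")
    total = len(me.findall("mathematician")) + len(me.findall("programmer"))
    out.text = str(total)
    return out


# The class interface for <me>: one transform per method name.
ME_INTERFACE: Dict[str, Transform] = {
    "list_workplaces": list_workplaces,
    "count_occupations": count_occupations,
}


def call(interface: Dict[str, Transform], method: str,
         document: ET.Element) -> ET.Element:
    """Proxy stub: look the method up by name, then run the real transform."""
    return interface[method](document)
```

The stub is where one would hang any checking or redirection before the real implementation runs.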

Kurt goes on to say:

Keep in mind that the one primary difference between XML and
procedural objects is that the XML instance of a schema may not
necessarily be physically connected with the methods that operate on it.

We should thus ask whether XPL works by keeping the XML instance data
of a class as part of the object -- i.e., the XML acts as the local
variables -- or the XML data arrives as input and is processed in real
time against the XPL/XSLT templates (in which case the XPL objects are
methods only).

The former is a statist approach -- the data needs to be retained
between calls, while the latter is stateless.

Well, I'm not necessarily against statist approaches. The data
has to be stored somewhere, after all - it's just a matter of
how far away we want to push the commitment to stored data.

But we will want to provide an indirection mechanism, where the
data for the pattern to be recognized in a transform is not
present in the object, but is assembled on the fly.

It looks as if such a mechanism has to be built into XPath,
though. For our purpose here, XPath needs to recognize:

        <programmer ref=[URI] >
        </programmer>

or some such.
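Since XPath has no such mechanism that I know of, here is a Python sketch of what the indirection might do if done as a separate pass before pattern matching. The `REMOTE` dictionary stands in for whatever the URI would really address, and the `urn:example:` identifier is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical store; a real resolver would fetch the URI instead.
REMOTE = {
    "urn:example:job1":
        "<programmer><workplace>A Company</workplace></programmer>",
}


def resolve(element, fetch=REMOTE.__getitem__):
    """Return a new tree in which each ref-bearing element is replaced
    by the data its reference points to. The input is left untouched."""
    ref = element.get("ref")
    if ref is not None:
        return ET.fromstring(fetch(ref))
    clone = ET.Element(element.tag, element.attrib)
    clone.text = element.text
    for child in element:
        clone.append(resolve(child, fetch))
    return clone
```

The object itself stays small and stateless; the data is assembled only when somebody asks for it.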

I will say, that if we develop XPL as statelessly as possible,
we will wind up in the company of the extreme functional languages,
which have to admit constant data eventually; but by that time,
they have pushed it into an obscure corner of the syntax, which
nobody can quite remember. (E.g., you get a major realm tacked on,
of addresses trying to pretend they are get_ functions and
failing.) Such languages do produce some beautiful dataflow
plumbing, however.

Alexander replies  (12/06/00):

Hmm... Could you clarify this idea?

Somewhat clarified, I hope.

As far as I know, the output of a transformation procedure
is always an XML document/object and, furthermore, we cannot
perform any actions during transformation other than those
which reflect the output (probably, the 'eval' XSL element can
produce some side-effects during transformation, I do not know).

This raises the question, Is XSLT as powerful as we would like?

I'm undecided so far. XSLT provides quite the arsenal of tree-
building operations, and on first glance I can't gauge the
extent of the programming paradigms it supports.

Sometime soon, I'll give you at considerable length the benefits
of my experience with Mathematica, which is based on transformation
rules for tree-structured expressions. It looks very much like
XSLT to me. Mathematica demonstrates that you can do a LOT, just
by munging trees around.

Now my intuition is, when we want to bind "side-effects" to
XSLT, the way to do it is not to embed one dinky external-
language call in every production rule we recognize, but
rather to produce a complete document for whatever we want
an external system to perform. And we pass that document to
a special parser, which has the total API for the external
system bound into it. If we do it that way, then XSLT only
needs to transform trees for us.
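A small Python sketch of that division of labour, under assumptions of my own: the <actions> vocabulary (<write>, <delete>) and the `Recorder` class are invented, with `Recorder` standing in for some real external system.

```python
import xml.etree.ElementTree as ET


class Recorder:
    """Stand-in external system that just logs the calls made on it."""

    def __init__(self):
        self.log = []

    def write(self, path, text):
        self.log.append(("write", path, text))

    def delete(self, path):
        self.log.append(("delete", path))


def run_actions(document, system):
    """The 'special parser': walk the action document in order and make
    one external-system call per element."""
    handlers = {
        "write": lambda e: system.write(e.get("path"), e.text or ""),
        "delete": lambda e: system.delete(e.get("path")),
    }
    for action in ET.fromstring(document):
        handlers[action.tag](action)
```

The transform that produced the action document never touches the external system; all the side-effects live in `run_actions`.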

So, how can an XSL template act as a method?
It seems that the only thing such a method can do
is return an XML object. Do you mean that this object
is the body of the method? Or its return value?

The way I've been describing, the XML object is indeed
the return value of the transform, I mean method.

Actually, I thought about methods and discovered another
possibility of associating a method with an element type:
via an attribute whose name is that of the method
and default value the method's body (or its identifier).
However, I am not quite sure that this is a good idea.

We probably wouldn't want to do it with a raw instance
of <me>. In fact I'd argue most strongly that at a deep
level we want it hardwired in that clients' parsers IGNORE
any references to executable stuff, except under very
specific conditions - e.g. their trusted browser detects
an <xpl:method> tag, and stops to verify that the XPL
document is from a trustworthy source.
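One way to picture that rule, as a hedged Python sketch. The namespace URI, the trusted-source set and the `load` function are all hypothetical; real trust verification would need far more than a string comparison.

```python
import xml.etree.ElementTree as ET

XPL_NS = "urn:example:xpl"  # hypothetical namespace for xpl: tags
METHOD_TAG = f"{{{XPL_NS}}}method"
TRUSTED_SOURCES = {"https://trusted.example/"}  # hypothetical whitelist


def load(document, source):
    """Parse a document, dropping every <xpl:method> element unless the
    document comes from a trusted source."""
    root = ET.fromstring(document)
    if source not in TRUSTED_SOURCES:
        # Snapshot the elements first, since we remove while walking.
        for parent in list(root.iter()):
            for child in list(parent):
                if child.tag == METHOD_TAG:
                    parent.remove(child)
    return root
```

The default is to ignore; executable content survives only when the source check passes.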

However, the place we would actually like to put method
bindings is in XPL-enhanced schemas, which are XML
documents. The way I'm coming to see it, we would like
to provide a schema for each external API we address.

Oh! But didn't I say above that the place for our
methods was XSLT?

Not any more. Let XSLT munge trees.

But look, there's a major redundancy in XML technologies.

Everything which takes XML documents as input, does some
kind of traversal on the tree. Verifying parsers, XPath,
XSLT - they all do it. The fact is, they are all just
different flavours of Parser. They all recognize productions
in XML - that is, elements - as they wander up and down the
tree. They just take different actions at different points.

So there are various places in the XML accessory equipage,
where one might want to tack method hooks on. What we want
to do, is identify the best place to tack them, in order
to provide the standard OOP interface.

(1) Schema or DTD provides a map of the public interface.

(2) XPath, working on the schema, provides a general read-only
navigation interface, with actual get_ methods of many kinds.

(3) XSLT transforms, working on the elements returned by
XPath, provide the set_ methods which rearrange the tree,
and present the results as output.

(4) XSLT transforms also produce output trees (XML documents)
which define actions to be taken by external APIs.

(5) And special external-system parsers take those documents,
and traverse them so as to make the external-system calls
which implement the actions.

And NOW we have separation of interface from implementation.

Well, almost. Above, the schema maps the interface, insofar
as the interface defines what kinds of elements are present.
It doesn't tell you what non-get_ methods are available.

The XSLTs in (3) and (4) do tell you what those methods are;
in fact so far, we're talking about one transform document
per method. But those transforms are the ones that actually
execute the methods, external side-effects aside.

So we probably do want a method-annotated Schema type for
XPL, which defines the available method interfaces by name
and parameter valence - but leaves the method actions to the
XSLTs.
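Roughly what I have in mind, as a Python sketch. The method table plays the method-annotated schema (name plus parameter count), and the functions play the XSLTs; all names here are made up for illustration.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical method annotations: method name -> number of parameters.
ME_METHODS = {"get_workplace": 1, "set_workplace": 2}


def get_workplace(me, occupation):
    """Return a <result> tree holding the workplace of one occupation."""
    out = ET.Element("result")
    out.text = me.findtext(f"{occupation}/workplace")
    return out


def set_workplace(me, occupation, workplace):
    """Return a copy of <me> with one workplace value replaced."""
    out = copy.deepcopy(me)
    out.find(f"{occupation}/workplace").text = workplace
    return out


# The implementations, kept apart from the declared interface.
TRANSFORMS = {"get_workplace": get_workplace, "set_workplace": set_workplace}


def invoke(name, document, *args):
    """Check the call against the declared interface, then dispatch."""
    if name not in ME_METHODS:
        raise AttributeError(f"no method {name!r} declared in the schema")
    if len(args) != ME_METHODS[name]:
        raise TypeError(f"{name} takes {ME_METHODS[name]} parameter(s)")
    return TRANSFORMS[name](document, *args)
```

A client can read `ME_METHODS` to learn what is on offer without ever seeing the transforms.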

Alexander has further questions on the OOP issue, but it seems
to me that the picture I'm drawing has a bearing on them all.
What do you think, AG?

Finally (so far), Garreth Galligan (12/06/00) puts the whole
OOP paradigm to the question.

<occupation id="occupation1">
    <title>mathematician</title>
    <workplace>An Institute</workplace>
    <alias>Some Guy</alias>
</occupation>

<occupation id="occupation2">
    <title>programmer</title>
    <workplace>A Company</workplace>
    <alias>Some Guy</alias>
</occupation>

<me>
    <human>
        <name>Alexander Gutman</name>
        <birthdate>1996-07-01</birthdate>
    </human>
    <prototype obj="occupation1">
        <workplace>BlahBlah Institute</workplace>
        <alias>Alex Goodman</alias>
    </prototype>
    <prototype obj="occupation2">
        <alias>Alex Softman</alias>
        <workplace>Foobar Software</workplace>
        <prototype>
            <description>My main job</description>
        </prototype>
    </prototype>
</me>

What we have here is fairly close to the Mathematica style.

Aggregates, let's call them objects, are written out arbitrarily.
By default, they are available as prototypes.

When reused, an object brings in all the data with which it was defined.

But any of that data may be overridden. Thus "occupation2" comes with
<title>, <workplace> and <alias> values as originally defined; but
the latter two get new overriding values. The whole <me> object
is available as a new prototype.
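The override rule, as I read it, could be sketched in Python like this. It is my interpretation, not Garreth's specification: children are matched by tag name, and nested <prototype> elements are simply skipped.

```python
import copy
import xml.etree.ElementTree as ET


def instantiate(prototype, objects):
    """Copy the object named by obj=..., then let each child of the
    <prototype> element override the copied child with the same tag."""
    result = copy.deepcopy(objects[prototype.get("obj")])
    for override in prototype:
        if override.tag == "prototype":
            continue  # nested prototypes are left out of this sketch
        for i, child in enumerate(result):
            if child.tag == override.tag:
                result[i] = copy.deepcopy(override)  # replace in place
                break
        else:
            # No child with that tag: the override adds a new one.
            result.append(copy.deepcopy(override))
    return result
```

With "occupation2" as the base, <title> is inherited while <workplace> and <alias> take the new values, and the original object is unchanged.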

It's actually quite elegant - but there are things it can't tell me.
For instance, what if I now want to produce a second <me> instance,
for someone working two shifts as a mathematician? Do the default
rules state that the <programmer> element is copied to the new <me>,
or absent?

This kind of information - elements required, forbidden, unique,
many, etc - is easily included in a schema, once one has gone
to the trouble of building one.

With OOP - classes, schemas, interfaces, whatever - you get
control, for an effort upfront.

How did Mathematica make out, without it?

Not badly, for the most part. What you could do, with a bit of
manual work, was take any hand-made expression and substitute
variable names - named blanks - for chosen elements; the result
was then available as an expression pattern, which would match
the original and anything like it except for the values of
the blanked-out elements. And then, you made the pattern one
side of a rule, which produced a new expression from those
values.
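For those who haven't used it, here is a toy Python version of that pattern-and-rule mechanism. It is only an analogy, not Mathematica's actual semantics: expressions are nested tuples, and `Blank` is my stand-in for a named blank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blank:
    """A named blank: matches anything and remembers it under its name."""
    name: str


def match(pattern, expr, bindings=None):
    """Return a dict of blank names to matched values, or None."""
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, Blank):
        # A blank used twice must match the same value both times.
        if pattern.name in bindings and bindings[pattern.name] != expr:
            return None
        bindings[pattern.name] = expr
        return bindings
    if isinstance(pattern, tuple):
        if not isinstance(expr, tuple) or len(pattern) != len(expr):
            return None
        for p, e in zip(pattern, expr):
            if match(p, e, bindings) is None:
                return None
        return bindings
    return bindings if pattern == expr else None


def substitute(template, bindings):
    """Fill the blanks of a template with the matched values."""
    if isinstance(template, Blank):
        return bindings[template.name]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def apply_rule(lhs, rhs, expr):
    """Rewrite expr with the rule lhs -> rhs if lhs matches it."""
    bindings = match(lhs, expr)
    return expr if bindings is None else substitute(rhs, bindings)
```

Take a hand-made expression, blank out the parts that vary, and you have the left side of a rule.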

When it came to Real OOP, there was a tendency to tie oneself
in knots. The problem was, if you had a complex structure,
you had to write horrible indexing expressions in quantity, to
abstract out the stuff you wanted to reuse.

I wouldn't rule out this bottom-up prototyping idea. If we can
hand-build this stuff, we can hand-convert it to schema form
with about the same effort.

All the best, cats

Jonathan
--- End forwarded message ---