[exprla-devel] RE: [xpl] Groves Annotated

--- In xpl-dev@y..., "Richard Anthony Hein" <935551@i...> wrote:
Thanks a lot Jonathan; this helps immensely.  Some comments inline
follow....
  -----Original Message-----
  From: me@m... [mailto:me@m...] On Behalf Of Jonathan Burns
  Sent: June 19, 2000 9:29 PM
  To: xpl@e...
  Subject: Re: [xpl] Groves Annotated


  Richard Anthony Hein wrote:
     Great input Jonathan, once again.  I wish I understood more than half
of what you said.


    Half's a start. The gist of it is fairly straightforward:

    We must have a notation, to specify things.

    When we're specifying data or programs, the notation is almost always
text of some kind.
    (Well, it could be done with tables or diagrams, but that's uncommon.)

    The notation is designed for TWO purposes: to be read by people, and to
specify whatever it is.
    We can almost always design the notation so that both purposes are
served conveniently.
    The design of a notation is its syntax. The stuff it specifies is the
semantics.
    [Richard Anthony Hein]

    I understood this much ...

    It is usually possible to specify the stuff, and still have considerable
room for manoeuvre in designing the syntax. This is nice, because then we
can make the language more readable.
    For instance, we can have white space and indenting.

    But the semantics should be the same, regardless of syntactic 
niceties
like whitespace.
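
A minimal sketch of this point, using Python's standard-library xml.etree.ElementTree (my choice for illustration; nothing in the thread prescribes it). Two source texts differ only in whitespace inside the tags and in attribute order, and both parse to the same tree. The helper name `shape` is hypothetical. Note that whitespace *between* elements is character data in XML, so it would show up in the tree; the example deliberately avoids it.

```python
import xml.etree.ElementTree as ET

def shape(elem):
    """Reduce an element to a plain, comparable structure (hypothetical helper)."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),  # attribute order is not significant
        elem.text or "",
        tuple(shape(child) for child in elem),
    )

# Same document, written with different syntactic niceties.
compact = '<note id="1" lang="en"><to>Richard</to></note>'
spaced = '<note   lang = "en"\n       id="1" ><to >Richard</to ></note >'

same = shape(ET.fromstring(compact)) == shape(ET.fromstring(spaced))
print(same)
```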


    Now it turns out that the official specification of XML specifies 
the
syntax, only.
    [Richard Anthony Hein]

    I didn't actually realize this ...

    We assume that parsing an XML document builds a tree structure 
(that's
the semantics),
    and usually it will, but the XML specification doesn't say that 
it must.

    This was also the case with SGML - and that caused confusion, because
different development teams were interested in different aspects of the
semantics.
    It meant that different teams would build parsers differently, 
and each
would be satisfied with the results - but they couldn't share working 
code,
because they had built different styles of XML trees.

    Groves was invented to standardize the semantics of SGML, so that 
such
confusions couldn't arise.

    It can do the same for XML, if need be.


    Groves defines semantics only.
    [Richard Anthony Hein]

    This is a key point, and how simple it makes everything now that 
you
have said it!  I never grasped this out of Paul's paper.  I had an 
idea of
what Groves really meant, but it involved syntax as well, so this 
clears all
that up and makes it all very clear to me!

    The purpose of it is to determine the properties of the elements of
documents. Then there can be no question as to what XML files (that is,
source files, in readable text) are supposed to represent.
    Or to put it another way, there can be no question as to what an XML
parser is supposed to build.

    (There is a quibble, which can be confusing. In some cases, we 
want our
programs to work on a semantics of characters, not tree elements. For
instance, parsers themselves. Their purpose is to create semantic 
structures
according to the syntax of the source text. Therefore, their 
implementations
must be programs using character and string semantics. Don't worry 
about it,
it's a quibble.)

    The intention is that XML syntax should be parsed so as to 
produce trees
with precisely the semantics defined by the XML groves property sets.
Knowing this, confusion between XML syntax and XML semantics is 
dispelled.
    [Richard Anthony Hein]

    Excellent.


    Another thing that groves is good for, is defining alternative 
versions
of a standard semantics, for different purposes. These versions are 
the
alternative views, which Paul speaks of. Some applications only need 
to know
about certain properties of (e.g.) XML trees, and can filter out the 
rest,
for example.
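
One way to picture an alternative view, as a rough sketch only: treat a node as a set of name-value pairs and let a view keep just the properties an application cares about. The property names below are made up for illustration and are not taken from any real property set; `view` is a hypothetical helper.

```python
# A node as a flat set of name-value pairs (illustrative property names).
node = {
    "class": "element",
    "tag": "para",
    "attributes": {"id": "p1"},
    "content": "Hello",
    "source-line": 12,
}

def view(node, wanted):
    """Return a copy of the node keeping only the properties named in `wanted`."""
    return {name: value for name, value in node.items() if name in wanted}

# An application interested only in structure filters out the rest.
structure_only = view(node, {"class", "tag", "attributes"})
print(structure_only)
```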

    Finally, since groves defines the semantics, groves can provide a 
unique
semantic identification of every element in a document - an address. 
To
define the semantics of a structure is to provide a complete 
addressing
scheme for it.
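
A small sketch of what "an address for every element" could look like, assuming a simple positional path scheme of my own choosing (not the groves addressing model itself). It covers elements only, not attributes or text, and `addresses` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def addresses(root):
    """Map a positional path such as /doc[1]/sec[2] to each element."""
    found = {}

    def walk(elem, prefix, position):
        path = f"{prefix}/{elem.tag}[{position}]"
        found[path] = elem
        counts = {}  # per-tag counter among this element's children
        for child in elem:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, path, counts[child.tag])

    walk(root, "", 1)
    return found

doc = ET.fromstring("<doc><sec><p>a</p><p>b</p></sec><sec><p>c</p></sec></doc>")
table = addresses(doc)
for path in table:
    print(path)
```

Every element gets exactly one path, and every path leads back to exactly one element, which is the property the paragraph above is describing.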


    So, groves are useful. Using them, we can state exactly what we 
mean by
an XML document, exactly how to build it, and exactly how to identify 
each
part of it.

    The question I was keeping an eye on was, Is groves the ONLY 
complete
description of XML semantics?
    [Richard Anthony Hein]

    Surely there are other ways, but they may just be reinventing 
Groves,
like Paul mentions in his article.  There may be a need to reinvent 
Groves
for XPath, XPointer and XLink, but in the end it will probably retain its
fundamental nature, and end up looking pretty close to what it does now.
Since XPath, XPointer and XLink are not finalized yet, it would be
beneficial to discuss this with the working groups, in their forums, 
to find
out what they think about Groves, and how they see it relating to the 
specs.

    After all, up to a certain point, people were happy to use the 
DOM -
which defines document semantics in terms of interfaces, made up of
operation declarations (method call formats) in IDL (Interface 
Description
Language). Paul objects to the DOM because there is only one of it - 
where
with groves, we can define many different interfaces (alternative 
views).
I'm not sure we couldn't get a complete semantic definition, with
alternative DOMs, one for each view.
    [Richard Anthony Hein]

    But wouldn't this mean having more things to worry about?  Groves 
look
like they are one thing, while multiple DOMs sound like they would be
different, and require more knowledge, time and effort in order to 
learn,
understand and implement.  Perhaps I am wrong here, but why make 
multiple
DOMs to describe something one method of creating Groves can do?  I 
can't
stand having 500 different ways to make an e-commerce site ... what a 
pain
to try to figure out what technologies to even start with.  I wish 
there was
only one way that made sense - one of the reasons I went for XML ... 
it is
one thing that describes all kinds of ways to do lots of things, but 
if you
understand XML, you can understand all the others (in theory at 
least).

    But another candidate for a complete XML semantic definition is 
XPath,
etc. There has been considerable progress with these since Paul wrote 
his
article; by now, it may be that XPath can address everything in an XML
document that can be addressed; while XPointer can allow auxiliary
addressing schemes within documents, and XLinks can extend addressing 
to
other documents.
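
For a feel of XPath-style addressing within one document, here is a short sketch using Python's xml.etree.ElementTree, which supports only a limited subset of XPath. The sample document and its `id` values are invented for the example; XPointer and XLink are not shown.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<doc><sec id="a"><p>one</p><p>two</p></sec><sec id="b"><p>three</p></sec></doc>'
)

# Address by position: second <p> inside the first <sec>.
second_p = doc.find("./sec[1]/p[2]")

# Address by attribute value: the <p> inside the <sec> whose id is "b".
by_id = doc.find(".//sec[@id='b']/p")

# Address a whole set: every <p>, in document order.
all_p = [p.text for p in doc.findall(".//p")]

print(second_p.text, by_id.text, all_p)
```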

    [Richard Anthony Hein]

    This is an important point ... better than making multiple DOMs I 
think,
is to be able to use XPath, if it can do what Groves are supposed to 
be able
to do.  If anyone knows better than Jonathan and I about XPath and 
Groves,
please speak up.  If not ... let's go to the source.

    Once Paul had gotten through an explanation of how groves are
constructed, I was convinced that they are indeed a complete semantic
description.

    Furthermore, it is a description in very simple terms. Everything you can
say about an XML document type can be defined in terms of name-value pairs.
That means that if need be we can prove the correctness of the 
programs we
write - and especially the basic XPL tags we define.
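
To make "name-value pairs" concrete, a hedged sketch: a parsed element is flattened into nothing but names and values, some of which are themselves lists of such nodes. The four property names are my own simplification and far smaller than a real property set; `to_properties` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

def to_properties(elem):
    """Describe an element purely as name-value pairs (simplified, illustrative)."""
    return {
        "tag": elem.tag,
        "attributes": dict(elem.attrib),
        "text": elem.text or "",
        "children": [to_properties(child) for child in elem],
    }

props = to_properties(ET.fromstring('<para id="p1">Hello<em>world</em></para>'))
print(props)
```

Because the result is plain data, two parsers can be compared simply by checking that they produce equal structures for the same source text.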

    However, being low-level, it is also very tedious to read.
    [Richard Anthony Hein]

    True enough, name-value pairs are simple - it can't get much simpler.
Tedious to read ... true, but wasn't that program Paul talked about 
supposed
to document Groves automatically, since they are essentially
self-documenting?  Besides which, C++ and Java are pretty tedious too 
for
someone not used to it.  Put together a self-documenting IDE for 
Groves,
that documents the Grove (in a manner not unlike the link Paul gave) 
you are
using on the fly in a side box, and it'll be easy enough.   I think.

    What I hope is that XPath etc. can eventually be defined in groves
terms, and shown to be correct, and maybe also to be a complete semantic
description. Because XPath etc. will usually be a much easier form to read.


    The more we can get all this straight in our heads, the more 
effective
our discussions will be.
    [Richard Anthony Hein]

    Yep.

    Enough for now

    Jonathan


    I also think that the newer XPath and other related specifications may
do some of the things that we need for XPL, and I will do whatever research
I can into these areas. Since Paul wrote in his article that comments were
welcome, perhaps you should send him an email with a link back to your
annotations to his article.  He would surely be able to help clear up some
of the questions and concerns you bring up; maybe he'll even join the group
(the author of WorldOS joined now, which is cool and a nice surprise).

    You wrote in your annotations that you have a lot of compiler
experience ... do you have any suggested sources to look into to get a
better idea about compilers and even perhaps "how to make a computer
language" type information?  :-)  I still haven't a clue how you can make up
a syntax and semantics, and build a compiler to translate it all to machine
language. Don't you have to use an already existing language such as
Assembly or C to make the compiler?

    Probably a really stupid question, but I transferred out of computer
science and into neuroscience (which would explain my interest in your
XPL-fog posts). I started CS with the intent to go into a neuroscience MS
afterwards, hoping to work on neural networks and AI, but I switched into
Nesc. early - I blame it all on my boring COBOL co-op term and all the
really unhappy programmers working there - so I never got that far.  :-(
I will be going back to finish up that plan once I pay back the money I
already owe, and save enough to go back to university.

    Sincerely,
    Richard A. Hein


  To unsubscribe from this group, send an email to:
  xpl-unsubscribe@o...
--- End forwarded message ---