Re: [XPS-devel] XPL 0.1.1 Released

Victor,

Thank you for your feedback. My further comments are inline as well.

On Mon, 2002-08-19 at 15:52, Victor Kirk wrote:
>  > I would like to solicit your feedback on a variety of topics about
>  > the language definition so far. When you get to the point that you
>  > grok what I've done, please see if you can provide your feedback on
>  > the following topics.
> 
> Sorry to take so long in replying to this, things are as hectic as
> ever :-(

Hey, it happens. I'm just glad to have any feedback at all :)

> 
>  > 1. Incorporating Memory Management (Regions and Segments) Into The
>  >    Language. I wanted to do this to be explicit in the language
>  >    about how memory is organized and allocated so that it isn't
>  >    shuffled off to some "library routine" that varies from machine
>  >    to machine and uses different strategies. By incorporating memory
>  >    management in the language, the programmer has greater control of
>  >    how the program utilizes memory.
> 
> I'm afraid I haven't anything worthwhile to contribute to this,
> perhaps after some prototyping has been done I will be in a better
> position. However my first thoughts would be to try and make this as
> automatic as possible as one of the aims is to reduce the overhead and
> complexity in developing software (which unfortunately transfers the
> complexity to us).

I agree that one of the primary goals that we're after is to reduce the
complexity of programming.  However, I'm also trying to strike a balance
between flexibility and performance. I have decided that there needs to
be a very rigorous interface between XPL and the XVM because they are
aimed at different purposes.  XPL is the base language from which
flexibility is derived (by extending the schema, defining new languages
and translating to XPL, etc.). That is, we use XML/XSLT in the usual
ways to invent new programming languages/models/abstractions that
(eventually) translate down to the core XPL language.  The XVM provides
a virtual machine for executing compilations of programs in that core
language.  As long as the mappings between the higher level domain
specific languages and XPL are sane, we can program at higher levels of
abstraction without losing anything in terms of performance. What this
means for the XPS, however, is that it must be very focused on (a)
ensuring consistent execution semantics across multiple platforms, and
(b) performance.  This leads to some interesting design issues with the
XVM ...  

One of those issues is memory management. We have to deal with the
efficient allocation and deallocation of memory in the XVM. There are
several strategies that could be undertaken ranging from completely
automated memory management (i.e. XVM handles all issues including
garbage collection) to completely manual (i.e. programmer has to decide
how to lay out memory and when to allocate/deallocate it). Neither of
these extremes is appropriate. The former (fully automated) can't
perform well enough in some settings (e.g. garbage collection pauses in
real time systems). The latter (fully manual) is tedious and error
prone. So, I've tried to come up with a happy middle ground by including
regions and segments in the XVM. Briefly, a region is a 64-bit memory
space that can grow as needed. It contains segments and can be either
persistent or transient. Segments are fixed-size, relocatable blocks of
data. Objects and data are allocated inside segments. The algorithm for
segment allocation in a region is
fixed. However, allocation of objects in segments depends on the kind of
segment. There are various techniques that can be employed to manage
memory and the XVM will support whatever is needed.  Everything from
"Blob" segments where the entire segment is allocated as one object to
B+ Trees could be implemented. The XVM will provide a number of useful,
standard memory allocation models and allow new models to be added,
depending on the needs of the program at hand (this is one of the ways
the XVM can be extended).  This design permits a few interesting
features for the XVM:

1. Because segments are relocatable (i.e. contain no pointers), they
can be mapped in and out of memory as needed. This allows the XVM to use
very efficient I/O and sharing techniques based on mmap calls.
2. Memory can be deallocated very efficiently. In general, programs will
want to allocate memory for a particular task. When that task is
complete and the results have been computed, it will want to deallocate
its memory. The program can simply allocate a segment, place all the
data it needs in that segment, and then deallocate the segment. This
kind of en masse deallocation means that the XVM doesn't have to keep
track of which objects are being referenced. The "garbage" is collected
simply by deallocating the segment.  Unfortunately, this means that the
programmer needs to do a little work to compute how much memory is
needed, since growing a segment is an expensive operation
in this design. This is one of the trade-offs between ease/flexibility
and performance that I'm willing to make. 
3. Because regions can address a large memory area, most programs will
be able to work entirely within one region. However, there is
4. Because there is no difference between allocating transient or
persistent memory, the programming model is quite efficient.
5. Because persistent memory storage is supported, there is no need to
support things like serialization, which is often a large performance
bottleneck. Cooperating tasks can simply build up a graph of objects in
a persistent (or even shared transient) memory segment and pass
information back and forth.  A program can save its state and reload it
very quickly (a few mmap calls) with all the referential integrity still
intact (i.e. without rebuilding it from some canonicalized disk storage
format).  If you're familiar with Smalltalk, these ideas are borrowed
from Smalltalk's notion of an "image". 
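
As a rough illustration of points 1 and 2, here is a small C++ sketch of
a segment that hands out offsets instead of pointers and is emptied in
one step. It is only a sketch under stated assumptions, not XVM code:
the names Segment, allocate, store, load, release, Node, kNoNode and
sum_list are all made up for this example, and a std::vector stands in
for memory that a real implementation might map with mmap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <optional>
#include <type_traits>
#include <vector>

// Hypothetical fixed-size segment. Objects are addressed by offset from the
// start of the segment, never by raw pointer, so the bytes stay valid when
// the segment is copied or mapped at a different address.
class Segment {
public:
    explicit Segment(std::size_t size) : bytes_(size), used_(0) {}

    // Bump allocation: reserve the next n bytes, rounded up to a multiple
    // of 8. Returns the block's offset, or nothing if it does not fit.
    std::optional<std::size_t> allocate(std::size_t n) {
        const std::size_t free_bytes = bytes_.size() - used_;
        if (n > free_bytes) return std::nullopt;
        const std::size_t rounded = (n + 7) & ~static_cast<std::size_t>(7);
        if (rounded > free_bytes) return std::nullopt;
        const std::size_t offset = used_;
        used_ += rounded;
        return offset;
    }

    // Copy a trivially copyable value into the segment at the given offset.
    template <typename T>
    void store(std::size_t offset, const T& value) {
        static_assert(std::is_trivially_copyable_v<T>, "raw byte copy only");
        assert(offset + sizeof(T) <= used_);
        std::memcpy(bytes_.data() + offset, &value, sizeof(T));
    }

    // Copy a value back out of the segment.
    template <typename T>
    T load(std::size_t offset) const {
        static_assert(std::is_trivially_copyable_v<T>, "raw byte copy only");
        assert(offset + sizeof(T) <= used_);
        T value;
        std::memcpy(&value, bytes_.data() + offset, sizeof(T));
        return value;
    }

    // En masse deallocation: everything in the segment is dropped at once,
    // without tracking which individual objects are still referenced.
    void release() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> bytes_;
    std::size_t used_;
};

// A list node that links to its successor by offset rather than by pointer.
constexpr std::size_t kNoNode = static_cast<std::size_t>(-1);
struct Node {
    int value;
    std::size_t next;  // offset of the next node, or kNoNode
};

// Walk the list starting at offset head and add up the values.
int sum_list(const Segment& segment, std::size_t head) {
    int total = 0;
    for (std::size_t at = head; at != kNoNode;) {
        const Node node = segment.load<Node>(at);
        total += node.value;
        at = node.next;
    }
    return total;
}
```

Because the nodes refer to each other by offset, a byte-for-byte copy of
the segment still contains a valid list, which is the property that makes
mapping a segment in and out of memory cheap. The cost shows up in
allocate: a request that does not fit simply fails, so the caller has to
size the segment up front.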

I think the model I've selected is reasonably flexible and will perform
quite well. It may not be as simple as a fully automated memory
management scheme, but I just can't see how to make one of those perform
well for all classes of programs. To some degree, the programmer needs
to make good choices for memory management. XPS will provide good
alternatives that can be selected without worrying about the
implementation details. 

> 
>  > 2. Defining Files, Streams, Documents, Transforms, Segments, and
>  >    Regions as Object Classes. Regardless of whether you think it is
>  >    useful for these things to be in the language (as opposed to some
>  >    "standard library"), what do you think about including these
>  >    things in the class inheritance hierarchy? I think it's a nifty
>  >    idea because it allows you to build a type hierarchy around
>  >    things that interact with "external entities" (i.e. files,
>  >    streams, documents, memory, etc. are all external to the XVM).

You didn't comment on this point, but I've made a little progress in my
thinking here. I've decided that regions, segments, files, and streams
are NOT objects.  Objects in XPL will carry a certain performance
penalty with them. Files, streams, regions, and segments all need to
perform exceptionally well.  The benefits of object orientation can
still be applied by wrapping a class around the language constructs that
support a file, stream, region, or segment. And, in fact, I plan to do
just that in the XPL standard library.  By making this decision, nothing
in terms of flexibility is lost but a lot is gained in the area of
performance and simplicity. So, the XVM will support separate operators
for files, streams, regions and segments.

I have also come to the conclusion that XPL objects are XML documents
and vice versa (well, at some level of abstraction anyway).  This means
that you could invoke a parsing algorithm on an XPL object and what you'd
get is a traversal of its attributes.  You could also define an XML
document type and adorn it with attributes and methods.  I think
this makes for some interesting design opportunities in XPL programs. 
What do you think?

> 
>  > 3. Using Lisp Style Expressions. The operator expressions in XPL are
>  >    similar to LISP. That is, an expression is built from the inside
>  >    out.  Instead of saying a+b, we say +(a,b). Instead of writing
>  >    (a*(b/c)) we write *(a,/(b,c)). This structure falls naturally in
>  >    line with XML content models and makes it very easy to correctly
>  >    parse the programming expressions with an XML parser.  I believe
>  >    that it will also be a very efficient way to program and generate
>  >    code.  However, it is not necessarily intuitive for the
>  >    novice. The last example above in XPL looks like:
> 
>  >    <mult_ii>
>  >        <get_i>
>  >            <value_loc>a</value_loc>
>  >        </get_i>
>  >        <div_ii>
>  >            <get_i>
>  >                <value_loc>b</value_loc>
>  >            </get_i>
>  >            <get_i>
>  >                <value_loc>c</value_loc>
>  >            </get_i>
>  >        </div_ii>
>  >    </mult_ii>
> 
> I certainly believe that this is the way to go.  XML and lisp have a
> number of similar themes which should be exploited.  I've never seen
> any discussion of the rationale behind DSSSL (on which xsl is of
> course based) but I'm sure this is one of the major reasons DSSSL was
> implemented on Scheme.

Yes, agreed. Using a Lisp style operator semantic will make code
generation to C/C++ a little bit harder, but in other areas
(execution by the XVM) things will be more streamlined.
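
To show why the nested form is easy to execute, here is a small C++
sketch that evaluates *(a,/(b,c)) from the inside out. It is an
illustration only and says nothing about how the XVM actually works: the
Expr structure, the operator names "get", "mul" and "div", and the
helper functions are hypothetical stand-ins for the get_i, mult_ii and
div_ii elements.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical expression node: an operator name plus nested operands,
// mirroring the way an XML element nests its children.
struct Expr {
    std::string op;                           // "get", "mul" or "div"
    std::string name;                         // variable name, "get" only
    std::vector<std::unique_ptr<Expr>> args;  // nested operand expressions
};

using Env = std::map<std::string, int>;

// Build a leaf that reads a named integer variable.
std::unique_ptr<Expr> get(std::string name) {
    auto e = std::make_unique<Expr>();
    e->op = "get";
    e->name = std::move(name);
    return e;
}

// Build a two-operand operator node.
std::unique_ptr<Expr> binary(std::string op, std::unique_ptr<Expr> lhs,
                             std::unique_ptr<Expr> rhs) {
    auto e = std::make_unique<Expr>();
    e->op = std::move(op);
    e->args.push_back(std::move(lhs));
    e->args.push_back(std::move(rhs));
    return e;
}

// Evaluate the operands first, then apply the enclosing operator.
int eval(const Expr& e, const Env& env) {
    if (e.op == "get") return env.at(e.name);
    if (e.args.size() != 2) {
        throw std::invalid_argument("binary operator needs two operands");
    }
    const int lhs = eval(*e.args[0], env);
    const int rhs = eval(*e.args[1], env);
    if (e.op == "mul") return lhs * rhs;
    if (e.op == "div") {
        if (rhs == 0) throw std::domain_error("division by zero");
        return lhs / rhs;
    }
    throw std::invalid_argument("unknown operator: " + e.op);
}
```

The expression from the example is built as
binary("mul", get("a"), binary("div", get("b"), get("c"))), and eval
needs no precedence rules or parenthesis handling because the nesting
already fixes the order of evaluation.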

> 
>  > It's rather verbose, but functionally elegant.  I haven't worried
>  > about how "pretty" XPL is because I don't think anyone is going to
>  > program directly in the XML text.
> 
> C or C++ don't look pretty without experience of the language.  Even
> with experience, people still have issues with how `pretty' code
> should look.  I can understand where these arguments come from when
> you work with a text editor, but when, as Reid mentioned, the code is
> automatically generated I think this is no longer an issue.

Thanks for validating that.  I sometimes wonder if programmers will
take a look at XPL in all its verbosity and dull syntax and just decide
it's too "ugly" to bother with.

> 
> The important aspect is how well the code can be transformed (of
> course I don't want to start imposing restrictions just to make
> processing easier).

Yes, like I mentioned earlier, much of the XPL/XVM design is going to be
focused on getting just the right balance of performance and
flexibility.

> 
>  > A visual editor (emacs mode?) that simplified editing XPL would be
>  > an excellent tool to build for XPL. Any volunteers out there?
> 
> While XPL is not intended to be programmed in directly I think this
> could still play a useful role.  Developers of higher level
> languages (HLLs) (and XPL itself) are going to want to try out
> things directly in xpl and, while xml is human readable, it can be
> quite hard to edit the file directly as text when the document becomes
> complicated.

Agreed. Hopefully the support for multiple natural languages will
mitigate this concern a bit :)

One of the things I'd like to do eventually is to write a Java2XPL
translator as part of the compiler validation suite. If we can translate
large Java applications to XPL and have them work correctly, then I
think we'll have accomplished something significant. It will also prove
that XPL doesn't have to be the syntax of choice. Going forward from
there, I envision visual programming environments that are manipulated
in something like virtual reality where the programmer doesn't
manipulate text, but manipulates programming objects visually (e.g. drag
a "class object" from a palette into the program space and manipulate
its facts visually).

> 
> Two emacs modes I really find useful are xslide and docbook, perhaps
> these could be plundered for a basis.  I would volunteer myself but I
> just don't have the time.  I would expect to be able to help extend
> and test as the project progresses though.

Yes, it's definitely a "future" thing. I'm sure when XPL gets to a
certain level of maturity, some emacs hacker will take the plunge and
create an appropriate mode. However, there may be something useful out
there already.  All we need is an emacs mode that can grok an XML Schema
Definition. Since XPL is defined as one, the mode should work for XPL or
any other XML schema. Furthermore, as extension languages are defined
(in XML), the mode would automatically work for them too. So, the task
isn't so much a need for an XPL specific mode as it is for a mode that
understands and implements XML Schema Definitions very well (XPL uses
LOTS of XML Schema features). 

However, even if we had an XPL mode for emacs, we're still dealing with
text. I'd like to get away from that eventually and go much more visual
and/or verbal. Somewhere towards the end of my career (20 years out), I
envision being able to program my computer by talking to it and building
up a 3D visual representation of the program and then pushing a button
to translate to XPL and execute.

> 
>  > Does the inelegance of the syntax bother you?
> 
> As I sort of mentioned above the `inelegance' does not bother me, in
> fact I'd bet that after some experience it would become elegant, eye
> of the beholder and all that.
> 

Hmm .. I didn't expect THAT response! Having typed a bit of XPL myself
already, I find it tedious. We'll see if it grows on me (and others).

>  > What if the entire structure of the program was represented in 3D
>  > and manipulated in a VR kind of way? Would it bother you then?
>  > Because it's XML, we can do all kinds of transformations from other
>  > languages, syntaxes and editing systems and still end up with XPL as
>  > the basic language that gets compiled. It is this latter aspect of
>  > the language that is important (rigid semantic specifications, fast
>  > compilation, accurate results).
> 
> The possibilities such as those you mentioned above are some of the
> main sources of my enthusiasm for this project.  The fact that the
> actual source code is in a format that is easy to process opens up the
> potential for tools for visualization and understanding of the
> code.

Yup, me too! I'm so lazy that I'm willing to work hard to make
programming easier :)

> 
>  > 4. Everything In A Method/Function  Is An Operator. I wanted to
>  >    provide a simplicity of design such that a program could be
>  >    constructed by arbitrarily nesting various operators. The fact
>  >    that the control flow operators (if, repeat, for, foreach,
>  >    select, switch...) can return values means they can be used
>  >    directly in a computational expression. C++ provides a very
>  >    limited form of this with the ?: operator. I've generalized it so
>  >    that all control flow operators may be used in computation
>  >    expressions. For example, XPL permits the following C++
>  >    equivalent:
> 
>  >    int array[maxlen];
>  >    /* ... */
>  >    int i = for (int j = 0; j < maxlen; j++) {
>  >                if (array[j] == 15) { result j; break; }
>  >            };
> 
>  > Note that I take the liberty that this C++ code can (a) assign the
>  > result of a for loop to an integer variable and (b) has an operator
>  > "result" which specifies the result for the enclosing block. I
>  > won't write the equivalent XPL because there is an example similar
>  > to this in the release.  But, what I would like to know is what you
>  > think of this ability? Power or confusion?
> 
> The fact that everything is an operator appeals to me.  While I don't
> think such a structure has a place in c++ (we have iterators and
> functors to do this), I think it makes sense in xpl.
> 
> I also believe that this would ease the transformation into native
> code, which is good ;-)

Yeah, that was one of my original ideas for this (simple machine code
translation). Making control structures into operators is just going
back to the Lisp idea. 
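
For comparison, standard C++ can approximate a loop that yields a value
by wrapping it in an immediately invoked lambda, with return standing in
for the hypothetical "result" operator. This is only a sketch of the
idea in today's C++, not a description of XPL semantics. The function
name first_index_of is made up, and yielding -1 when nothing matches is
an assumption of this sketch, since the example above does not say what
the loop produces in that case.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the first element equal to wanted.
// Assumption of this sketch: -1 is yielded when no element matches.
int first_index_of(const std::vector<int>& array, int wanted) {
    // The immediately invoked lambda lets the loop act as an expression
    // whose value initialises i.
    const int i = [&]() -> int {
        for (std::size_t j = 0; j < array.size(); ++j) {
            if (array[j] == wanted) {
                return static_cast<int>(j);  // plays the role of "result j; break;"
            }
        }
        return -1;  // assumed value when the loop finishes without a match
    }();
    return i;
}
```

The lambda makes the idea expressible, but the extra wrapper is exactly
the kind of ceremony that goes away when every control flow construct is
an operator that returns a value.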

> 
>  > 5. Method Discriminants. Unlike some other popular object-oriented
>  >    languages, I differentiate between the messages sent to an object
>  >    and the methods that implement those messages. To me, these are not
>  >    the same things. While the message send can be optimized to a simple
>  >    function call in many cases, XPL also directly supports active
>  >    objects (running in their own thread) in which case message sending
>  >    is an asynchronous operation. Furthermore, I allow a class to
>  >    implement a method multiple times. To disambiguate which methods get
>  >    run on which message sends, I have introduced the notion of a method
>  >    discriminant. Discriminants are simply boolean expressions that are
>  >    computed at runtime to determine which methods get run. While the
>  >    same ends can be achieved with if/unless/switch operators in the
>  >    body of a single method, the use of discriminants provides the
>  >    programmer with some syntactic sugar to help keep the program
>  >    organized. First, method names are not necessarily the same as the
>  >    message names. Since multiple methods can be defined for a single
>  >    message on a class, this must be true. Consequently, the method can
>  >    be named after what it does, not what the message means.
>  >    Extensibility is enhanced because adding a new behavior for a
>  >    message is as simple as adding a new method, leaving the existing
>  >    methods alone and simply defining a boolean expression that
>  >    determines the conditions under which the method executes. This
>  >    style of programming greatly aids state machine development since
>  >    the state of the machine can be used as the discriminant.  Finally,
>  >    the XPL compiler can eliminate any apparent overheads by simply
>  >    recombining all the discriminant expressions into a single function
>  >    that implements all the methods.   I'd like to hear your opinions
>  >    about this.
> 
> I guess we are talking about the `double dispatch' problem here.

In the last week or so, I have refined this idea to minimize the "double
dispatch" problem. In my new plan, the discriminant is actually executed
when the object is created and used to select which method will be
chosen for the implementation of the corresponding message. This allows
objects to react to their environments at instantiation time by
selecting the set of method implementations they will use. We could even
provide an operator that re-evaluates the discriminants at any point in
the life cycle of the object to provide a form of object transformation.
For example, consider an object that had different behaviors depending
on whether some other component was "online" or "offline". Rather than
check that component's status in every method, the object could use a
discriminant and set up its method dispatch pointers one way if the
component is online and another way if it is offline. This could
actually improve system performance, for the same reason that a virtual
method call can be faster than the equivalent if statements that
distinguish between types of objects.
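
The online/offline example can be sketched in C++ with a member function
pointer that is chosen when the object is created. This is a minimal
illustration under my own assumptions, not the XPL mechanism itself: the
names Component, Sender, send, rebind, send_direct and send_queued are
all hypothetical, and the discriminant is reduced to a single boolean.

```cpp
#include <string>

// Hypothetical component whose status feeds the discriminant.
struct Component {
    bool online;
};

class Sender {
public:
    // The discriminant (component.online) is evaluated once, at creation,
    // and selects which method implements the "send" message.
    explicit Sender(const Component& component) { rebind(component); }

    // Re-evaluate the discriminant later in the object's life cycle.
    void rebind(const Component& component) {
        send_impl_ = component.online ? &Sender::send_direct
                                      : &Sender::send_queued;
    }

    // Message send: one indirect call, with no status check per call.
    std::string send(const std::string& text) {
        return (this->*send_impl_)(text);
    }

private:
    // Two methods implementing the same message, named after what they do.
    std::string send_direct(const std::string& text) { return "sent: " + text; }
    std::string send_queued(const std::string& text) { return "queued: " + text; }

    std::string (Sender::*send_impl_)(const std::string&) = nullptr;
};
```

Note that a Sender keeps its chosen method even if the component's
status changes afterwards. The behavior only switches when rebind is
called, which corresponds to the operator that re-evaluates the
discriminants.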

> 
>  > 6. Overlays -- Anyone Need 'em? Overlays in XPL are like unions in
>  >    C/C++. They provide a way for using the same memory for multiple
>  >    fields. However, because XPL is a type safe language, Overlays
>  >    can't be used for type coercion or bit fiddling. Any attempt to
>  >    extract a field other than the last one set will yield an
>  >    error. So, what's the difference between that and just using
>  >    individual variables or a struct? Is memory consumption these
>  >    days really that big a concern?
> 
> For client application I don't think memory consumption is an issue,
> however, for servers I think it is still an issue.  I would be
> inclined to leave them in for the time being and see how it goes.
> Having said that, I'm sure they wouldn't be missed, and we could
> consider adding them later as an enhancement.

Okay, they've worked their way into the XVMAPI so I'm going to support
them initially. The XVM is intended to work in both client and server
settings so we'll leave them in for those cases where memory overlay is
needed. They aren't particularly difficult to support and won't be hard
to drop if we decide they aren't useful.

>  From your comments about type safety I assume you are thinking along the
> lines of a discriminated union, like that found in CORBA, for the actual
> implementation.  If so I would agree with you.

Exactly.
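
For readers who know C++ better than CORBA, std::variant (C++17) behaves
much like the type safe overlay described above: the fields share
storage, and reading any field other than the last one set is an error.
This is an analogy only, not the XVM implementation, and the names
Overlay, read_int and reading_int_fails are made up for the sketch.

```cpp
#include <string>
#include <variant>

// Hypothetical overlay with two fields that share the same storage.
using Overlay = std::variant<int, std::string>;

// Read the int field. std::get throws std::bad_variant_access when the
// last field set was not the int.
int read_int(const Overlay& overlay) {
    return std::get<int>(overlay);
}

// Report whether reading the int field is rejected for this overlay.
bool reading_int_fails(const Overlay& overlay) {
    try {
        read_int(overlay);
        return false;
    } catch (const std::bad_variant_access&) {
        return true;
    }
}
```

The variant has to remember which field is active, so it costs a small
tag on top of the largest field. That tag is what rules out the type
coercion and bit fiddling that a plain C union allows.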

> 
>  > 7. Documents & Transforms Needed? Since XPL is based on XML and the
>  >    XPS will use XSLT heavily, I thought that it would be useful to
>  >    incorporate the notion of an XML document and an XSLT
>  >    transformation directly into the language. However, I haven't
>  >    been able to think up many operators that could be invoked on
>  >    these kinds of objects. Perhaps I just haven't thought about
>  >    it enough (a lot has gotten deferred), but it occurs to me that
>  >    perhaps these things are just not as fundamental to an XML
>  >    programming language as I had originally thought. The basic idea
>  >    is that you could invoke an XSLT transform on one document and
>  >    get a different document instance after transformation. Documents
>  >    could be parsed by sending messages to a parser class much the
>  >    same way as DOM/SAX do. In fact, it could just be the DOM or SAX
>  >    APIs. How important is it to put this in the core language as
>  >    opposed to supporting it in a standard library?  Is it something
>  >    that every XPL program might need?
> 
> I would be for putting it into a library. I don't think this is
> something that would be used by most programs.  Having it as part of
> the language would add to the complexity.

I think I agree with you on transforms. However, I have some more
thinking to do on documents and objects. There's something compelling
about being able to easily wrap a class around an XML schema definition
but I haven't thought it through enough. It's like giving dynamic
behavior to an XML document type.  Microsoft would probably call it
"ActiveXML".  The converse is also true, there's something utilitarian
about treating a graph of XPL objects as if they were an XML document.
Automatic serialization? (not that we need it).

> 
>  > 8. Multiple Natural Language Support. One of my original goals in
>  >    designing XPL was to allow the language to support multiple
>  >    natural languages. That is why the specification is broken into
>  >    two parts, XPL-Abstract.xsd and XPL-en.xsd. The abstract
>  >    definition defines all abstract elements using verbose and
>  >    precise names in English. The English language version (XPL-en)
>  >    defines concrete elements using shorter English names. The intent
>  >    is to also create XPL-fr.xsd (Français), XPL-de.xsd (Deutsch),
>  >    XPL-sp (Español), and other natural language versions of the
>  >    abstract language definition. All these natural language versions
>  >    would be compilable as XPL because the compiler will only look at
>  >    the abstract substitution group names for the elements, not the
>  >    actual names of the elements for a given natural language.  I
>  >    think this is very useful and would make XPL much more friendly
>  >    and amenable to programmers in other languages.  Your opinions on
>  >    this?
> 
> Being a native English speaker and not being able to speak any other
> language than English I don't think my opinions could carry much
> weight.  But that's never stopped me before :-)
> 
> I've worked with quite a few non native English developers and they
> have always said that the fact that (programming) languages are in
> English has not interfered with their development.  A few however have
> mentioned that some novices have used C macros to use equivalents for
> `while', `for' etc...  However this tends to be dropped quite
> quickly.
> 
> I guess that means it would be useful for when learning the language,
> but not essential for people being able to use it.
> 
> It would also have the down side of making a lot of code unreadable to
> people who don't speak the language.  At least when it is in English
> a Spanish speaker would only have to be familiar with English, and not
> French, German ...
> 
> Of course we could have a natural language transform.

Interesting perspective. I'll have to give some more thought to this.
One of my original goals for XPL was to make it completely
internationalized. I imagine that your supposition holds for the
European based languages. But, I'd love to hear the perspective of some
Asian programmers on this. English is sufficiently different from Kanji or
Mandarin that I suspect being able to program in their native symbols
would be a big win for people using non-European languages.  I think
this is one of those things that should go on the questionnaire so we
can collect some feedback on it. 

> 
>  > I hope to hear back from you on these topics and any others you care
>  > to bring up.
> 
> I'll throw a few topics for consideration for the future (I'll
> continue Reid's enumeration).
> 
> Sorry Reid, these are more orientated to what we can do to get
> something practical rather than thoughts about the language.
> 
> 
> 9. One of the major issues in adopting something is how easily it can
>     be incorporated into existing software.  I have worked on projects
>     where some of the legacy code was written before I was born,
>     something in which years of effort and trust have been invested.
> 
>     To encourage adoption, and maybe to reduce the effort in producing
>     something to demonstrate, it would be useful to have a mechanism
>     for interfacing with existing libraries, or creating libraries.
>     Note when I say libraries I mean binaries, not xpl libraries.

I agree that this is a necessary step. I already have plans to be able
to make calls out to any C or C++ based library. We could probably quite
easily do the same for any other language supported by GCC (Java,
Objective-C, Fortran). One of the ideas I had for extensibility is to be
able to map the definitions in an XPL package onto some implementation
in another language. However, there are lots of "glue" issues that have to
be worked out there. I don't want to end up with something like JNI.

> 
> 10. It would be useful if we could come up with a couple of trivial
>      example HLLs and identify ones that are certainly going to be
>      useful, e.g. MathML.

Absolutely. This is a much needed "to do".

> 
> 
> 11. Debugging.  Has anyone any thoughts on debugging xpl programs?
>      Someone who has written some code in a HLL is not going to want
>      (or even can't) debug at a C++ level and map that up mentally
>      through xpl to the language they wrote it in.
> 

Yup, that's pretty much been thought through on this end. You won't get
stuck using GDB with C++ tokens.  There's a trade off here. While much
of what we translate XPL into will be native code, that native code will
be manipulating a virtual machine. The machine will keep track of call
stacks, variables, etc. There will be a debugger interface that allows
you to interrupt the virtual machine, single step it, set break points,
examine the stack and variables on it, dump the contents of memory
segments, etc. The entire debugging interface will be (no surprises here)
XML based. That is, there will be an XML schema defined for interacting
with the debugger interface. You simply connect to the XVM, send some
commands, get some feedback.  As with XPL, to be truly useful, it will
need some kind of an IDE on the other end that can deal with the XML
input/output but this is the most flexible way to approach it.

> 
> I'll have to save the rest for another time.
> 

Wow, there's more?  I like this!

Thanks for your ideas!

> Vic
> 

Reid.