Re: [Sml-implementers] Developing ML

Hello Robin,

Thank you for taking the time to explain your position.  You have convinced
me of one major point, but I disagree with others.  In particular I don't
agree that your position "liberates ML instead of ossifying it".  I hope
that I explain my reasoning as carefully as you explained yours.  I also
propose a way forward, which I hope might form the basis of an acceptable
compromise.

The major point that you have convinced me of is that there is a particular
risk in any non-trivial changes to the semantics.  If such a problem arose,
one immediate effect would obviously be to reduce the confidence of users
in the language itself.  They may not make the distinction between the
language they are using and the language defined in The Definition of
Standard ML, thus attributing blame to innocent authors.  A more pernicious
effect would be if they also lost confidence in the advantages of the
formal definition process itself.

The first point I want to make in reply is that the discussion on the
sml-implementers list is attempting to find consensus between implementors
on how they want to develop their implementations.  So your statement that
"different paths can be taken" and that these paths can explore different
semantic styles is not relevant to this particular discussion.  Although
it is undoubtedly true of the broader development of ML, any immediate
developments from this discussion will be bound tightly to The Definition.

The second point I want to make is that the difference between a new
language and a revision to the existing language is, to some extent, in the
eye of the beholder.  In other words, none of us can prevent people from
treating a specification as a revision of Standard ML, whatever name we
give it.  Let's consider your other example of how ML might develop:

>One natural path would be a
>short one, identifying a set of alleged shortcomings in SML and
>proposing improvements, with (as Andrew suggests) semantics in the
>style of - and with reference to - the '97 SML Definition.  It could
>be called - say - ML2002; not being called SML there is no need for
>complete consensus on this path, and indeed some implementers and
>users may well prefer to stay with SML.  

If we define a small set of changes to SML 97, ensuring backwards
compatibility with SML 97, and using the same formalism as SML 97, where
these changes are supported by the vast majority of extant implementations
of SML 97, then in my opinion we have a de facto revision of Standard ML,
whatever we might prefer it to be called.  It would be a case of "if it
talks like a duck, walks like a duck and quacks like a duck, then it is a
duck, but you have to call it a goose".  As a concrete example, if we
present (SML 97 + vector literals, withtype in signatures, and compilation
management) as a complete new language, our readers/users will recognise it
for what it is -- a minor revision of SML 97.

As you note, consensus is a vital part of this.   If one or two
implementations support these features, they remain isolated extensions to
the language.  If 9 out of 10 implementations support them, then they
become a revision of the language simply by existing.   The word "Standard"
in the name is important.

My third point is that there are some changes to the Definition that we can
have confidence in.  For example, a new derived form is highly unlikely to
break anything.  Neither is a new class of special constants.  (In each
case, I'm supposing that we don't make mistakes with the syntax of the
features, but syntactic mistakes are not in the same class as the
possibility of unforeseen semantic interactions).

My fourth point is that a revision to the language would not invalidate the
existence of the 1997 Definition of Standard ML, just as that did not
remove the 1990 Definition from history (and indeed some implementations
continued to support SML 90 as an option).  Nor (since I'm assuming
backwards compatibility) would it invalidate existing books and libraries
in the way that the 1997 revision did.  (Of course the books wouldn't be
completely up to date, but this also applies to the development of new
libraries, or to a standard for compilation management).   You can see the
same thing in other languages -- K&R C is a different beast from ISO C,
which is different from C9X. 

My fifth point concerns only your "tension #1", between evolution and
stability in general.  You say, "If you evolve (the Definition of) the
language itself, then you abandon a stable platform".  I don't agree that
there is a clear difference between the language and the rest of the
platform, i.e. libraries, FFIs, compilation management, and so forth.  To
take a concrete example, if some implementors develop a common interface to
C code, and I take advantage of that, then my code is limited to those
implementations.  If that interface then changes, I may have to adapt my
code to match.  Or if someone writes a better C interface, I have to decide
whether to change my code or stick with the lesser library.  The same thing
applies to a GUI library or a database library or any other library
(although it is particularly relevant to libraries that require some
support from the implementation itself, such as FFIs).  The tension #1 that
you describe is always present, and does not fit a clean separation between
language semantics on the one hand, and libraries on the other.

My final point is that the Definition contains some ambiguities.  This is
not a criticism of the work you put into writing it; in a project this size
some ambiguities were inevitable.  But as Andreas's recent message shows,
they do introduce observable (if small) differences between
implementations.  Implementors may well want to agree on minor changes (or
interpretations) to the Definition to resolve these ambiguities.

So can we find a common ground in which we can evolve "Standard ML" without
risking the problems that you warn of?  (Assuming that we implementers can
reach agreement among ourselves, of course, which is yet to be proven.)
I think we can: the key is to minimise the risks to stability, to manage
expectations, and to clearly delineate authorship.

First, we need to ensure backwards compatibility with SML '97.  There was
some discussion on the list of simplifying parts of the language, or
removing unwanted features.  I think we have to leave such changes to the
development of new languages.  This assures a degree of stability, and
reduces the problems of books becoming out of date.

Second, we have to be conservative with changes and extensions.  I think
the discussion was already well down that path in any case.  But this is
not the same as banning any changes to the language.  Ambiguities should be
resolved (where possible), and changes such as vector literals (special
constants), or-patterns and functional record update (derived forms?), and
perhaps laziness can be made without incurring risk.

Where new language features are introduced, the document that describes
them should explain why we believe they are safe.  E.g. it could explain
that derived forms do not threaten the integrity of the language.  For
larger changes, it could point to extensive use in some implementations
before the feature was adopted as a standard, and/or to theoretical work
that proves certain properties hold in the context of the entire language
(not just a mini-language).

Finally, whenever a new feature is introduced, the parties agreeing to the
introduction should clearly state their responsibility for the decision.
They should also offer you (and other authors of SML 97, where appropriate)
a disclaimer to make clear that you have not taken part in this new revision.

With this process (or something similar), I believe we could evolve
Standard ML in a controlled manner.  The main thrust of the evolution would
be to support compilation management, a C FFI, and library development, but
the evolution could include the resolution of ambiguities in the Definition
and the addition of small but useful language features.

One current topic that may need more work is the interaction between
sharing and type abbreviations in signatures, which has turned out to be
more of a problem in practice than expected.   This actually supports your
arguments in favour of conservatism, in that we want to avoid such problems
in future.  For now, though, we have to deal with the problem we have.  I
think it's worth trying to improve this situation, if it can be done safely
(bearing your warnings in mind).

From a personal perspective, the above process doesn't take us as far as I
would like to go.  I think that some support for higher-order modules
and/or recursive modules is highly desirable from a practical point of
view.  But clearly it would be hard to add these features within the above
process.  Perhaps we can cross that bridge if we come to it.  

Best wishes,

Dave.