From: Andreas R. <ros...@ps...> - 2001-09-17 12:54:01
Objectives and procedures of language evolution seem to be a very common problem. Looking at how other language camps cope with it might give some guidance (in a positive or negative way ;-).

IMHO, valid goals for changing/extending the language include:

- Fixing obvious quirks (e.g. syntactic or semantic ambiguities)
- Reducing the complexity of the language by removing obsolete/unused features (e.g. abstype)
- Making the language safer, i.e. removing common sources of programming errors or encouraging the use of features that help catch errors early on (e.g. convenient syntax for annotating function types)
- Increasing the convenience of common programming techniques (e.g. laziness)
- Increasing expressiveness where it has proved insufficient (e.g. polymorphic recursion)
- Extending the range of possible application domains (e.g. concurrency)
- Removing hurdles to portability (e.g. separate compilation)
- Enabling simpler or more efficient implementation (e.g. wraparound arithmetic)

Backward compatibility is probably the most fundamental problem. In my opinion, incompatible changes cannot always be avoided if you want to have a reasonably clean language in the end. If at all possible, however, there should be transition paths for users. The standard way of implementing this seems to be keeping old features around as deprecated (raising warnings) for some reasonable amount of time. If that is not enough, implementations can still support an SML'97 switch, preferably in a way that allows interoperability with new code.

The steps for a language modification to `become real' could be that some group of people

1. comes up with a design for a particular extension,
2. makes a proof-of-concept implementation in one of the existing compilers,
3. writes a proposal in the form of necessary modifications to the Definition,
4. lets the Definition's authors promote the proposal to the status of a `blessed addendum'.

Something similar has recently been discussed for Haskell. Ideally, every step should be followed by discussion in a wider forum. Arguably, many of the more obvious extensions (e.g. withtype, where structure) have already passed stage 2.

The main problem that I understand you are hinting at is that some of the more ambitious extensions no longer fit into the current framework of the Definition, and it is unavoidable that at some point the whole Definition and the accumulated addenda will have to be replaced by a more decent, type-theoretic specification. Since this may be a lot of work that takes a considerable amount of time, I think it would be reasonable to adopt simpler and obvious changes within the current framework first. A redesign of the language specification may also be the right moment to make most incompatible changes.

Some specific points:

> A very fundamental question is
> whether to insist on explicit import lists, with specified signatures for
> imported modules, or to instead rely on a CM-like dependency inference
> mechanism (which works in 99% of the cases, but not all).

There is also room for a middle course, namely requiring the programmer to specify what is imported, but without explicit signatures. This has the advantage of making units more readable (explicit binders for all identifiers) and enabling unambiguous dependency analysis, while not being overly inconvenient for the programmer.

> Should it be a specific design goal of
> the language to support interactive development, or can that safely be left
> to each implementation?

The language semantics should be designed in a way that does not preclude interactive environments, but IMO it is not necessary to specify the details of what such an environment actually looks like. If we assume that all `persistent' - and thus potentially interchanged - source code is written in the separate compilation model, then an interactive environment just needs to be able to import such sources in some way.

> How important is ML-style type inference?

IMHO, very important. Personally, I would not like to see any major compromises in this respect - I already strongly dislike the inconvenience of the annotations forced by overloading and records. But if there are features that require annotations, then the rules for providing them should at least be very intuitive and straightforward (i.e. local type inference or something similar is not an option).
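For concreteness, here are the two standard examples of what I mean (the code is mine, but the behaviour is plain SML'97):

  (* Overloading: without an annotation, + defaults to int, so this
     function gets type int -> int even if real -> real was intended. *)
  fun double x = x + x
  fun doubleReal (x : real) = x + x

  (* Records: the selector #name alone leaves the record type
     "flexible", which SML rejects ("unresolved flex record"); the
     programmer has to spell out every field of the record. *)
  fun name (r : {name : string, age : int}) = #name r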
> If implementations are allowed to differ on the extent to which they support
> type inference (plausible, especially if we admit both interactive and
> non-interactive implementations), then what is the official "interchange"
> format for programs by which we can be assured that code will be transferred
> among implementations?

Having corners of `implementation-defined behaviour' - as the C world calls it - is IMHO a very bad idea (SML'97 already contains some, wrt the context used to resolve overloading or record typing). And it would seriously complicate matters for all sides - users and implementers - if the `interchange format' were not ordinary source code.

> Datatype's cannot be made transparent without seriously changing the
> language. In particular, contrary to popular opinion, making datatype's
> abstract would *preclude* programs that are currently *admitted*.

I am aware of that (and marked that point as incompatible). However, I believe changing it would only break a rather small number of programs. Moreover, the problems related to sharing seem likely to disappear if we move to using "where" exclusively. One advantage of transparent datatypes is that they make typed programming in a distributed environment easier - processes do not need to share their type declarations if those are not generative.

> Either we should have a fully
> worked-out, extensible overloading system (I'm very skeptical) or drop it
> entirely (a better idea, IMO).

Working with OCaml from time to time, where the latter is the case, I have to say that it sometimes is a nuisance. Taking into account the rich set of numeric types the Standard Basis provides, I do not really see how they could be handled without some form of overloading. Removing generic equality would make the lack of overloading even more problematic (note that OCaml not only has polymorphic equality but also polymorphic ordering to escape this).

> First-class polymorphism raises problems for type inference.

A simple solution might be not to introduce arbitrary rank-2 types, but to take the same approach as for recursive types and tie first-class polymorphism to datatypes (as suggested by Mark Jones and others and implemented in Haskell systems). This way, any use of first-class polymorphism is marked by the occurrence of a corresponding constructor, which serves as an implicit type annotation. The only language constructs requiring modifications to their typing rules are constructor application and matches; type inference still works as expected.

> At the level of syntax, I would support fixing the case ambiguity,
> eliminating clausal function definitions entirely,

Wow, please, no! In my average ML code, clausal function definitions are probably the single most frequently used construct besides application! OTOH, I would strongly plead for a more accurate specification of their syntax...
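To make both points concrete, a small example (mine):

  (* Bread-and-butter clausal definitions: *)
  datatype tree = Leaf | Node of tree * tree

  fun depth Leaf          = 0
    | depth (Node (l, r)) = 1 + Int.max (depth l, depth r)

  (* ...and the case ambiguity mentioned above: the parentheses are
     essential, since without them the second clause would be swallowed
     by the inner case as a (malformed) match rule. *)
  fun strip (SOME n) = (case n of 0 => NONE | _ => SOME n)
    | strip NONE     = NONE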
> fixing the treatment of
> "and" and "rec" in val bindings,

...or perhaps removing all non-recursive uses of "and" altogether.

> I would like to add a clean treatment of hierarchical extensible tagging
> as a generalization of the current exn type.

That would be great. Please also consider enabling programmers to introduce their own extensible types. They could possibly subsume some of the expressiveness of objects.

> I would like to revamp datatype's to better harmonize them with modules
> and to avoid the annoying problem of repetition of datatype declarations in
> signatures and structures. I know how to do this, and have a preliminary
> proposal for it, but it is entirely incompatible with the current mechanism
> (but perhaps the old one could continue to be supported for a transition
> period). The rough idea is to follow the Harper-Stone semantics, by which a
> datatype is simply a compiler-implemented structure with a special form of
> signature that provides hooks into the pattern compiler. This proposal
> would also admit user-implemented datatypes (aka views, or abstract value
> constructors), but I am not certain that this is a good idea.

That sounds very interesting. Actually, IMO abstract views are one of the features that ML modules are seriously lacking (one thing I forgot on my little list :-). To me they seem absolutely essential for avoiding the fundamental abstraction vs. convenience conflicts in designing interfaces.

Finally, let me ask how the ML2000 effort relates to all of this. May we conclude that you consider it more or less dead by now?

Best regards,

  - Andreas

-- 
Andreas Rossberg, ros...@ps...

"Computer games don't affect kids; I mean if Pac Man affected us as
 kids, we would all be running around in darkened rooms, munching magic
 pills, and listening to repetitive electronic music."
 - Kristian Wilson, Nintendo Inc.