From: Robert H. <Rob...@cs...> - 2001-09-18 17:27:38
It's good to see such an animated discussion about possible revision and extension of Standard ML. What quickly emerges from the discussion is the need to separate concerns. Before considering these, let me say that I don't like to see various efforts described as "valid" or "invalid", or for them to be divided into "camps". These descriptions are needlessly antagonistic to our common goals.

Broadly speaking, we can consider two distinct enterprises:

1. Revision and extension of Standard ML to better support current and readily foreseen needs. The emphasis here is on maintaining continuity and enhancing utility, rather than experimentation and exploration.

2. Drawing on experience with ML to explore new territory in language design and implementation. The emphasis here is on developing the "next great language" in the ML lineage, with diminished continuity and compatibility constraints.

Personally, I've put effort into both enterprises, which led to the 1997 revision of Standard ML and also to research on new languages and implementations.

What is clear to me is that it is more than time to renew emphasis on revision and extension of Standard ML, for several reasons. First, there is a crying need, as is clearly evidenced by this discussion. Second, many of the proposed revisions are entirely do-able within a reasonable time frame. Third, it is essential for Standard ML to remain a viable language.

I propose, therefore, that we focus our efforts on (1), leaving aside discussion of more ambitious projects (which I, among many, wish to undertake) for another day. I take it from the discussion that this will not be a controversial point.

Reviewing the discussion, I can see a few major topics for immediate consideration.

1. Standardization of a separate compilation mechanism. This would entail defining what are compilation units, including how imports and exports are described, and what is the semantics of linking.
At a high level this should not be exceptionally hard to work out, but the devil is in the details. For example, since interfaces for compilation units cannot be accurately expressed in the language, there is a fundamental distinction between "true" separate compilation (with specified interfaces for imports) and incremental recompilation (with inferred signatures obtained by scheduling the elaboration of units in dependency order).

2. Standardization of a substantial set of libraries. Quite obviously this goes far beyond the meager standard basis library that we currently share. There are lots of hard problems to be solved here, but given the substantial code base already in place, I'm sure we can formulate a reasonable plan. We would need to consider how libraries interact with separate compilation (the work on CM is highly relevant here), formulate some interface standards (we all probably have our own to contribute), and choose which libraries to include (the more the merrier, but some harmonization would be required).

3. Standardization of a foreign function interface. I have not thought very much about this issue, but I can certainly see difficulties with compatibility not only across implementations (the various compilers out there), but also across platforms (primarily, Unix vs Windows). Fundamental issues such as the semantics of "int" will arise, since implementations differ, even on the same hardware and software platform, and since external code will impose its own requirements. The need for an FFI is largely pragmatic --- we cannot re-invent the world ourselves in our own way. However, as someone pointed out, buying into foreign code will certainly limit portability across platforms and quite possibly across implementations.

4. Extensions and modifications to the language itself.
These include relatively trivial things like deprecating obsolete mechanisms (such as abstype), re-considering the semantics of structure sharing (which started this discussion), and adding support for new features (e.g., updateable records, lazy evaluation, vector expressions, richer patterns, hierarchical extensible sums). I think there are strong arguments for all of these changes. We might also consider ways to improve the syntax while providing a path for porting old code.

I propose that we confine ourselves to these four categories for immediate discussion. (If I've overlooked something, I hope we can quickly agree on what that is and whether to consider it now.) It might make sense to form sub-groups who take charge of specific topics, and report back to the full group with their proposal. Once a solid, but informal, proposal is in place, we can evaluate it by examining its semantics and its implications for implementation. Presumably this will lead to revision of the proposal, but it should also lead rather quickly to a solid revision or extension of the language.

My experience has been that even very modest revisions are very hard to make. One reason is that we all have a very substantial commitment to the language (in the abstract), its semantics, and its implementation. It's a tribute to the language that we all have such passionate views about it, and have contributed so much of our time and energy to it. That commitment can also be an obstacle to consensus.

Perhaps it is worthwhile to state a few principles that I hope can guide us.

1. Standard ML exists independently of its implementations. The language should continue to have a formal definition to which implementations agree to conform.

2. Revisions must be guided as much by the experience of users and implementors as by the demands of a clean formal definition. IMO the 1997 revision was hobbled by an excessive emphasis on the needs of The Definition without due consideration of implementation or application.

3. It is important to achieve a rough consensus, but complete agreement on all issues may be impossible to achieve. We will need to have a mechanism for reaching a decision in the face of disagreement.

Let the discussion begin!

Bob Harper

PS: I, among many, have ideas about new language designs that would take us beyond the charter outlined above. It might make sense, if there is interest, to fork off a separate discussion of these issues. For example, I would consider the discussion about automatic generation of equality functions to fall within this category, as would the proposal I mentioned for re-working datatypes. (In fact these fit together nicely.)