From: Andreas R. <ros...@ps...> - 2001-09-25 09:42:40
|
Long posting, this time not about language evolution but about interpretation of the current Definition :-) I would like to reraise an issue in this forum I already discussed with Stephen Weeks on comp.lang.ml recently. Since we did not entirely agree over the exact interpretation of the rules I hope that some more people, in particular the authors, can comment on it. At the end of section G.8 (Principal Environments) the Definition explains that the intent of the restrictions on free type variables at the toplevel (side-conditions in rules 87 and 89) is to avoid reporting free type variables to the user. However, I believe that this paragraph confuses two notions of type variable: type variables as semantic objects, as appearing in the formal rules of the Definition, and yet undetermined types during Hindley/Milner type inference, which are also commonly represented by type variables. However, these are variables on a completely different level. The former are part of the formal framework of the Definition, while the latter are an `implementation detail' that lies outside the scope of the Definition's formalism. Let us distinguish both by referring to the former as _semantic type variables_ and to the latter as _undetermined types_. Using this terminology, and judging from the remainder of the same paragraph in the Definition, the main purpose of the aforementioned restrictions obviously is to avoid reporting _undetermined types_ to the user. However, they fail to achieve that. In fact, it seems impossible to enforce such behaviour within the formal framework of the Definition, since it essentially would require formalising type inference (the current formalism has no notion of undetermined type). Consequently, the comment about the possibility of relaxing the restrictions by substituting arbitrary monotypes misses the point as well. In fact, I interpret the formal rules of the Definition to actually imply the exact opposite, namely that an implementation may _never_ reject a program that results in undetermined types at the toplevel, and is thus compelled to report them. The reason is explicitly given in the same section: "implementations should not reject programs for which successful elaboration is possible". Consider the following well-known example: val r = ref nil; r := [true]; Rule 2 has to non-deterministically choose some type (tau list) for the occurence of nil. The choice of tau is not determined by the declaration itself: it is not used, nor can it be generalised, due to the value restriction. However, bool is a perfectly valid choice for tau, and this choice will allow the entire program to elaborate. So according to the quote above, an implementation has to make exactly that choice. Now, if both declarations are entered seperately into an interactive toplevel the implementation obviously has to delay commitment to that choice until it has actually seen the second declaration. Consequently, it can do nothing else but reporting an undetermined type for the first declaration. The only effect the side conditions in rules 87 and 89 have on this is that the types committed to later may not contain free semantic type variables -- but there is no way this could happen anyway in a practical implementation, considering the way such variables are introduced in type inference (namely only during generalisation). There are two possibilities of dealing with this matter: (1) take the formal rules as they are and ignore the comment in the appendix, or (2) view the comment as an informal "further restriction" and fix its actual formulation to match the obvious intent. Since the comments in appendix G are not supposed to be a normative part of the Definition but merely explanatory, and moreover are somewhat inconsistent, strict reading of the Definition as is should give the formal rules priority and choose option 1, IMHO. Furthermore, this observation gives rise to the question whether the claim about the existence of principal environments in section 4.12 of the SML'90 Definition was valid in the first place. I believe it wasn't: a declaration like the one of r has no principal environment that would be expressible within the formalism of the Definition, despite allowing different choices of free imperative type variables. The reasoning that this relaxation was sufficient to regain principality is based on the same mix-up of semantic type variables and undetermined types as above. The relaxation does not solve the problem with expansive declarations, since semantic type variables are rather unrelated to it -- choosing a semantic type variable for an undetermined type is no more principal than choosing any particular monotype. I hope I was making sense and hope for some comments. If I am correct with these observations, should it be fixed in future versions? Best regards, - Andreas -- Andreas Rossberg, ros...@ps... "Computer games don't affect kids; I mean if Pac Man affected us as kids, we would all be running around in darkened rooms, munching magic pills, and listening to repetitive electronic music." - Kristian Wilson, Nintendo Inc. |