From: Derek R. D. <dr...@cs...> - 2001-09-25 22:31:23
Andreas raises an interesting question. Although the point appears of minor import at best, it is not entirely trivial, since SML/NJ and Moscow ML seem to behave differently with regard to it. After studying the Definition carefully, here's my take. (If you don't care about this issue, you may still want to skip to point 3 below, where I point out a related and somewhat more important difference between NJ and Moscow.)

1) I think you're right that the informal discussion in G.8 is somewhat wrong: since you can always replace the free type variables introduced by a "topdec" with some arbitrary monotype like "int", it is wrong to refuse to elaborate a "topdec" on the grounds that you don't want to make a choice. In fact, both NJ and Moscow elaborate

  val x = ref nil;

albeit with warnings.

2) I think the formal situation is more complex and subtle than you give it credit for. The complexity stems from whether the place where a val binding such as "val x = ref nil" is defined and the place where the free type variable in the type of x gets resolved are in the same "topdec". Consider the following two programs:

Program 1
---------
val x = ref nil
signature S = sig end
val y = (x := [true]);

Program 2
---------
val x = ref nil;
signature S = sig end;
val y = (x := [true]);

You can pretty much ignore the signature binding; I put it there only because I find the syntax of "topdec"s (see page 14) a bit ambiguous. The signature binding forces the two val bindings in Program 2 into *different* topdecs, whereas the two val bindings in Program 1 are in the *same* topdec because no semicolons separate them.

At first glance, these two programs might appear to elaborate the same way, but I claim that according to the Definition it is not clear that they need to. Here's why. In Program 1, the two val's are in the same topdec. If x can correctly be given a type so that the whole topdec elaborates, then you have to give it that type.
Therefore, in this case, I agree with Andreas that we should be able to "guess" the type of x as being "bool list" to make the program elaborate. This seems to me incontrovertible.

In Program 2, however, the two val's are in different topdecs. A careful reading of Section 8, on the semantics of Programs, seems to indicate that there are two possible outcomes for the elaboration and execution of Program 2. Under Rule 188, we could guess the type of x as being "bool list", as in Program 1, and thus elaborate the second val binding fine. However, it is also possible to guess the type of x as being "int list". In that case, when we get around to elaborating the second val binding, it will fail to elaborate. This is acceptable because the failure causes Rule 187 to apply, which aborts execution of the second val binding. The ambiguity is due to the negative premise of Rule 187, which is matched when a topdec does *not* elaborate. It is rather troubling that such an ambiguity exists, so either I'm wrong about Rule 187 applying, or perhaps the Definers could comment on this?

On both Programs 1 and 2, SML/NJ (110.9.1) fails to elaborate, and Moscow ML (2.00) succeeds. Given that Program 1 must succeed according to the Definition and Program 2 may, Moscow ML can be said to be implementing the Definition on this point, while NJ cannot. Perhaps the NJ implementers could comment on why they chose to diverge from the Definition here?

3) A related issue I observed while playing around is that NJ and Moscow have very different behavior with respect to failed elaborations in general, and as usual Moscow ML is the one in line with the Definition. The specific question is: when you hit a topdec that does not elaborate, do you stop, or do you keep elaborating and executing the rest of the program? The Definition (Rule 187, as before) is quite clear on this point: you keep going.
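To make the Rule 187/188 ambiguity in point 2 concrete, here is a sketch of the two permissible outcomes for Program 2. The explicit type annotations are my own, standing in for the monotype an implementation might internally guess; they are not part of the programs above:

  (* Outcome A: the implementation guesses bool list, as Rule 188
     permits.  All three topdecs elaborate. *)
  val x : bool list ref = ref nil;
  signature S = sig end;
  val y = (x := [true]);    (* fine: x has type bool list ref *)

  (* Outcome B: the implementation guesses int list instead.  The last
     topdec then fails to elaborate, matching the negative premise of
     Rule 187, so execution of that topdec is simply aborted. *)
  val x : int list ref = ref nil;
  signature S = sig end;
  val y = (x := [true]);    (* type error: bool list vs. int list *)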
Thus, for instance,

  val x = 3;
  val y = x + true;
  val z = 4;

causes NJ to give an error at the second val binding and stop, not elaborating the third one. Moscow ML, however, gives an error at the second val binding and continues, elaborating "val z = 4" and adding z to the environment.

The question is: which is really better? Although I appreciate that Moscow ML implements the Definition faithfully, I sort of think NJ is right: it makes more sense for the compiler to stop evaluating the program once it hits a type error. Otherwise, you could end up with a slew of type errors if the rest of the program depends on a variable bound by a buggy val binding.

Thoughts?

Derek

Andreas Rossberg wrote:
>
> Long posting, this time not about language evolution but about
> interpretation of the current Definition :-)
>
> I would like to reraise an issue in this forum I already discussed with
> Stephen Weeks on comp.lang.ml recently. Since we did not entirely agree
> over the exact interpretation of the rules I hope that some more people,
> in particular the authors, can comment on it.
>
> At the end of section G.8 (Principal Environments) the Definition
> explains that the intent of the restrictions on free type variables at
> the toplevel (side-conditions in rules 87 and 89) is to avoid reporting
> free type variables to the user. However, I believe that this paragraph
> confuses two notions of type variable: type variables as semantic
> objects, as appearing in the formal rules of the Definition, and yet
> undetermined types during Hindley/Milner type inference, which are also
> commonly represented by type variables. However, these are variables on
> a completely different level. The former are part of the formal
> framework of the Definition, while the latter are an `implementation
> detail' that lies outside the scope of the Definition's formalism. Let
> us distinguish both by referring to the former as _semantic type
> variables_ and to the latter as _undetermined types_.
>
> Using this terminology, and judging from the remainder of the same
> paragraph in the Definition, the main purpose of the aforementioned
> restrictions obviously is to avoid reporting _undetermined types_ to the
> user. However, they fail to achieve that. In fact, it seems impossible
> to enforce such behaviour within the formal framework of the Definition,
> since it essentially would require formalising type inference (the
> current formalism has no notion of undetermined type). Consequently, the
> comment about the possibility of relaxing the restrictions by
> substituting arbitrary monotypes misses the point as well.
>
> In fact, I interpret the formal rules of the Definition to actually
> imply the exact opposite, namely that an implementation may _never_
> reject a program that results in undetermined types at the toplevel, and
> is thus compelled to report them. The reason is explicitly given in the
> same section: "implementations should not reject programs for which
> successful elaboration is possible". Consider the following well-known
> example:
>
> val r = ref nil;
> r := [true];
>
> Rule 2 has to non-deterministically choose some type (tau list) for the
> occurrence of nil. The choice of tau is not determined by the declaration
> itself: it is not used, nor can it be generalised, due to the value
> restriction. However, bool is a perfectly valid choice for tau, and this
> choice will allow the entire program to elaborate. So according to the
> quote above, an implementation has to make exactly that choice. Now, if
> both declarations are entered separately into an interactive toplevel
> the implementation obviously has to delay commitment to that choice
> until it has actually seen the second declaration. Consequently, it can
> do nothing else but report an undetermined type for the first
> declaration.
> The only effect the side conditions in rules 87 and 89 have
> on this is that the types committed to later may not contain free
> semantic type variables -- but there is no way this could happen anyway
> in a practical implementation, considering the way such variables are
> introduced in type inference (namely only during generalisation).
>
> There are two possibilities for dealing with this matter: (1) take the
> formal rules as they are and ignore the comment in the appendix, or (2)
> view the comment as an informal "further restriction" and fix its actual
> formulation to match the obvious intent. Since the comments in appendix
> G are not supposed to be a normative part of the Definition but merely
> explanatory, and moreover are somewhat inconsistent, a strict reading of
> the Definition as is should give the formal rules priority and choose
> option 1, IMHO.
>
> Furthermore, this observation gives rise to the question whether the
> claim about the existence of principal environments in section 4.12 of
> the SML'90 Definition was valid in the first place. I believe it wasn't:
> a declaration like the one of r has no principal environment that would
> be expressible within the formalism of the Definition, despite allowing
> different choices of free imperative type variables. The reasoning that
> this relaxation was sufficient to regain principality is based on the
> same mix-up of semantic type variables and undetermined types as above.
> The relaxation does not solve the problem with expansive declarations,
> since semantic type variables are rather unrelated to it -- choosing a
> semantic type variable for an undetermined type is no more principal
> than choosing any particular monotype.
>
> I hope I was making sense and hope for some comments. If I am correct
> with these observations, should it be fixed in future versions?
>
> Best regards,
>
> - Andreas
>
> --
> Andreas Rossberg, ros...@ps...
> > "Computer games don't affect kids; I mean if Pac Man affected us > as kids, we would all be running around in darkened rooms, munching > magic pills, and listening to repetitive electronic music." > - Kristian Wilson, Nintendo Inc. > > _______________________________________________ > Sml-implementers mailing list > Sml...@li... > https://lists.sourceforge.net/lists/listinfo/sml-implementers |