From: Vijay S. <vi...@sa...> - 2009-07-06 17:37:36
|
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type"> <title></title> </head> <body bgcolor="#ffffff" text="#000000"> Some more thoughts.<br> <ul> <li>A disadvantage of the separate primitives + classes proposal is that from a primitive P there is no way of getting a class which has the same data elements as P and the same methods as P. (<tt>Nullable[P]</tt> does not suffice.) Why would we want that? Because variables at an interface type can only be instantiated with objects (not with primitives) -- since they must have size 1, cannot have arbitrary size.<br> </li> <li>So how would the story outlined below be made to work if we permit the user to define only classes? Not difficult:<br> </li> <ul> <li>For every class C (with only <tt>val</tt> fields -- this restriction so we dont have to worry about aliasing issues), we let <tt>struct C</tt> be a new type. This type satisfies the following properties. <br> </li> <ul> <li>If <tt>C <: B</tt>, then <tt>struct C <: struct B</tt>.</li> <li>If <tt>C implements I</tt>, then <tt>struct C implements I</tt>.</li> </ul> <li>Let C be defined by <tt>Modifiers class C[X1,..., Xn](p1:T1,..., pn:Tn){c} extends B{d} implements I1, ..., Ik { Body}. </tt>Then <tt>struct C</tt> is defined as if by <tt>Modifiers struct C[X1,...,Xn](p1:T1,..., pn:Tn){c} extends struct B{d} implements I1,..., Ik { Body'}</tt>.<br> </li> <li>The fields of <tt>struct C</tt> are those of C. <br> </li> <li>The methods of <tt>struct C</tt> are those of <tt>C</tt> that are not marked <tt>ref</tt>. (That is, the definition of such methods are copied into the body of <tt>struct C</tt>. In these methods, the type of <tt>this</tt> is <tt>struct C</tt>.) The compiler throws an error if a method not marked <tt>ref </tt>uses this in an upcast, or assigns this to a variable whose type is an interface or whose type is a supertype of C. (These operations are not possible for <tt>struct C</tt>.)</li> <ul> <li><tt>ref</tt> is the dual of the idea of <tt>closed</tt>. The idea is that the programmer marks those methods as ref which make sense only for C, do not make sense for <tt>struct C</tt>.</li> <li>The compiler should give a warning if this is used with <tt>==</tt> and <tt>!=</tt>, since their semantics is different for <tt>struct C</tt>.<br> </li> <li>In the implementation, the methods of <tt>struct C</tt> are compiled into static methods where all dispatch choices are resolved statically, under the assumption that the type of <tt>this</tt> is exactly <tt>struct C</tt> (not <tt>struct D</tt>, for some subclass <tt>D</tt> of <tt>C</tt>.)<br> </li> <li>Note that if <tt>D extends C</tt>, then the static methods for <tt>struct D</tt> may not be related to the static methods for <tt>struct C</tt>.</li> <li>The main point is that the implementation does not keep state with a value of type <tt>struct C</tt> containing an itable or a vtable. Instead, code is expanded. <br> </li> </ul> <li>Note that if the class <tt>C</tt> is abstract, so is <tt>struct C</tt>. <tt>abstract</tt> methods on class <tt>C</tt> translate to abstract methods on <tt>struct C</tt>. <br> </li> <li>The types <tt>struct C</tt> are all exact. That is, a variable of type <tt>struct C</tt> can contain only instances of <tt>C</tt>, not instances of a class <tt>D</tt> extending <tt>C</tt>. Thus the size of a variable of type <tt>struct C</tt> is the sum of the sizes of the fields of <tt>C</tt>.</li> <ul> <li>This is forced by the defining characteristics of structs that equality means equality of all fields. Hence if v1 and v2 are two values at type struct P, they cannot be instances of classes extending P otherwise their state could be different at the fields introduced in these subclasses while being identical at P. Hence they are not ==, even though from the viewpoint of P they are ==.</li> </ul> <li>A value of type struct C cannot be assigned to a variable of interface type I, even if C implements I. (A value of type C can, of course.) A variable of interface type always has the size of a reference; it always contains a reference (an instance of a subclass of Object).</li> <li>So why introduce subtyping relations on the <tt>struct</tt> types? Because it generalizes the proposal in which <tt>struct</tt> (or <tt>inlined</tt>) could only be applied to final classes. It help in specifying bounds on generic types. e.g. <tt>def add[T](x:T, y:T){T <: struct Arithmetic} = x + y; </tt>Note the variables x and y are not being defined at type <tt>Arithmetic</tt>; they are being defined at some <tt>struct</tt> type T which extends <tt>Arithmetic</tt>. <br> </li> <ul> <li>Similarly, we can define a <tt>Comparable[X,Y] </tt>interface, have <tt>int </tt>implement <tt>Comparable[int,int]</tt>, and then instantiate <tt>SortedList[X] </tt>on <tt>int</tt> (<tt>SortedList[X] </tt>requires that <tt>X</tt> <tt>implements Comparable{X,X]</tt>).</li> </ul> <li>Given a value <tt>v </tt>of type <tt>struct C</tt>, one can obtain an object of type <tt>C</tt> using the operation <tt>new v</tt>. This new object will have space for a guid, as must all objects in X10, and its contents are initialized with the contents of v.</li> <li>Unrestricted type parameters can be instantiated by the types <tt>struct C</tt> (for some <tt>C</tt>). <br> </li> <li>If a type parameter <tt>X</tt> has a constraint <tt>X <: struct I</tt>, where <tt>I</tt> an interface, then <tt>X</tt> can be instantiated by a <tt>struct C</tt>, provided that <tt>C <: I</tt>. If <tt>X</tt> has a constraint <tt>X <: struct C</tt>, where <tt>C is </tt>a class, then <tt>X</tt> can be instantiated by a <tt>struct B</tt>, provided that <tt>B <: C</tt>. </li> </ul> </ul> It probably makes sense to permit the programmer to directly define <tt>struct C </tt>(in addition to having structs automatically defined from classes which have only final fields). In such a case it would also probably make sense to automatically define a class C based on the definition of the struct. <br> <br> Now, returning to unions. Instead of thinking of tagged unions as a kind of struct definition, we could in fact think of tagged unions at the level of fields. Then tagged unions would make sense when defining fields of structs or fields of classes. Here is how one would do it. In a class or struct C <br> <br> <tt>[var] <name> : union {<br> case i: Type0<br> case i+1: Type1<br> ...<br> case i+n-1: Typen-1<br> } [Initializer];</tt><br> <br> can be used to declare a tagged union <tt>var</tt> or <tt>val </tt>field. The base types of all the Typei must be different. <br> <br> This implicitly declares two fields, <name>.tag and <name>.value, with the first taking on the values i,...i+n-1, and the second taking a value of the corresponding type. Such values can be operated upon by a typecase as before. One should be able to use a typdef to define union types as well;<br> <br> <tt>typedef javaPrimitive = union { case 0: boolean case 1: int ....};</tt><br> <br> <tt>val x: javaPrimitive = 1; // the value of tag is inferred from the value of the initializer</tt><br> <br> If we were to go down this path, it would prolly make sense to have a subtyping relation on union types U1 <: U2 if U1 has all the summands of U2, and may have more. Then a value of type U1 can be assigned to a variable of type U2.<br> <br> The typecase statement and expression can be used to operate on such values:<br> <br> <tt>typecase (x) {<br> case i: ... // here x has type Type0 <br> case i+1: ...<br> case i+n-1: ...<br> }</tt><br> <br> <br> <br> Vijay Saraswat wrote: <blockquote cite="mid:4A5...@sa..." type="cite"> <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type"> You do mean that an instance of Variant will have space for the tag and one of its summands, not all of them.<br> <br> Yes, this could be made to work in a typesafe manner in X10. And this does provide a good resolution to the question of whether to special case int, uint etc.<br> <br> Here are the outlines -- this is a classes + primitives proposals, where primitives can be unions.<br> <ul> <li>Realize that with the restriction that only those types can be inlined which are final and whose fields are immutable, we are back in the domain of what I called "primitive" a month ago.</li> <li>The main point of a primitive definition (in shape it looks just like a class definition) is that equality is defined on it component-wise (and it is a final class, and it inherits from no class at all, and it has only final members). This is really the only reason for introducing the concept of primitives distinct from classes: Equality is defined differently.<br> </li> <li>We dont need Any, Object can be the top of the hierarchy. However, the type system is not unified. It has Object and all its subclasses and then it has an arbitrary number of primitives which are all "top-level". <br> </li> <li>Unrestricted type parameters can be instantiated by Object and its subclasses or by primitives.</li> <li>int, uint etc are all defined in the x10.lang package as primitives. (We will prolly define complex in x10.lang as well as a primitive).<br> </li> <li>Thus the types ValRail[int], Rail[byte] etc. make sense. ValRail[Object] cannot contain an int of course. <br> </li> <li>Primitives can be embedded into Object through the user-defined class Nullable[X].</li> <li>We flip the bit on "inlined". That is, instead of saying we have a class C and we can obtain from it inlined C (provided that C satisfies certain conditions), we say that the body of a primitive C defines what we have called "inlined C". To obtain a C, use Nullable[C]. Thus, all the primitives can be embedded into the type hierarchy below Object using the user-defined class Nullable[X] (given below). No magic!<br> </li> <li>In the body of a primitive definition C, the type of this is C (not Nullable[C]).</li> <li>instanceof, cast, ==, != are all defined on primitives ... however keep in mind that the type hierarchy of primitives is flat (no hierarchy). <br> </li> <li>The amount of space needed for a primitive is the space needed to represent all its fields. There is no need for any guid data. Instances of Nullable[X], like instances of any class should have guid data unless static analysis can establish that equality is never called on that object. (Tough!)</li> </ul> Now within this framework, we can introduce unions as a way of defining new primitives from existing primitives or classes.<br> <br> <tt>[Access modifier] union <name> (tag: int) {<br> data {<br> case 0: <type1><br> case 1: <type2><br> ...<br> case n: <typen><br> }<br> <constructors><br> <methods><br> }</tt><br> <br> All the types are required to be different. An instance is created with the syntax <name>(value), where value must have one of the n+1 types specified by the union <name>. <br> e.g. <br> <br> <tt>union javaPrimitive(tag: int(0..12)) {<br> data {<br> case 0: byte<br> case 1: short<br> case 2: int<br> case 3: long<br> case 4: boolean<br> ...<br> }<br> ...<br> }</tt><br> <br> We also introduce a new statement typecase.<br> <br> <tt>typecase(x) {<br> case 0: Stm0<br> ...<br> case n: Stmn<br> }</tt><br> <br> It is a compile-time error for x not to be of a union type U. The union must have n+1 elements. Within the body of Stmi, x.data has the type of the i'th summand of U. (Not clear whether x or x.data should have the type of the i'th summand.)<br> <br> Equality on unions is equality of the tag + equality of data (i.e. structural equality, hence unions are primitives). <br> <br> Nullable can be instantiated with any primitive, including a union. Hence Nullable[javaPrimitive] makes sense. An instance of Nullable[javaPrimitive] should require 8 bytes to store a javaPrimitive. <br> <br> More details later in the day.<br> <br> Best,<br> Vijay<br> <br> <tt>typedef Nullable[X](v:X) = Nullable[X]{self.v==v};<br> final class Nullable[X](v:X) {<br> def this(v:X): Nullable[X](v) {<br> property(v);<br> }<br> def me(v:X) = (v==this.v)? this : new Nullable[X](v);<br> }<br> </tt><br> Dave Cunningham wrote: <blockquote cite="mid:200...@do..." type="cite"> <pre wrap="">* Vijay Saraswat (<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:vi...@sa...">vi...@sa...</a>) wrote: </pre> <blockquote type="cite"> <pre wrap="">how do you write a tagged union in X10..? </pre> </blockquote> <pre wrap=""><!----> This is what I had in mind: (I called it Variant instead. For a laugh. Or maybe because it's a shorter name than TaggedUnion) If o was an Object instead of a Value we would probably do .equals() instead of == Also, HashMap seems not to have an apply(k,v) method, although it does have a put(k,v) method. import x10.io.Console; import x10.util.HashMap; public class Variant { protected val kind: Int; protected var c: Char; public def this (v:Char) { this.c = v; this.kind = 0; } public static operator (v:Char) : Variant = new Variant(v); protected var i: Int; public def this (v:Int) { this.i = v; this.kind = 1; } public static operator (v:Int) : Variant = new Variant(v); protected var f: Float; public def this (v:Float) { this.f = v; this.kind = 2; } public static operator (v:Float) : Variant = new Variant(v); protected var o: Value; public def this (v:Value) { this.o = v; this.kind = 3; } public static operator (v:Value) : Variant = new Variant(v); public def equals (that:Ref) { if (this==that) return true; if (!(that instanceof Variant)) return false; val other = that as Variant; if (kind != other.kind) return false; switch (kind) { case 0: return c == other.c; case 1: return i == other.i; case 2: return f == other.f; case 3: return o == other.o; } assert false : "Kind value unrecognised"; return false; } public def hashCode () { switch (kind) { case 0: return c.hashCode(); case 1: return i.hashCode(); case 2: return f.hashCode(); case 3: return o.hashCode(); } assert false : "Kind value unrecognised"; return 0; } public def toString () { switch (kind) { case 0: return c.toString(); case 1: return i.toString(); case 2: return f.toString(); case 3: return o.toString(); } assert false : "Kind value unrecognised"; return ""; } public static def main (args : Rail[String]) { val map = new HashMap[Variant,Variant](); map.put(3, 3.0f); map.put(3.0f, "str"); map.put("str", 3); Console.OUT.println("map(3) == "+map(3)); Console.OUT.println("map(3.0f) == "+map(3.0f)); Console.OUT.println("map(\"str\") == "+map("str")); } } // vim: shiftwidth=4:tabstop=4:expandtab </pre> </blockquote> <br> </blockquote> <br> </body> </html> |