From: Vijay S. <vi...@sa...> - 2009-07-07 05:37:35
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> </head> <body bgcolor="#ffffff" text="#000000"> You can now start from <a class="moz-txt-link-freetext" href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhDesign">http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhDesign</a> on the internal wiki.<br> <br> Nate, here are most of the relevant links.<br> <br> Best,<br> Vijay<br> <br> <ul> <li> Initially created: 14 June 2009. </li> <li> Last updated: vj 6 July 2009. </li> <li> Base link: <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhDesign" class="twikiLink">XtenTwoOhDesign</a> </li> </ul> <p></p> <h1><a name="Design_of_X10_2_0_Version_1_5"></a><a name="Design_of_X10_2_0_Version_1_5_"></a> Design of X10 2.0 (Version 1.5) </h1> <p> This page contains links to a set of wiki pages documenting the design of X10 2.0. Ultimately these pages will be incorporated into a standalone design document. Text based on these pages will be included in the Language Reference. 
</p> <p></p> <p> </p> <ul> <li> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhOverview" class="twikiLink">XtenTwoOhOverview</a> </li> <li> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhObjectModel" class="twikiLink">XtenTwoOhObjectModel</a> </li> <li> <strike>XtenTwoOhPrimitives</strike> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhSimpleStructs" class="twikiLink">XtenTwoOhSimpleStructs</a> </li> <li> <strike>XtenTwoOhTypeSystem</strike> </li> <li> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhStatics" class="twikiLink">XtenTwoOhStatics</a> </li> <li> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhXtenLangObject" class="twikiLink">XtenTwoOhXtenLangObject</a> </li> <li> <strike>XtenTwoOhXtenLangNullable</strike> </li> <li> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhXtenLangRail" class="twikiLink">XtenTwoOhXtenLangRail</a> </li> <li> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhXtenLangValRail" class="twikiLink">XtenTwoOhXtenLangValRail</a> </li> <li> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhXtenLangPoint" class="twikiLink">XtenTwoOhXtenLangPoint</a> </li> <li> <span class="twikiNewLink">XtenTwoOhXtenLangRegion</span> </li> <li> <span class="twikiNewLink">XtenTwoOhXtenLangDist</span> </li> <li> <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhNativeDesign" class="twikiLink">XtenTwoOhNativeDesign</a></li> </ul> <br> <ul> <li> Initially created: 14 June 2009 </li> <li> Last updated: vj 6 July 2009. 
</li> <li> Base link: <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhDesign" class="twikiLink">XtenTwoOhDesign</a> </li> </ul> <p></p> <h1><a name="The_X10_2_0_Object_Model_Overvie"></a> The X10 2.0 Object Model Overview (Version 1.5) </h1> <p> <strong>X10 is a class-based, object-oriented, generic, global programming language, supporting user-definable primitives, closures and a dependent type system.</strong> </p> <p></p> <h2><a name="Class_based"></a> Class-based </h2> <p> As in Java-like languages, an X10 program consists primarily of a collection of <em>classes</em> and <em>interfaces</em>. Classes are organized in a single-inheritance hierarchy, with root <code>x10.lang.Object</code>. </p> <p>A class specifies a collection of <em>properties</em>, (mutable and immutable) instance and static <em>fields</em>, <em>constructors</em>, and (overloaded) <em>methods</em>. A property is a <code>final</code> instance field that can be used in constructing types based on constraints. Properties are required </p> <p>A class may implement multiple <em>interfaces</em>. An interface specifies a collection of methods and properties that must be implemented by a class. </p> <p></p> <h2><a name="Generic"></a> Generic </h2> <p> X10 classes and methods may take <em>generic type parameters</em>. The parameters may be constrained by clauses which specify bounds on parameters (e.g. <code>X <: Comparable[X,X]</code>), and hence specify properties and methods available at that type. </p> <p>If <code>T1,..., Tn</code> is a sequence of types satisfying the constraint <code>c</code> associated with a class <code>C[X1,..., Xn]</code> then <code>C[T1,..., Tn]</code> is a legal instantiated class. The code for this class is obtained from the code for <code>C[X1,..., Xn]</code> by replacing each <code>Xi</code> with <code>Ti</code> (heterogeneous translation). </p> <ul> <li> This is how the <em>semantics</em> of generic classes is defined. 
The implementation is free to choose a different strategy as long as it is faithful to the semantics. For instance, it may choose to make a heterogeneous translation only if an actual type parameter is a <code>struct</code> type. </li> </ul> <p></p> <h2><a name="Global_state"></a> Global state </h2> See <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhObjectModel" class="twikiLink">XtenTwoOhObjectModel</a> for details. <p>Place <code>0</code> contains the static state for all classes. </p> <p>Instances of a class are created by invoking a constructor using <code>new</code>. Variables may store <em>references</em> to objects. Each object has a globally unique identity; two object references are equal (<code>==</code>) only if they point to the same object. </p> <p>The instance state of a class is divided into <em>local</em> and <em>global</em> state. All <code>var</code> instance fields of a class are local; some <code>val</code> instance fields may be marked <code>local</code> by the programmer. The <code>local</code> state of an object is accessible only in the place in which the object is created; this location is called the <em>home</em> of the object and is available through the <code>home</code> property of the object. The global state of an object can be accessed from any place. </p> <p>Methods are also classified as <code>local</code> or <code>global</code>. The compiler ensures that an object's <code>local</code> methods can only be called from the <code>home</code> of the object. </p> <p></p> <h2><a name="User_defined_primitives"></a> User-defined primitives </h2> <p> See <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhSimpleStructs" class="twikiLink">XtenTwoOhSimpleStructs</a> for details. </p> <p>An instance of a class <code>C</code> (an <em>object</em>) is represented in X10 as a contiguously allocated chunk of words in the heap, containing the fields of the object as well as one or more words used in method lookup. 
Variables with base type <code>C</code> (or a supertype of <code>C</code>) are implemented as cells with enough memory to hold a <em>reference</em> to the object. The size of a reference (32 bits or 64 bits) depends on the underlying operating system. </p> <p>For many high-performance programming idioms, the overhead of one extra level of indirection represented by an object is not acceptable. For instance, a programmer may wish to define a type <code>complex</code> (consisting of two <code>double</code> fields) and require that instances of this type be represented precisely as these two fields. A variable or field of type <code>complex</code> should, therefore, contain enough space to store two <code>doubles</code>. An array of <code>complex</code> of size <code>N</code> should store <code>2*N</code> doubles. Parameters of type <code>complex</code> should be passed inline to a method as two doubles. If a method's return type is <code>complex</code> the method should return two doubles on the stack. </p> <p>X10 supports user-defined primitives (called <em>structs</em>). </p> <ul> <li> For C/C++ programmers, an X10 struct corresponds to a C/C++ struct all of whose fields are <code>const</code>. Such structs do not have any aliasing issues. </li> </ul> <p>For every class <code>C</code> all of whose fields are <code>val</code> fields, X10 automatically defines a type <code>struct C</code> with the same fields as <code>C</code>. </p> <ul> <li> Thus <code>struct Object</code> is the root of the <code>struct</code> hierarchy. </li> </ul> <p>Unlike objects, structs do not have global identity. Instead, two structs are equal <code>==</code> if and only if their corresponding fields are equal <code>==</code> (this is the central property of structs). This implies that a variable of type <code>struct C</code> can contain only values of type <code>struct C</code> -- and not values of type <code>struct D</code> (for some subclass <code>D</code> of <code>C</code>). 
We say that <em>struct types are exact types</em> to mean that a variable declared at a struct type must take on precisely the values of that type (and not a subtype). </p> <p> </p> <ul> <li> A value <code>v</code> of type <code>struct D</code> can be converted to a value of type <code>struct C</code> using the <code>v to struct C</code> conversion expression, if <code>D</code> is a subclass of <code>C</code>. Such a conversion retains only those fields of <code>D</code> which are fields of <code>C</code>. </li> </ul> <p>The size of a variable of type <code>struct C</code> is the size of the fields defined at <code>C</code>. </p> <p>X10 permits an expression of type <code>C</code> to be assigned to a variable of type <code>struct C</code> (this results in copying the fields of the object on the RHS to the fields of the variable on the LHS). The expression <code>new v</code> can be used to get a value of type <code>C</code> from a value <code>v</code> of type <code>struct C</code>, initialized with the state of <code>v</code>. </p> <p> </p> <ul> <li> To better support generic programming, <code>struct v</code> is permitted for <code>v</code> of type <code>struct C</code> as well. It simply copies the value of <code>v</code>. </li> </ul> <p> </p> <ul> <li> Similarly, <code>new v</code> is permitted for <code>v</code> of type <code>C</code> as well. It returns a <em>clone</em> initialized with the state of <code>v</code>, and guaranteed to be <code>!=</code> from <code>v</code> only if <code>v</code> has mutable state. </li> </ul> <p>The methods available on <code>struct C</code> are all those methods not marked <code>ref</code> on <code>C</code>. X10 requires that a method <code>m</code> for class <code>C</code> must be marked <code>ref</code> if it uses <code>this</code> in any context that distinguishes between the types <code>C</code> and <code>struct C</code>. 
Specifically <code>this</code> may not be an argument of cast, <code>==</code>, <code>!=</code> or <code>instanceof</code> operators, and may not be passed into a method at type <code>C</code> (or a supertype) or returned from a method with return type <code>C</code> (or a supertype) or assigned to a variable of type <code>C</code> (or a supertype). </p> <p> </p> <ul> <li> In all these cases the programmer may use <code>new this</code> instead. </li> <li> A method marked <code>ref</code> can only be overridden by a method marked <code>ref</code>. </li> </ul> <p>In methods not marked <code>ref</code>, <code>this</code> ambiguously has the type <code>struct C</code> or <code>C</code>. </p> <p>The subtype relationship on structs is obtained from the subtype relationship of the underlying classes. If <code>C</code> is a subclass of <code>B</code>, then <code>struct C</code> is a subtype of <code>struct B</code>. If <code>C</code> implements the interface <code>I</code>, then <code>struct C</code> implements the interface <code>struct I</code>. </p> <ul> <li> Variables cannot be declared at the type <code>struct I</code>, for <code>I</code> an interface. Such types are of use in placing bounds on type variables. </li> </ul> <p>Since <code>struct</code> types are exact types, the primary use of the subtype relationship on <code>structs</code> is in imposing constraints on type parameters (e.g. <code>X <: struct C</code>). The constraint <code>X <: Object</code> can be used to ensure that <code>X</code> can only be instantiated with classes. Similarly, <code>X <: struct Object</code> ensures that <code>X</code> can only be instantiated with structs. </p> <p></p> <h2><a name="Closures"></a> Closures </h2> <p> See <span class="twikiNewLink">XtenTwoOhClosures</span> for details. 
</p> <p>X10 supports closures and closure literals. For any expression <code>e</code> of type <code>T</code> with free variables <code>x1,..., xn</code> of type <code>T1,..., Tn</code> respectively, the expression <code>(x1:T1,..., xn:Tn):T =>e</code> is a value of type <code>(T1,..., Tn)=>T</code>. If <code>f</code> is a value of such a type then <code>f(e1,..., en)</code> is of type <code>T</code> provided that each <code>ei</code> is of type <code>Ti</code>. </p> <p>The qualifier <code>local</code> may be used on type <code>(T1,..., Tn)=>T</code> to indicate that values of this type must be applied only in the place in which they were created. </p> <p></p> <p></p> <h2><a name="Dependent_type_system"></a> Dependent type system </h2> <p> For <code>C</code> a class and <code>c</code> a <em>constraint expression</em>, <code>C{c}</code> is a type. <code>c</code> may reference <code>val</code> variables currently in scope, and the special variable <code>self</code> which may be used to access the properties of <code>C</code>. An object <code>o</code> is of type <code>C{c}</code> if the constraint expression evaluates to true when executed in an environment in which the <code>val</code> variables in scope are assigned any type-correct values, and <code>self</code> is assigned <code>o</code>. </p> <p> </p> <ul> <li> Thus, the type <code>Region{self.rank==2}</code> is satisfied by all <code>Region</code> objects whose <code>rank</code> property contains the value <code>2</code>. </li> </ul> <p>We define in a similar fashion what it means for a struct <code>o</code> to be of type <code>struct C{c}</code>. </p> <p></p> <p></p> <h3><a name="Type_definitions"></a> Type definitions </h3> <p> X10 also permits the programmer to specify <em>type definitions</em>. A type definition specifies a name (e.g. <code>n</code>), an optional list of type parameters (e.g. <code>X1,...,Xn</code>) and an optional list of value parameters (e.g. 
<code>y1,..., yk</code>), and a type <code>T</code> which can refer to these parameters. An invocation <code>n[T1,..., Tn](e1,..., ek)</code> then stands for the type <code>T</code> in which the actuals have been substituted for the formals. Thus, for instance, </p> <p></p> <pre>typedef Pair[X,Y](x:X,y:Y)= Pair[X,Y]{self.x==x, self.y==y} </pre> permits us to use <code>Pair[int,int](0,1)</code> as a type whose only member is the <code>Pair</code> whose first component equals <code>0</code> and whose second component equals <code>1</code>. <p>Type definitions may be top-level members of packages. </p> <p>A <code>public typedef</code> with name <code>n</code> must be specified in the file <code>n.x10</code> to facilitate separate compilation. </p> <p>Type definitions are expanded at compilation time. Recursive type definitions are not permitted; the compiler may not terminate when processing such type definitions. </p> <p></p> <p></p> <h2><a name="Class_hierarchy"></a> Class hierarchy </h2> <p> The package <code>x10.lang</code> defines the following classes: </p> <p></p> <pre>Object
Rail[X]
// classes with immutable state
ValRail[X]
Point
Region
Dist
Place
String
// immutable state
Boolean // typedef boolean = struct Boolean;
Byte // typedef byte = struct Byte;
Short // typedef short = struct Short;
Integer // typedef int = struct Integer;
Long // typedef long = struct Long;
Float // typedef float = struct Float;
Double // typedef double = struct Double;
UByte // typedef ubyte = struct UByte;
UShort // typedef ushort = struct UShort;
UInteger // typedef uint = struct UInteger;
ULong // typedef ulong = struct ULong;
Char // typedef char = struct Char;
Exception
NullPointerException
BadPlaceException
ArrayIndexOutOfBoundsException
ClassCastException
ClockUseException
IllegalOperationException
RankMismatchException
Error
OutOfMemoryError
Runtime
System
</pre> <p> </p> <h1><a name="TODO"></a> TODO </h1> <p> </p> <h1><a name="FAQ"></a> FAQ </h1> <p> </p> <h1><a 
name="Comments"></a> Comments </h1> <p> </p> <h1><a name="History"></a> History </h1> <p> </p> <p></p> -- <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/VijaySaraswat" class="twikiLink">VijaySaraswat</a> - 06 Jul 2009<br> <br> <ul> <li> Initially created: 14 June 2009 </li> <li> Last updated: vj 6 July 2009. </li> <li> Base link: <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhDesign" class="twikiLink">XtenTwoOhDesign</a> </li> </ul> <p></p> <h1><a name="The_X10_2_0_Object_Model_Version"></a> The X10 2.0 Object Model (Version 1.5) </h1> <p> X10 2.0 has a rather simple <em>distributed object model</em>. </p> <p>The state of an object is partitioned into <em>global</em> state (a programmer defined subset of <code>val</code> fields) and <em>local</em> state. </p> <p> </p> <ul> <li> Field definitions are marked with the qualifier <code>local</code> if they are intended to be included in the local state. If the <code>local</code> qualifier is omitted, the field is considered global. Properties may not be marked <code>local</code>. <code>var</code> fields are implicitly marked <code>local</code>. <code>local</code> fields may be overridden only by <code>local</code> fields. </li> </ul> <p>Similarly, the methods of an object may be qualified as <code>local</code>; if they are not <code>local</code> they are considered global. Global methods may access only the global fields of an object. They must be written in such a way that they behave as intended when invoked from any place. </p> <ul> <li> A global method can always access <code>local</code> state or invoke a <code>local</code> method through an <code>at</code> statement, e.g. <code>at (this.loc()) { m() }</code>. </li> </ul> <p>Consider the execution of an <code>at (P) S</code> statement at a place <code>Q</code> different from <code>P</code>. Suppose <code>x</code> is an in-scope final local variable and contains a reference to an object <code>o</code> created at <code>Q</code>. 
Then within <code>S</code>, <code>x</code> is said to be a <em>remote reference</em> to <code>o</code> (references to <code>o</code> from place <code>Q</code> are said to be <em>local references</em>). X10 permits <code>global</code> fields to be read and <code>global</code> methods to be invoked through a remote reference. </p> <ul> <li> Remote references to an object <code>o</code> are implemented by serializing the global state of <code>o</code> across the network, together with information about the source place and the local reference to <code>o</code>. The data is deserialized at the receiver to create an implementation-level entity that is the remote reference. There is no requirement that the implementation intern such entities; however the implementation must correctly implement equality (see below). </li> </ul> <p>Like local references, remote references are first-class entities: they may be passed as arguments to methods, returned from methods, and stored in fields of objects. </p> <p>Remote references may also be compared for equality (<code>==</code>). Two remote references are equal if they are references to the same object. Equality is guaranteed to be a constant-time operation and to involve no communication. </p> <p>When a remote reference to an object <code>o</code> located at place <code>P</code> is transmitted to <code>P</code> it automatically becomes a local reference to <code>o</code>. Therefore the situation in which a local reference can be compared to a remote reference simply cannot arise. </p> <p>The class <code>x10.lang.Object</code> defines the global method <code>loc():Place</code>. When invoked on (a reference to) a rooted object <code>x</code> created at place <code>P</code>, <code>x.loc()</code> returns <code>P</code>. If <code>x</code> is a reference to a mobile object, <code>x.loc()</code> returns <code>here</code>. 
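</p> <p>The rules above for remote references can be modelled in a few lines. What follows is only an illustrative sketch, written in Python rather than X10: the names <code>Ref</code>, <code>transmit</code>, <code>addr</code> and <code>here</code> are hypothetical and not part of any X10 API, and the sketch assumes that an object's identity is the pair (home place, address at the home place), as the serialization description above suggests. </p>
<pre>
```python
class Ref:
    """Hypothetical model of an X10 object reference held at one place."""

    def __init__(self, home, addr, here):
        self.home = home  # place where the object was created
        self.addr = addr  # address of the object at its home place
        self.here = here  # place that currently holds this reference

    def loc(self):
        # Mirrors loc(): the place where the object was created.
        return self.home

    def is_local(self):
        # A reference is local exactly when it is held at the home place.
        return self.here == self.home

    def __eq__(self, other):
        # Constant time and communication free: compare identities only.
        if not isinstance(other, Ref):
            return NotImplemented
        return (self.home, self.addr) == (other.home, other.addr)

    def __hash__(self):
        return hash((self.home, self.addr))


def transmit(ref, dest):
    """Model of carrying a reference into the body of at (dest)."""
    return Ref(ref.home, ref.addr, dest)


x = Ref(home=0, addr=0x10, here=0)  # object o created at place 0
r = transmit(x, 1)                  # inside at (1): remote reference to o
back = transmit(r, 0)               # sent home: a local reference again
print(r.is_local(), back.is_local(), r == transmit(x, 1), back.loc())
# prints: False True True 0
```
</pre>
<p>In this model equality looks only at the home place and the home address, so it needs no communication, and a reference sent back to its home place reports itself as local again.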
</p> <p>The X10 compiler ensures that <code>local</code> methods on <code>o</code> can only be invoked in a place where <code>here== o.loc()</code>, i.e. the place where <code>o</code> was created. (If <code>here== o.loc()</code>, we say <code>o</code> is <em>local</em>.) The programmer may always invoke a <code>local</code> method <code>m</code> on such an object through the code <code>at (o.loc()) { o.m() }</code>. </p> <p></p> <h3><a name="Local_execution"></a> Local execution </h3> <p> The semantics of <code>atomic</code> and <code>when</code> constructs requires that their bodies do not execute any <code>at</code> operations, implicitly or explicitly. Hence the compiler must establish that if a <code>local</code> method <code>m</code> is being invoked on a reference <code>o</code> in the body of such a construct, <code>o</code> is a local reference. </p> <p>To support compile-time analysis we introduce the type qualifier <code>local</code>. For any type <code>T</code>, a variable <code>x</code> of type <code>local T</code> can only contain local references, i.e. references to objects created in the "current" place. If <code>x</code> is a local variable, or a method parameter then the current place is <code>here</code>, the place at which the current activity is executing. If <code>x</code> is a field of an object <code>o</code>, it must be a <code>local</code> field, and must contain references to objects created at the same place as <code>o</code>. If the return type of a method is <code>local T</code> then it must return an object of type <code>T</code> created <code>here</code>. </p> <p> </p> <ul> <li> Thus it is legal (sound) for an activity executing the code of a <code>local</code> method for an object <code>o</code> to read the value of a <code>local</code> field of <code>o</code> of type <code>local T</code> into a local variable of type <code>local T</code>. 
</li> </ul> <p></p> <h3><a name="Object_hierarchy"></a> Object hierarchy </h3> <p> </p> <pre>class Object {
  def toString():String;
  def hashCode():Int;
  def loc():Place;
}
</pre> <p> We no longer need to have a <code>Value</code> class or a <code>Ref</code> class. </p> <p>The classes <code>String, Point, ValRail, Region, Dist</code> will not have any local state. The class <code>Rail</code> will have local state. </p> <h2><a name="Examples"></a> Examples </h2> <strong>Example</strong> Assume the following class declarations. <p></p> <pre>class C { ...}
class D { var x1:C=null; }
</pre> <p> Now consider the code: </p> <pre>val x = new C(..); // C object o created, reference stored in x.
at (P) { // In the body x contains a remote reference to o
  val f = new D();
  f.x1 = x; // remote reference stored in f.x1
  Console.OUT.println(f.x1 == x); // must print true
  Console.OUT.println(x == x); // must print true
  at (Q) { // x continues to be a remote reference to o.
    at (P) {
      Console.OUT.println(f.x1 == x); // must print true
      Console.OUT.println(x == x); // must print true
    }
  }
}
</pre> <p> </p> <p><strong>Example</strong> </p> <pre>val x = new C(..); // C object o created, reference stored in x.
// type of x is local C{c} if the return type of the constructor is C{c}.
at (P) {
  val x1 = x; // type of x is C{c} because of the place shift introduced by at(P)
  at (x.loc()) { // x is now bound to o through a local reference. So is x1.
    Console.OUT.println(x1==x); // Must print true.
    // local methods can be invoked on x or x1 and will execute locally on o
    // type of both x and x1 is local C{c}.
 } } </pre> <p> </p> <h1><a name="Programming_Methodology"></a> Programming Methodology </h1> <p> A programmer wishing to ensure that a <code>val</code> field is not serialized when the containing object is serialized (e.g. because it contains a large cache which makes sense only in the current place) must mark that field as <code>local</code>. </p> <p></p> <h1><a name="Todo"></a> Todo </h1> <ul> <li> Figure out an explicit <code>global System.copy[T](o:T):T</code> operator. Could also be provided as a method on <code>Object</code> provided that we introduce <code>MyType</code>, i.e. <code>copy():MyType</code>. The <code>copy</code> operator should make a shallow copy of the object. If the object is rooted, this will involve communication to make a copy of the local state of the object. There are no atomicity requirements. </li> </ul> <p> </p> <ul> <li> Check the semantics of <code>as</code> and ensure that it can be performed locally on a proxy. </li> <li> <strike>Figure out how best to make the following claim: "The properties of proxies are such that the programmer is guaranteed that the invocation of a pure global method (with identical arguments) will return identical results on the proxy or the root object."</strike> </li> <li> <strike>Should there be <code>shallowCopy()</code> and <code>deepCopy()</code> methods? </strike> </li> <li> <strike>Figure out with Nate whether we need an <code>= = =</code> or does an <code>==</code> suffice?</strike> </li> </ul> <p></p> <h1><a name="Comments"></a> Comments </h1> <h2><a name="What_s_different_on_X10_2_0_that"></a> What's different on X10 2.0 that impacts the object model? </h2> <ul> <li> In X10 2.0, there's a more complex notion of remote objects/references. Unlike in previous versions of X10, activities in a remote place can perform some operations on the object without having to async to the object's home place first. 
In particular, the following operations are now allowed on non-local objects: <ul> <li> access to final fields of the object </li> <li> invocation of "global" instance methods on the object (either via a virtual call or via an interface call). </li> <li> basic object model operations such as <code>instanceof</code> and <code>as</code> </li> </ul> </li> <li> X10 2.0 does not have X10 1.7's Value types </li> <li> X10 2.0 has user-definable primitives (<a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhPrimitives" class="twikiLink">XtenTwoOhPrimitives</a>). </li> </ul> <p></p> <h2><a name="Background_Remote_Objects_in_X10"></a> Background: Remote Objects in X10 1.7 C++ backend/runtime </h2> In 1.7, remote objects were more or less opaque handles (all that could be done to them was to pass them around and find out what place they were located at). As a result, we were able to use a fairly space-efficient tagged union scheme to represent them. If an object was local, then the bottom two bits of the pointer to the object were <code>00</code> and the pointer pointed to an instance of the expected C++ type. If the bottom two bits of the pointer were <code>01</code>, then the pointer pointed to a struct containing the place and address of the object on the remote place. <p></p> <h2><a name="Proposed_X10_2_0_object_model_fo"></a> Proposed X10 2.0 object model for C++ backend/runtime </h2> Core idea: we use real C++ object instances for both local and remote objects. We use the same tagged pointer trick as in 1.7. If the pointer is tagged as remote, then immediately before the object in memory we place the struct containing the home place/address information. Thus, only remote objects pay the space overhead for this. <ul> <li> When the deserializer allocates space for the object, it does a single allocation that gets space for both the struct and the object instance. 
It writes the struct, adjusts the raw pointer, and then constructs the object and fills in its final fields. </li> <li> In the short run, this can be made to work with C++ virtual inheritance by ensuring the following properties: <ul> <li> Objects are always cast up to <code>Ref</code> before being serialized </li> <li> To access the location of an object that is remote, one must upcast to <code>Ref</code> first (to ensure this pointer is pointing to the first word of the object); then one can look backwards in memory for the location &amp; remote addr field </li> <li> Before doing dynamic casts, etc., one must untag the pointer to allow C++ to access the object's vtable, etc. Retag when done. </li> </ul> </li> <li> Longer term, we get rid of virtual/multiple inheritance at the C++ level and implement interface dispatching ourselves. This gets us to a simpler C++ object model where we don't need dynamic casts and the <code>this</code> pointer will never be adjusted. </li> </ul> <p>Alternatives: </p> <ul> <li> Could put location/remote addr information in a side hashtable instead of adjacent to the object. Less efficient and slower, but less tricky. </li> <li> Could pay space overhead of putting location/remote addr information in all objects (i.e., put them in <code>Ref</code>). Simple, but wastes space </li> <li> Could build complex C++ type hierarchy where for a single X10 type <code>T</code> we generate three C++ types (abstract superclass, local subclass, remote subclass). </li> </ul> <p><strong>Vj</strong>: My suggestion is to start with the hashtable implementation -- guaranteed to work. Remote operations are going to be expensive anyway. </p> <ul> <li> If the first scheme is to be tried, make sure there are lots of small test cases, and these are run with both xlc and gcc on all the architectures of interest to us. Don't see any reason for the complex C++ hierarchy. You will face the problem that X10 subclass relations are not represented by C++ subclass relations. 
</li> </ul> <strong>Vj</strong> (06/23): Idle thought: Wonder whether there is a way to create proxy objects without allocating all the extra space for local fields. I wonder whether an X10 class <code>A</code> (extending <code>B</code>) could be implemented with two classes <code>AWhole</code> (has all the state) and <code>AProxy</code> (only has global state), with <code>AWhole</code> inheriting from <code>BWhole</code> and <code>AProxy</code>? (<code>AProxy</code> inherits from <code>BProxy</code>.) In the C++ translation, the type <code>A</code> is translated to <code>AProxy</code>. Local methods are invoked on <code>AProxy</code> after casting it to <code>AWhole</code>. <ul> <li> <strong>IP</strong> We said that the global fields will be available remotely, but we have never promised that accessing them would be as efficient as doing the same locally. So, the compiler could generate special remote accessor methods for global fields that would fetch the field on-demand and store it in newly allocated storage. That way, proxies could be as large or as small as we would want to make them... The downside is that we would have to generate all of those extra classes and code. </li> </ul> <p></p> <h2><a name="Language_Semantics_Questions"></a> Language Semantics Questions </h2> <ul> <li> What is the expectation for <code>==</code> on remote objects? <strong>VJ: See above.</strong> <ul> <li> Codegen for equals to handle both local case inline and other cases outofline? </li> <li> Must we canonicalize the object in remote place? Ie, since all of the state is final, can we just allow there to be multiple copies of the object in a place as long as we define equals appropriately? <ul> <li> NN: yes, <code>==</code> on remote refs can compare the location/remote addr, not the pointer to the cached payload. Note we must be careful to not use remote refs for local objects. </li> <li> <strike>VJ: Hmm. I dont really understand what NN wrote above. 
<code>==</code> is always reference equality. If the references are proxies, reference equality implies root equality. But two proxies may be not equal and still have equal roots.</strike> <strong>Changed to require that <code>==</code> compare the remote address.</strong> </li> </ul> </li> </ul> </li> <li> <strike>Can the programmer control (via annotations?) that some final fields are actually not to be sent to the remote place? Ie, the notions of val/var and global/local may need to be separated for network efficiency. </strike> <ul> <li><strike> NN: Fields could be annotated <code>@Nontransferable</code> (alternatives: <code>@Fixed</code>, <code>@Immovable</code>). This is perhaps only useful if the cached payload in the remote proxy is sparse; that is, if we use a different C++ class for remote objects, eliding the var fields, and local objects and generate different field access code. Field access code for nontransferable fields will be expensive (but probably not much compared to the communication needed to retrieve the contents). </strike></li> <strike> </strike> <li><strike> VJ: It is reasonable to assume that the programmer should be able to control which fields of a class are serialized on a remote reference. Hence the introduction of <code>local</code> above.</strike> <strong>The <code>local</code> annotation on fields resolves this issue.</strong> </li> </ul> </li> </ul> <p></p> <h2><a name="Do_we_need_mobile_objects"></a><a name="Do_we_need_mobile_objects_"></a> Do we need mobile objects? </h2> <p> Version 1.3 gets rid of the mobile objects of Version 1.2 (see earlier versions of this page to get at the Version 1.2 description). 
</p> <p>Igor notes that: </p> <ul> <li> <strong>IP</strong>: In particular, this means that transmitting any <code>ValRail</code> or closure will incur the remote reference overhead. </li> </ul> <p>This is correct. In essence, the suggestion is that introducing the notion of mobile objects -- over and above the <code>val</code> kernel that will be transmitted for the object anyway -- is a premature optimization that complicates the object model for not much gain. </p> <p>There are several points to make: </p> <ul> <li> If you really do not want to send the local reference over when serializing the <code>val</code> state of the object, then consider using a primitive instead of an object. Note this is not a like-for-like change: primitives come with significant restrictions. <ul> <li> Closures should be implemented as opaque primitives (with no member fields). This means we may want to consider permitting generic primitives. Note that the implementation of an <code>async</code> requires the transmission of an implementation-level closure. There should be no reason to transmit the local reference with such an <code>async</code>. In fact one should not, in order to get performance for <code>RandomAccess</code>. </li> <li> We probably want to introduce OpenCL vectors as primitives in the language. </li> </ul> </li> <li> What are good examples of programs in which <code>ValRail</code> or <code>Point</code> cross place boundaries in a performance-critical context? They do in <code>NQueens</code> and <code>UTS</code>, but communication is not the bottleneck there. 
</li> <li> If the compiler can establish that the deserialized object is not subject to <code>==</code> then it does not need to transmit the local reference. This may be possible to do for programs which keep the deserialized object on the stack. </li> <li> The implementation doesn't really need to transmit all the bits for the local reference. It could transmit, for instance, a run-length-encoded unique id: keep a table that maps indices to local references, and send out only the indices. Indices are run-length encoded in the sense of being length-prefixed: you send a byte that encodes the number of bytes, followed by that many bytes. </li> </ul> <p></p> <p></p> <h2><a name="Fields_vs_properties"></a> Fields vs. properties </h2> <p> </p> <ul> <li> NN: I wanted to remove properties, and have programmers just use final fields in constraints. The objection was that there can be cycles among final fields but, for decidability reasons, not among properties. But could we just not allow cycles of final fields? I don't see that they're that useful and I would rather eliminate a mostly redundant concept from the language. <ul> <li> VJ: (I think you mean <code>public final</code> instance fields should not be distinguished from properties.) I think removing the ability to have final cyclic structures would be a significant restriction -- you do not want to impose restrictions on what is possible to do with object structures in the heap. For example, you might create a fixed (immutable) data flow graph through which you pump data via connectors. Not being able to create such a graph, or requiring that the edge relation be mutable, would be unfortunate, particularly given the emphasis on immutability as a way to get determinacy. Also, see the discussion of <code>local</code>. It makes sense to have <code>local</code> final fields, but not <code>local</code> properties. 
The key point is to think of properties as capturing some public aspect of the object that can be used in categorization (i.e. in the type system) and is hence part of the public specification of the object as opposed to part of the implementation. </li> </ul> </li> </ul> -- <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/DaveGrove" class="twikiLink">DaveGrove</a> - 01 Jun 2009 <p></p> <hr> <h1><a name="History"></a> History </h1> <p> </p> <ul> <li> Version 1.4. Removed section on primitives. The 1.4 proposal automatically defines a <code>struct</code> for each class with only <code>val</code> fields. Structs have their own inheritance hierarchy, but only exact types are permitted at structs. Unconstrained generic variables can be instantiated with either structs or class types. </li> </ul> <p> </p> <ul> <li> Adopted another suggestion from Cunningham, seconded by Nate, to consider all classes to be <code>rooted</code>. Hence, no mobile objects. No need for the <code>rooted</code> qualifier. So to create a remote reference the implementation must serialize enough information to encode a local reference to the object. Main advantage: Simplifies object model. You can only create objects which can have remote references and which use rooted equality for <code>==</code>. If you want to use structural equality, use primitives. Version 1.2 may be found at <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhObjectModel?rev=13" target="_top">rev 13</a> </li> <li> Introduced the <code>rooted</code> qualifier on classes. A class must be declared <code>rooted</code> if it has any local state. </li> <li> Version 1.1 of the design may be found at <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhObjectModel?rev=10" target="_top">rev 11</a> of this page. <ul> <li> Changed semantics of <code>==</code> so that it behaves like <code>System.equalRoots(..)</code>. This simplifies things considerably. No need to talk of proxies explicitly. 
</li> </ul> </li> <li> Version 1.0 of the design may be found at <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhObjectModel?rev=10" target="_top">rev 10</a> of this page. </li> <li> Change from Version 1.0 <ul> <li> Eliminated <code>global</code> classes. </li> <li> Removed any connection between inheritance and <code>local</code> state ... any class may have <code>local</code> state even if its super-class or sub-class does not. Primitives do not have <code>local</code> state. </li> <li> In particular, there is no longer a <code>loc():Place</code> method on <code>Object</code>. </li> <li> As before, <code>local</code> state may be mutable or immutable. </li> <li> Eliminate <code>valEquals</code> and any discussion of <code>equals</code>. It is up to the programmer to define <code>equals</code>, as in Java (obeying the contract to keep it consistent with <code>hashCode</code>.) </li> </ul> </li> <li> Dave C is proposing that <code>p==q</code> be defined as <code>System.rootEquals(p,q)</code>. The idea is that this is more robust. I think this change can be made below without changing anything else. It essentially means that proxies behave as if they are interned -- there is no way for an X10 computation to figure out then that <code>p</code> and <code>q</code> are two distinct (i.e. <code>!=</code>) objects but point to the same root. </li> </ul> <br> <ul> <li> Initially created: 6 July 2009 </li> <li> Last updated: vj 6 July 2009 </li> <li> Base link: <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhDesign" class="twikiLink">XtenTwoOhDesign</a> </li> </ul> <p></p> <h1><a name="User_defined_primitives_Version"></a><a name="User_defined_primitives_Version_"></a> User-defined primitives (Version 1.5) </h1> <p> An instance of a class <code>C</code> (an <em>object</em>) is represented in X10 as a contiguously allocated chunk of words in the heap, containing the fields of the object as well as one or more words used in method lookup. 
Variables with base type <code>C</code> (or a supertype of <code>C</code>) are implemented as cells with enough memory to hold a <em>reference</em> to the object. The size of a reference (32 bits or 64 bits) depends on the underlying operating system. </p> <p>For many high-performance programming idioms, the overhead of one extra level of indirection represented by an object is not acceptable. For instance, a programmer may wish to define a type <code>complex</code> (consisting of two <code>double</code> fields) and require that instances of this type be represented precisely as these two fields. A variable or field of type <code>complex</code> should, therefore, contain enough space to store two <code>double</code>s. An array of <code>complex</code> of size <code>N</code> should store <code>2*N</code> doubles. Parameters of type <code>complex</code> should be passed inline to a method as two doubles. If a method's return type is <code>complex</code>, the method should return two doubles on the stack. Two values of this type should be equal precisely when the two doubles are equal (structural equality). </p> <p></p> <h2><a name="Structs"></a> Structs </h2> X10 supports user-defined primitives (called <em>structs</em>). A distinguishing characteristic of X10 is that code for structs is generated automatically from code for classes, and structs themselves are organized in an inheritance hierarchy. <ul> <li> For C/C++ programmers, an X10 struct corresponds to a C/C++ struct all of whose fields are <code>const</code>. Such structs do not have any aliasing issues. </li> </ul> <p>The size of a variable of type <code>struct C</code> is the size of the fields defined at <code>C</code> (up to alignment considerations). No extra space is allocated for a vtable or an itable. 
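</p> <p>As a rough illustration of this layout requirement, here is a minimal C++ sketch (C++ being the translation target discussed on these pages). The names <code>Complex</code>, <code>BoxedComplex</code>, <code>add</code> and <code>self_check</code> are hypothetical and are not taken from the X10 compiler or runtime, and the size checks assume a mainstream ABI in which a struct of two <code>double</code>s carries no padding.</p>

```cpp
#include <cassert>

// Hypothetical stand-in for an X10 struct: two immutable fields, nothing else.
struct Complex {
    const double re;
    const double im;
};

// Structural equality: two values are equal exactly when their fields are equal.
constexpr bool operator==(const Complex& a, const Complex& b) {
    return a.re == b.re && a.im == b.im;
}

// Hypothetical stand-in for the class (heap object) form: the virtual
// methods force an extra word used for method lookup.
struct BoxedComplex {
    double re;
    double im;
    virtual ~BoxedComplex() = default;
    virtual double norm2() const { return re * re + im * im; }
};

// Assumption: no padding between or after two doubles on the target ABI.
static_assert(sizeof(Complex) == 2 * sizeof(double),
              "a struct value is exactly its two fields");
static_assert(sizeof(Complex[8]) == 16 * sizeof(double),
              "an array of N structs stores 2*N doubles");
static_assert(sizeof(BoxedComplex) > 2 * sizeof(double),
              "the object form carries a method-lookup word");

// Arguments and result are passed by value, with no indirection.
constexpr Complex add(Complex a, Complex b) {
    return Complex{a.re + b.re, a.im + b.im};
}

// Small self-check of the properties described in the text.
void self_check() {
    const Complex a{1.0, 2.0};
    const Complex b{1.0, 2.0};
    assert(a == b);
    assert(add(a, b) == (Complex{2.0, 4.0}));
    assert(!(a == Complex{1.0, 3.0}));
}
```

<p>Calling <code>self_check()</code> exercises the equality and by-value properties; the <code>static_assert</code> lines check the layout at compile time under the stated ABI assumption.</p> <p>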
</p> <p></p> <h2><a name="Definition_of_a_struct"></a> Definition of a struct </h2> Consider a class <pre>Modifiers class C[X1,..., Xn](p1:T1,..., pn:Tn){c} extends B{d} implements I1, ..., Ik { Body } </pre> <p> all of whose fields are <code>val</code> fields. For such a class, X10 automatically defines: </p> <p></p> <pre>Modifiers struct C[X1,..., Xn](p1:T1,..., pn:Tn){c} extends struct B{d} implements I1, ..., Ik { Body' } </pre> where <code>Body'</code> is defined as follows. <ul> <li> The fields of <code>Body'</code> are those of <code>Body</code>. </li> <li> Say that a constructor or method for class <code>C</code> is <code>open</code> if it uses <code>this</code> in a way that distinguishes between the types <code>C</code> and <code>struct C</code>. Specifically, the constructor or method must be marked <code>open</code> if: <ul> <li> It passes <code>this</code> as an argument of type <code>C</code> (or a supertype). </li> <li> It returns <code>this</code> from a method whose return type is <code>C</code> (or a supertype). </li> <li> It assigns <code>this</code> to a variable whose type is <code>C</code> (or a supertype). </li> </ul> </li> <li> We require that methods in which <code>this</code> is open are marked <code>open</code>. </li> <li> We require that <code>open</code> methods can be overridden only by <code>open</code> methods. </li> <li> Methods that are not marked <code>open</code> are said to be <em>closed</em>. Closed methods can typecheck with <code>this: struct C</code> as well as <code>this:C</code>. </li> <li> <code>Body'</code> contains precisely the constructors and instance methods of <code>Body</code> which are closed. (Thus <code>struct C</code> may be <code>abstract</code> and may contain <code>abstract</code> methods.) </li> <li> <code>Body'</code> does not contain any instance initializers. </li> <li> <code>Body'</code> does not contain any static fields or initializers. 
</li> </ul> <p>Note that in <code>Body'</code> the type of <code>this</code> is <code>struct C</code>. </p> <p><strong>Implementation Note</strong>. The compiler may implement the methods of <code>struct C</code> by statically resolving all <code>this</code> and <code>super</code> calls, and translating the methods of <code>C</code> into static methods. Thus an instance of <code>struct C</code> does not need a vtable or an itable, at the cost of code expansion. </p> <p>Values of a <code>struct C</code> type can be created by invoking a constructor defined in <code>struct C</code>, but without prefixing it with <code>new</code>. </p> <p></p> <h3><a name="Struct_equality"></a> Struct equality </h3> <p> Unlike objects, structs do not have global identity. Instead, two structs are equal <code>==</code> if and only if their corresponding fields are equal <code>==</code> (this is the central property of structs). This implies that a variable of type <code>struct C</code> can contain only values of type <code>struct C</code> -- and not values of type <code>struct D</code> (for some subclass <code>D</code> of <code>C</code>). </p> <ul> <li> Otherwise we would be in the untenable position that two values are considered equal at a type <code>struct C</code> but not at a type <code>struct D</code>, where <code>D</code> is a subtype of <code>C</code>. </li> </ul> <p>We say that <em>struct types are exact types</em> to mean that a variable declared at a struct type must take on precisely the values of that type (and not a subtype). </p> <p></p> <h3><a name="Subtyping_of_structs"></a> Subtyping of structs </h3> <p> The subtype relationship on structs is obtained from the subtype relationship of the underlying classes. If <code>C</code> is a subclass of <code>B</code>, then <code>struct C</code> is a subtype of <code>struct B</code>. If <code>C</code> implements the interface <code>I</code>, then <code>struct C</code> implements the interface <code>struct I</code>. 
</p> <ul> <li> Variables cannot be declared at the type <code>struct I</code>, for <code>I</code> an interface. Such types are of use in placing bounds on type variables. </li> </ul> <p>Since <code>struct</code> types are exact types, the primary use of the subtype relationship on structs is in imposing constraints on type parameters (e.g. <code>X &lt;: struct C</code>). The constraint <code>X &lt;: Object</code> can be used to ensure that <code>X</code> can only be instantiated with classes. Similarly, <code>X &lt;: struct Object</code> ensures that <code>X</code> can only be instantiated with structs. </p> <p></p> <h2><a name="Conversions"></a> Conversions </h2> <p> For <code>D</code> a subclass of <code>C</code>, we permit a value <code>v</code> of type <code>struct D</code> to be assigned to a variable of type <code>struct C</code> by slicing off the fields of <code>v</code> which are not in <code>C</code>. </p> <p>For <code>v</code> a value of type <code>C</code> or <code>struct C</code>, <code>struct v</code> is the value of type <code>struct C</code> obtained by copying the fields of <code>v</code> (the <em>unboxed</em> version of <code>v</code>). </p> <p>For <code>v</code> a value of type <code>struct C</code> or <code>C</code>, <code>new v</code> is a value of type <code>C</code> whose fields are initialized with the values of the fields of <code>v</code> (the <em>boxed</em> version of <code>v</code>). </p> <p> </p> <ul> <li> Note that the implementation may perform some form of interning. Hence there is no guarantee that <code>new</code> will actually create a new object. This does not create any aliasing issues since <code>struct C</code> has no mutable state. </li> </ul> <p>Even if <code>C</code> implements an interface <code>I</code>, a value <code>v</code> of type <code>struct C</code> cannot be assigned to a variable <code>x</code> of type <code>I</code>. However, the boxed version, <code>new v</code>, can be. 
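</p> <p>The conversions above can be approximated in C++ as follows. This is only a sketch under stated assumptions: the types <code>C</code> and <code>D</code> and the helpers <code>slice</code>, <code>box</code>, <code>unbox</code> and <code>self_check</code> are hypothetical names, boxing is modelled with a <code>std::shared_ptr</code> rather than with whatever representation the X10 runtime uses, and no interning is attempted.</p>

```cpp
#include <cassert>
#include <memory>

// Hypothetical struct C with a single immutable field.
struct C {
    const int x;
    explicit C(int x_) : x(x_) {}
};

// Hypothetical struct D, derived from C, adding one immutable field.
struct D : C {
    const int y;
    D(int x_, int y_) : C(x_), y(y_) {}
};

// Structural equality at each exact type.
bool operator==(const C& a, const C& b) { return a.x == b.x; }
bool operator==(const D& a, const D& b) { return a.x == b.x && a.y == b.y; }

// Assigning a struct D value to a struct C variable keeps only C's fields.
C slice(const D& d) { return C(d); }

// Boxed version: a heap object initialized from the fields of v.
std::shared_ptr<const C> box(const C& v) { return std::make_shared<C>(v); }

// Unboxed version: a fresh value copied from the fields of the object.
C unbox(const std::shared_ptr<const C>& ref) { return *ref; }

void self_check() {
    const D d1(1, 2);
    const D d2(1, 3);
    assert(!(d1 == d2));             // distinct at struct D
    assert(slice(d1) == slice(d2));  // equal once sliced to struct C
    const std::shared_ptr<const C> boxed = box(slice(d1));
    assert(unbox(boxed) == slice(d1));  // boxing then unboxing keeps the fields
}
```

<p>The pair <code>d1</code>, <code>d2</code> also shows why struct types are exact: if a variable of type <code>struct C</code> could hold them unsliced, they would compare equal at <code>struct C</code> yet unequal at <code>struct D</code>.</p> <p>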
</p> <p></p> <h2><a name="Generic_programming_with_structs"></a> Generic programming with structs </h2> <p> An unconstrained type variable <code>X</code> can be instantiated with subtypes of <code>Object</code> (classes) or subtypes of <code>struct Object</code> (structs). Within a generic class, all operations are available on a variable of type <code>X</code>. For instance, variables of type <code>X</code> may be used with <code>==</code>, <code>!=</code>, <code>instanceof</code>, casts, etc. The programmer must be aware of the different interpretations of equality for structs and classes and ensure that the code is correctly written for both cases. </p> <p></p> <p></p> <h1><a name="TODO"></a> TODO </h1> <ul> <li> Explore whether it makes sense to permit the programmer to define a <code>struct</code> directly. It should be possible to define a <code>class</code> from the <code>struct</code> in a similar manner. </li> <li> Consider marking fields as <code>open</code>. Such fields are not copied to <code>struct C</code>. Closed methods cannot access <code>open</code> fields. (Need good use case for such fields.) </li> </ul> <p></p> <h1><a name="FAQ"></a> FAQ </h1> <p> </p> <ul> <li> Why is <code>struct C</code> defined only when <code>C</code> has no <code>var</code> fields? Because this means that there are no aliasing issues to consider. Otherwise it would be necessary to introduce an address-of operator (see <a href="http://xj.watson.ibm.com/twiki/bin/view/Main/XtenTwoOhInlined" class="twikiLink">XtenTwoOhInlined</a>). That would complicate the story on generating the code for <code>struct C</code> automatically from <code>C</code>, and on instantiating generic code with <code>C</code> and <code>struct C</code>. </li> </ul> <p></p> <h1><a name="Comments"></a> Comments </h1> <p> </p> <p></p> <h1><a name="History"></a> History</h1> <br> <br> </body> </html> |