[Jscheme-user] Re: generic method invocation in JScheme

Ken Anderson <kan...@bb...> writes:

> Joe, 
> Your talk made me want to look at code i haven't looked at for
> a long while - how Java methods are invoked.  They are like Common
> Lisp generic functions, the right method is looked up at runtime
> based on the types of all arguments.  There is not "type widening".
> There is no MOP, like you may have.
>
> For a Java class i was teaching i put all constructor/method/field
> invocation into a class Invoke, which was easily separated from
> JScheme, though we didn't advertise that.
>
> Basically an application of an instance method does a cache lookup
> on (list isStatic methodName ClassName canAccessPrivateData).  The
> comments below show the calling sequence for a JavaMethod.  We do
> nothing fancy for method lookup except if there is only one, a
> common case, we invoke it and hope for the best.
>
> Instance methods do a lookup on isStatic, and methodName which we
> should memoize in the JavaMethod.  It looks like we should
> specialize JavaMethod into JavaStaticMethod JavaInstanceMethod and
> JavaSpecifiedMethod (where both the class and name are specified) .

Wow.  It looks as if our approaches are very different.

Larceny (the Scheme runtime) provides a `syscall' primitive that
simply transfers control to a known C# method.  The Larceny runtime
leaves the arguments to syscall in known `registers' and expects the
Result register to be updated before control is returned.

Larceny also provides a primitive type called a ForeignBox.  It has no
functionality whatsoever beyond being a first-class Scheme object, but
it is a useful wrapper for holding reflection objects.

So there is no attempt at all to integrate the .NET system with the
Scheme system at the primitive implementation level.  Everything is
done at arm's length and mediated via wrappers and `syscalls'.

The syscall for method invocation expects to find three Scheme objects
in the `registers':  a foreign box holding the reflected method, a
foreign box holding the instance, and an array of foreign boxes
holding the arguments.  It simply unboxes everything and attempts to
invoke the contents of the first box on the contents of the remaining
boxes:

    // Get the arguments   
    SObject arg1 = Reg.register3;
    SObject arg2 = Reg.register4;
    SObject arg3 = Reg.register5;

    // Unbox the method
    MethodInfo mi = (MethodInfo) ((ForeignBox)arg1).value;

    // Unbox the arguments
    SObject[] sargv = ((SVL)arg3).elements;
    object[] args = new object[sargv.Length];
    for (int i = 0; i < args.Length; i++)
        args [i] = ((ForeignBox)(sargv[i])).value;

    // Call the method
    object result = mi.Invoke (mi.IsStatic ? null : ((ForeignBox)arg2).value, args);

    // Box up the result
    Reg.Result = Factory.makeForeignBox (result);
    return;

There are two ways to get a handle to a reflected method.  There is a
syscall that can return a reflected method given a reflected type and
the types of the arguments, or one could invoke the GetMembers method
on the type and get an array of all the methods, fields, properties,
and constructors associated with the type.  I use the former mechanism
only for bootstrapping (maybe a dozen or so methods).

It is in Scheme that everything interesting happens.  I adapted an
object system based on Tiny-CLOS to Larceny.  (It started as Tiny-CLOS;
Eli Barzilay adapted it to PLT Scheme and enhanced it to make
`Swindle'; I then adapted it and tailored it for Larceny and .NET.)
.NET methods get wrapped within CLOS methods, which are installed in
the appropriate CLOS generic functions.  .NET type descriptors are
wrapped in CLOS class objects, and .NET objects are wrapped in CLOS
instances (hence the need for a MOP: a .NET type must be both an
instance and a class).

Suppose that we had invoked a .NET method (through my code) and it
returned a .NET object (in a ForeignBox).  My code determines the
runtime type of the object in the box (via another syscall) and finds
the CLOS class object that represents that type.  It creates an
instance of the class with a slot holding the ForeignBox.  The
instance is what the user sees.
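
The wrapping step can be sketched roughly as follows.  This is a
minimal illustration in Python rather than Scheme; `ForeignBox`,
`Wrapper`, `wrapper_class_for`, and `wrap_result` are hypothetical
stand-ins, and Python's `type()` plays the role of the syscall that
reports the runtime type of the boxed object.

```python
class ForeignBox:
    """Opaque holder for a foreign object (stand-in for Larceny's ForeignBox)."""

    def __init__(self, value):
        self.value = value


class Wrapper:
    """Base class for user-visible wrappers; a single slot holds the box."""

    def __init__(self, box):
        self.box = box


# Hypothetical registry: foreign runtime type -> wrapper class.
_wrapper_classes = {}


def wrapper_class_for(foreign_type):
    """Find, or create on first use, the wrapper class for a foreign type."""
    cls = _wrapper_classes.get(foreign_type)
    if cls is None:
        cls = type(foreign_type.__name__ + "Wrapper", (Wrapper,), {})
        _wrapper_classes[foreign_type] = cls
    return cls


def wrap_result(box):
    """Wrap a boxed return value in an instance of its wrapper class."""
    return wrapper_class_for(type(box.value))(box)
```

Two results of the same runtime type share one wrapper class, which is
what lets a generic function dispatch on the wrapper's class.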

All .NET methods are instances of classes that inherit from the .NET
type `System.MethodBase'.  My code knows about this and has
special-cased this particular class to inherit from both its reflected
meta-type *and* the Tiny-CLOS method class.  Thus wrappers for
reflected methods are themselves CLOS methods.  

When instantiating a wrapper for a reflected method, we need to
compute the specializers (to determine under what conditions it is
called) and a procedure (to determine what happens when it is called).
The specializers are determined by taking the declared types of the
method arguments and finding an appropriate reflected class object.
For the most part, this is simply the class object that represents the
reflected type, but some classes need special treatment.  Therefore,
the generic function `argument-specializer' (which is the identity
function by default) is overridden in some particular cases.  For
example, the specializer for the .NET string class is the Scheme
string class and the specializer for the .NET System.Object (the root
class) is instead the class of all Scheme objects.  Arrays and enums
are handled specially here, too.
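
As a rough Python sketch of the specializer rule (identity by default,
overridden for a few types): all class names here are hypothetical
stand-ins for the reflected .NET classes and the Scheme-side classes,
and the override table is an assumption about how such a rule might be
stored, not the actual generic function.

```python
# Stand-ins for reflected .NET class objects (hypothetical names).
class ClrObject:
    pass


class ClrString(ClrObject):
    pass


class ClrStream(ClrObject):
    pass


# Stand-ins for Scheme-side classes (hypothetical names).
class SchemeTop:
    pass


class SchemeString(SchemeTop):
    pass


# Special cases: .NET string -> Scheme string, System.Object -> all Scheme objects.
_SPECIALIZER_OVERRIDES = {ClrString: SchemeString, ClrObject: SchemeTop}


def argument_specializer(declared_type):
    """Identity by default; overridden for the special-cased types."""
    return _SPECIALIZER_OVERRIDES.get(declared_type, declared_type)


def method_specializers(declared_parameter_types):
    """Map a method's declared parameter types to its specializers."""
    return [argument_specializer(t) for t in declared_parameter_types]
```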

The procedure that gets invoked has to package things up for the
invoke syscall.  .NET objects will already be in their foreign boxes,
but the boxes themselves need to be unwrapped from the CLOS
instances.  Boxes holding integers and strings will need to be
constructed.  So there is a generic function `argument-marshaler' that,
given a reflected .NET type, returns a function that can correctly
convert a Scheme object to a foreign box.
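
A minimal Python sketch of the same idea, assuming a simple table from
declared type to conversion function.  `ForeignBox`, `Wrapper`, and the
table are hypothetical stand-ins: native values get a fresh box, and
everything else is assumed to be a wrapper whose box is pulled out.

```python
class ForeignBox:
    """Opaque holder for a foreign object."""

    def __init__(self, value):
        self.value = value


class Wrapper:
    """User-visible wrapper with a slot holding the box."""

    def __init__(self, box):
        self.box = box


def unwrap(obj):
    """Foreign objects are already boxed; just pull the box out of the wrapper."""
    return obj.box


def box_value(obj):
    """Native values (integers, strings) need a freshly constructed box."""
    return ForeignBox(obj)


# Hypothetical table: declared parameter type -> marshaling function.
_MARSHALERS = {int: box_value, str: box_value}


def argument_marshaler(declared_type):
    """Return the function that converts a caller-side object to a box."""
    return _MARSHALERS.get(declared_type, unwrap)
```

Choosing the marshaler once per parameter, when the method wrapper is
built, keeps the per-call work down to applying the chosen functions.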

In addition, if the .NET method has optional arguments, the CLOS
method must supply the defaults which are specified in the reflected
method parameter list.  The end result, in all its gory detail, is
this: 

(define (clr-method-info->method class info)
  (call-with-values
   (lambda () (clr-methodbase/get-parameters info))
   (lambda (required-parameter-count
            optional-parameter-count
            default-values
            specializers
            out-marshalers)
     (let* ((declaring-type (clr-memberinfo/declaring-type info))
            (name (clr-memberinfo/name info))
            (instance-marshaler (argument-marshaler declaring-type))
            (arity (if (= optional-parameter-count 0)
                       (+ required-parameter-count 1)
                       (make-arity-at-least (+ required-parameter-count 1))))
            (max-arity (+ optional-parameter-count required-parameter-count 2))
            (in-marshaler (return-marshaler (clr-methodinfo/return-type info))))
       (make class
         :arity arity
         :max-arity max-arity
         :clr-handle info
         :name name
         :specializers (cons (argument-specializer declaring-type) specializers)
         :procedure (nary->fixed-arity
                     (lambda (call-next-method instance . args)
                       (dotnet-message 4 "Invoking method" name)
                       (in-marshaler
                        (clr/%invoke info
                                     (instance-marshaler instance)
                                     (marshal-out (+ optional-parameter-count required-parameter-count)
                                                  out-marshalers args default-values))))
                     (arity-plus arity 1)))))))
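
The definition of `marshal-out' is not shown here.  One plausible
reading, sketched in Python, is that it marshals the supplied
arguments and fills in any omitted optional ones from the defaults.
The sketch assumes one default per optional parameter, in order, and
already in foreign form; the function name and argument order simply
mirror the call above.

```python
def marshal_out(total_count, marshalers, args, default_values):
    """Marshal supplied args, then pad missing optional ones with defaults.

    Assumes default_values holds one entry per optional parameter,
    in declaration order, already in foreign (boxed) form.
    """
    required_count = total_count - len(default_values)
    if len(args) > total_count:
        raise TypeError("too many arguments")
    if len(args) < required_count:
        raise TypeError("too few arguments")
    # Apply the per-parameter marshaler to each supplied argument.
    marshaled = [marshal(arg) for marshal, arg in zip(marshalers, args)]
    # Skip the defaults for optional parameters the caller did supply.
    marshaled.extend(default_values[len(args) - required_count:])
    return marshaled
```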

So if a method object is somehow returned from the .NET code to
Scheme, a CLOS instance that multiply inherits from both its
appropriate .NET class and the CLOS method class is constructed with
the appropriate specializers, marshalers, and defaults.

There is one more thing to do.  When this method object is
instantiated, it should be registered with the appropriate generic
function to invoke it.  There is a post-initialization method that is
added to subclasses of the System.MethodBase reflected class.  This
registers the method:

   (add-method initialize-instance
      (make (*default-method-class*)
        :arity 2
        :specializers (list methodbase-class)
        :procedure (lambda (call-next-method instance initargs)
                     (call-next-method)
                     (process-method
                      instance
                      (clr-methodbase/is-public?
                       (clr-object/clr-handle instance))))))

(define (process-method clr-info public?)
  (let ((handle (clr-object/clr-handle clr-info)))
    (if (clr-methodbase/is-static? handle)
        (install-static-method (make-static-name handle) clr-info public?)
        (install-instance-method (clr-memberinfo/name handle) clr-info public?))))

Since all this is part of the instance creation protocol, the only
thing that needs to be done at this point is to get the syscall layer
to return method objects.  To bootstrap the system, I just walk the
type tree and ask it to list the methods.

Here's an example of method invocation.  Suppose I call the .ToString
method on an object.

(.ToString foo)

The javadot syntax is handled by the macro expander, so this becomes

  ((clr/find-generic #f 'tostring) foo)

The #f indicates public method rather than private.  The
*clr-public-generics* hash table will be searched to find the CLOS
generic function named 'tostring.
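
The registration and lookup can be pictured with a small Python sketch.
It makes several assumptions: there is a second table for private
generics, names are folded to lower case (as 'tostring suggests), and a
lookup creates the generic on a miss.  `GenericFunction` here only
collects methods and does no dispatch.

```python
# Hypothetical tables, one per visibility.
public_generics = {}
private_generics = {}


class GenericFunction:
    """Bare-bones generic: a name plus the methods installed on it."""

    def __init__(self, name):
        self.name = name
        self.methods = []

    def add_method(self, method):
        self.methods.append(method)


def find_generic(private, name):
    """Look up the generic for a name, creating it on first use (assumption)."""
    table = private_generics if private else public_generics
    key = name.lower()  # assumption: names are case-folded
    generic = table.get(key)
    if generic is None:
        generic = table[key] = GenericFunction(key)
    return generic


def install_instance_method(name, method, public):
    """Register a wrapped method with the generic of the same name."""
    find_generic(not public, name).add_method(method)
```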

The CLOS generic function does the usual multimethod dispatch on the
types of its arguments (using the standard CLOS caching tricks to
make it fast).  Supposing that foo is an instance of a CLOS wrapper
for a .NET object, the generic function would find the method
specialized to the class of (the wrapper to) foo.  That method would
now be invoked.  Recall that it will be something like this:

    (lambda (instance . args)
      (in-marshaler
        (clr/%invoke info
                     (instance-marshaler instance)
                     (marshal-out (+ optional-parameter-count required-parameter-count)
                                  out-marshalers args
                                  default-values))))

But since there are no arguments beyond the instance, we can ignore
them and the effect is more like this:

    (lambda (instance)
      (in-marshaler
        (clr/%invoke info
                     (instance-marshaler instance))))

Info is the ForeignBox containing the reflected method and
instance-marshaler will simply be a function that unwraps the CLOS
wrapper.  clr/%invoke is the syscall, so we end up passing the
appropriate ForeignBoxes to the C# code above.

The in-marshaler is a function that performs an appropriate action on
the return value.  In this case, the return type of the .NET method is
System.String, so the return value of .ToString will be a ForeignBox
containing a .NET string object.  For most .NET objects the
in-marshaler simply wraps them in the appropriate CLOS wrapper, but
we special-case the in-marshaler for .NET strings to convert the
ForeignBoxed .NET string into a Scheme string.
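
In Python terms the return side might look like the sketch below.  The
names are hypothetical stand-ins, and Python's `str` stands in for both
the .NET string type and the Scheme string it is converted to.

```python
class ForeignBox:
    """Opaque holder for a foreign object."""

    def __init__(self, value):
        self.value = value


class Wrapper:
    """User-visible wrapper with a slot holding the box."""

    def __init__(self, box):
        self.box = box


def wrap(box):
    """Default: hand the caller a wrapper around the returned box."""
    return Wrapper(box)


def unbox_string(box):
    """Special case: convert a boxed foreign string to a native string."""
    return str(box.value)


# Hypothetical table: declared return type -> in-marshaler.
_RETURN_MARSHALERS = {str: unbox_string}


def return_marshaler(return_type):
    """Pick the in-marshaler from the method's declared return type."""
    return _RETURN_MARSHALERS.get(return_type, wrap)
```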

--------------------

I'm sure your eyes have glazed over by now, so I'll finish here.  It
looks as if we are both taking the same information into account when
trying to go from a Scheme generic to a Java or .NET method.  It'd be
interesting to compare the utility, performance and the edge cases.
At this point, I'm the sole user of my code, so I really don't know
what end users will perceive as advantages or disadvantages to my
approach.  From what I can tell, the overhead of Tiny-CLOS is minimal
compared to the other overheads of going across the syscall boundary
and the interpreter overhead.  In the simple, usual case, where there
is one method to invoke, we'll get the same results, but I wonder if
there are more complex cases where your method and mine diverge.

~jrm