Re: [Sablevm-developer] README for libsablevm

On Sat, Mar 15, 2003 at 03:20:39PM -0500, Chris Pickett wrote:
> 1) it seems that everything in macros.c should go in cast.list, except 
> for _svmf_is_same_object which should go in one of the util files.  I'm 
> not sure if macros.c is obsolete or not.  It seems that macros.h is 
> probably obsolete as well, and if not, should probably be targeted for 
> replacement with m4 macros in the future.

You are mostly right.  A couple of methods cannot be generated using
cast.list because of the additional conditional.  Also, the name
"macros.[ch]" is not really intuitive.  It is part of a few legacy
things from before the SableVM rearchitecturing to use m4...

> 2) all constants defined should go into constants.h and this should be a 
> policy for anyone modifying the source.  i'm not sure if there are 
> constants defined elsewhere in the VM.

Except for system-specific things, of course.

> 3) I don't know why we need direct_threaded.m4, inlined_threaded.m4, 
> switch_threaded.m4,
> instructions_preparation.m4, and instructions_switch.m4.  In general, I 
> think the very short files  like these are unnecessary.

They are necessary.  Look in src/libsablevm/Makefile.am.  These files
affect the selection of the appropriate macro expansion.

> 4)
> error_bits.m4.h:
> error_classes.m4.h:
> error_init_methods.m4.h:
> error_instances.m4.h:
>        very short multicallable macros to declare error types.
>        not sure why all four of these files exist ... they are
>        very similar and all declare macros of the same name.

That's the idea: they declare macros of the same name, yet expand
differently.  Look into src/libsablevm/Makefile.am again.
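
The "same name, different expansion" idea can be sketched with the plain
C preprocessor.  This is only an illustration: SableVM does it with m4
and the Makefile, and every name below (ERROR_LIST, ERROR_TYPE,
error_names, error_name) is hypothetical, not taken from the SableVM
sources.

```c
#include <stddef.h>

/* Hypothetical list of error types.  Each entry calls a macro named
   ERROR_TYPE, which is not defined yet at this point. */
#define ERROR_LIST \
  ERROR_TYPE (out_of_memory) \
  ERROR_TYPE (null_pointer) \
  ERROR_TYPE (stack_overflow)

/* First expansion: ERROR_TYPE produces one enum constant per entry. */
#define ERROR_TYPE(name) ERROR_##name,
enum error_id { ERROR_LIST ERROR_COUNT };
#undef ERROR_TYPE

/* Second expansion: same macro name, now producing one string per entry. */
#define ERROR_TYPE(name) #name,
static const char *const error_names[] = { ERROR_LIST };
#undef ERROR_TYPE

/* Return the printable name of an error id, or NULL if out of range. */
static const char *
error_name (enum error_id id)
{
  return ((int) id >= 0 && id < ERROR_COUNT) ? error_names[id] : NULL;
}
```

The list is written once; each consumer redefines the macro before
expanding it, so the enum and the name table cannot drift apart.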

> 
> 5) gc_generational.c -- should be moved into libsablevm/dev or something 
> until it is working.

Or maybe get rid of it altogether.  It will still show up in CVS (attic
directory).

> 6) class_file_parser.h .... don't really think we need this to be a 
> separate file.  is it obsolete?

Maybe. :-)

> 
> 7) I need a better explanation for global_refs.* and local_refs.*

The idea is that you get type-safe allocation and free functions.  In
addition, the free function resets the pointer variable to NULL,
making sure any subsequent attempt to dereference the freed pointer
causes a segfault (instead of using an obsolete pointer value, which
can be very, very difficult to track).

Also, these functions do everything necessary to throw an exception in
an out-of-memory situation.

Being type-safe, you know that the allocated memory is of the right
size.  Once you have accumulated a lot of C development experience,
you'll know how easy it is to make the following mistake when you
copy/paste code:

sometype1 *var = malloc (sizeof(sometype2));

And how easy it is to forget to check whether var == NULL.  You'll
also notice how annoying it is to retype the correct error handling
code.

The global_refs do all of the tedious work for you, so you only have
to write:

  if (_svmm_gzalloc_gc_map_node (env, method->parameters_gc_map) != JNI_OK)
    {
      return JNI_ERR;
    }

As for the ..._no_exception versions, they do the same, but they do not
create and throw an OutOfMemoryError.  They are necessary, because in
early bootstrapping, the VM has not yet created the heap, so it cannot
instantiate an Error object instance.
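
A minimal sketch of what such a type-safe allocate/free pair could look
like, assuming a plain calloc-based allocator.  The names (sketch_*,
gc_map_node and its fields) are hypothetical and this is not the
generated SableVM code; in particular, the real functions also take an
env argument and raise OutOfMemoryError, which is only marked by a
comment here.

```c
#include <stdlib.h>

#define SKETCH_OK 0
#define SKETCH_ERR (-1)

/* Hypothetical node type standing in for a VM data structure. */
typedef struct gc_map_node
{
  int size;
  struct gc_map_node *next;
} gc_map_node;

/* Allocate one zeroed gc_map_node.  The size is derived from the
   pointee type, so it cannot disagree with the pointer's type. */
static int
sketch_gzalloc_gc_map_node (gc_map_node **ptr)
{
  gc_map_node *tmp = calloc (1, sizeof *tmp);

  if (tmp == NULL)
    {
      /* A real VM would create and throw OutOfMemoryError here. */
      return SKETCH_ERR;
    }

  *ptr = tmp;
  return SKETCH_OK;
}

/* Free the node and reset the caller's variable to NULL, so that a
   later dereference fails immediately instead of using a stale value. */
static void
sketch_gzfree_gc_map_node (gc_map_node **ptr)
{
  free (*ptr);
  *ptr = NULL;
}

/* Macros so the call site passes the variable itself, not its address. */
#define sketch_alloc_gc_map_node(var) sketch_gzalloc_gc_map_node (&(var))
#define sketch_free_gc_map_node(var) sketch_gzfree_gc_map_node (&(var))
```

Passing a pointer of the wrong type to these functions is a compile-time
diagnostic, which is exactly the copy/paste mistake the raw malloc call
lets through.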

> 8) heap_manager.c: at least part or all of this file is obsolete i think 
> (definitely the end)

Possible.

> 9) heap_manager.h .... don't need an essentially empty header

Probably. :-)

> 10) I don't know why instructions_switch.m4.c exists, it seems similar 
> to the content in instructions_preparation.m4.c and there are no files 
> called instructions_inlined.m4.c, or instructions_direct.m4.c. 

It is necessary.  Unlike the direct-threaded and inlined-threaded
engines, the switch-based interpreter has to provide a real "switch"
statement of bytecode implementations to be included in
_svmf_interpreter() [interpreter.c], yet it also needs a separate
"switch" (like the two other engines) to provide information about
each bytecode (in anticipation of method preparation [prepare_code.c]).

The key to understanding what is happening is to trace the execution of
_svmf_interpreter using a debugger (I suggest DDD) by setting
breakpoints at the appropriate locations.  For this, you need to
compile the 3 engines, ideally with --enable debugging-features,
unless you want to be intrigued by the segfault used in the normal
control flow of the VM...  ;-)
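
To make the "two switches" point concrete, here is a toy sketch of a
switch-based engine: one switch holds the bytecode implementations, and
a separate one only describes each bytecode, as a preparation pass would
need.  The opcodes and function names are invented for the illustration
and have nothing to do with the real SableVM instruction set.

```c
/* Hypothetical three-instruction bytecode. */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Descriptive switch: how many inline operands follow each opcode.
   A preparation pass could use this to walk the code array. */
static int
sketch_operand_count (int opcode)
{
  switch (opcode)
    {
    case OP_PUSH:
      return 1;
    case OP_ADD:
    case OP_HALT:
      return 0;
    default:
      return -1;                /* unknown opcode */
    }
}

/* Executing switch: the actual bytecode implementations. */
static int
sketch_run_switch (const int *code)
{
  int stack[16];
  int sp = 0;

  for (;;)
    {
      switch (*code++)
        {
        case OP_PUSH:
          stack[sp++] = *code++;        /* operand follows the opcode */
          break;
        case OP_ADD:
          sp--;
          stack[sp - 1] += stack[sp];   /* pop two, push the sum */
          break;
        case OP_HALT:
          return stack[sp - 1];         /* result is the top of stack */
        default:
          return -1;                    /* unknown opcode */
        }
    }
}
```

A threaded engine would replace the executing switch with direct jumps
between instruction bodies, but it would still need the descriptive
information.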

> 
> 11) move the java_lang_* stuff into libsablevm/java and remove vmlib.* 
> altogether.

No.  The Auto* tools don't really support compilation of a single
library out of files in multiple directories.

If you want the C optimizer to do a good job, you want to compile a
single library in one shot.  I never understood why people do a per
file compilation of C code to .o.  It makes no sense; C wasn't
designed like that.  An optimizing compiler cannot inline functions or
do any global optimization if you arbitrarily split your compilation
unit into smaller units along source file boundaries.

It only makes sense to generate a .(s?)o if you plan to reuse the same
functionality across many executables/libraries, and don't want your
optimizer to do any cross function boundary optimization.

libsablevm.so should only export a restricted set of symbols: JNI_*
and Java_*.  Using more than one compilation unit would cause the
exportation of additional (read: internal!) symbols.

So, unless you want to fix the GNU auto* tools, you'll have to live
with the current file structure. :-(

> 12)
> jnidefs.h:
>        definitions of jobject, jarray, jfieldID, and jmethodID.  why do
>        these get their own special file, instead of being included in
>        types.h or jni.h?

Because jni.h is installed on the user's system and made available for
Java programmers to be able to write JNI libraries to link with the
VM.  The types above are "opaque" types, from a Java programmer's
point of view, so that the same JNI library can work with *any* JVM
implementing the JNI interface.  Of course, within SableVM itself,
these types shouldn't be opaque, so there you have jnidefs.h to
"enlighten" these types. :-)

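The opaque-versus-enlightened split can be sketched in a few lines of C.
This is an assumption about the general technique, not a copy of jni.h
or jnidefs.h: the type names and the ref_count field are hypothetical.

```c
/* What a public header (the jni.h role) might expose: an incomplete
   struct type.  Client code can pass the handle around but cannot
   look inside it. */
struct sketch_object;
typedef struct sketch_object *sketch_jobject;

/* What an internal header (the jnidefs.h role) might add: the full
   definition, visible only inside the VM. */
struct sketch_object
{
  int ref_count;                /* hypothetical field */
};

/* Internal code can now dereference the handle. */
static int
sketch_ref_count (sketch_jobject obj)
{
  return obj->ref_count;
}
```

Because the public header never commits to a layout, a library compiled
against it does not depend on any one VM's internals.
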
> 13) not sure what lib_init.c does

You can invoke JNI_CreateJavaVM multiple times (concurrently) within a
single process (as long as each call is on a different thread).

lib_init serves to make sure some global libsablevm initialization
happens only once, using the standard POSIX pthread_once().
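
A minimal sketch of the pthread_once() pattern described above.  Only
pthread_once and PTHREAD_ONCE_INIT are real POSIX names; the sketch_*
names and the counter are made up for the example and do not reflect
what lib_init.c actually initializes.

```c
#include <pthread.h>

static pthread_once_t sketch_once = PTHREAD_ONCE_INIT;
static int sketch_init_count = 0;

/* Process-wide initialization; must run exactly once. */
static void
sketch_global_init (void)
{
  sketch_init_count++;
}

/* Safe to call from any thread, any number of times: pthread_once()
   runs sketch_global_init on the first call only.  Returns 0 on
   success, as pthread_once() does. */
static int
sketch_lib_init (void)
{
  return pthread_once (&sketch_once, sketch_global_init);
}
```

Each VM creation path would call sketch_lib_init() first; concurrent
callers block until the one-time initialization has completed.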

> 
> 14) unclear about "specific" methods in method_invoke.list

The non-"specific" method invocations are invocations of methods
declared in classes which are automatically loaded by the bootstrap
class loader.

The "specific" method invocations are used to invoke a "specific"
version of a method signature selected at runtime.  So, in the
"specific" case, you have an additional formal parameter for providing
a _svmt_method_info pointer.

For example, to invoke the <clinit> method of a class (part of class
initialization), the VM needs the _svmt_method_info which is specific
to that class.  It cannot use a generic _svmt_method_info from a
method/class loaded at bootstrap time.

If you don't like the "specific" name, (I'm starting to dislike it
quite a bit), I'm open to suggestions for a replacement.

> 
> 15) do you need native_interface.h?
> 

These are remnants of the old SableVM code base where I was using
separate compilation units (as is most intuitive with auto* tools).

So, unless the "extern" needs a forward declaration, we could maybe
get rid of it.

[Order of inclusion in libsablevm.c can be a good technical reason to
keep a .h file. :-)]

> 16) all type declarations should go in types.h .... this should be a
> policy like for constants.h ... the stuff in system.h obviously
> shouldn't be in there though.  % grep typedef * shows that only
> pthread_rec_svm.* violates this.

pthread_rec_svm.* should remain so.  It is an implementation of
recursive mutexes on top of POSIX non-recursive mutexes.  These files
can easily be reused by other applications, with no dependency
whatsoever on other SableVM internal data structures & types.  
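
For readers unfamiliar with the idea, here is one possible way to build
a recursive mutex from a non-recursive POSIX mutex and a condition
variable.  It is a sketch under its own assumptions and is not the
pthread_rec_svm implementation; all sketch_* names are hypothetical.

```c
#include <pthread.h>

typedef struct
{
  pthread_mutex_t guard;        /* protects the fields below */
  pthread_cond_t released;      /* signalled when depth drops to 0 */
  pthread_t owner;              /* valid only while has_owner != 0 */
  int has_owner;
  int depth;                    /* how many times the owner locked */
} sketch_rec_mutex;

static int
sketch_rec_init (sketch_rec_mutex *m)
{
  int rc = pthread_mutex_init (&m->guard, NULL);

  if (rc != 0)
    return rc;

  rc = pthread_cond_init (&m->released, NULL);
  if (rc != 0)
    {
      pthread_mutex_destroy (&m->guard);
      return rc;
    }

  m->has_owner = 0;
  m->depth = 0;
  return 0;
}

static void
sketch_rec_lock (sketch_rec_mutex *m)
{
  pthread_mutex_lock (&m->guard);
  if (m->has_owner && pthread_equal (m->owner, pthread_self ()))
    {
      m->depth++;               /* re-entry by the current owner */
    }
  else
    {
      while (m->has_owner)      /* wait until the owner lets go */
        pthread_cond_wait (&m->released, &m->guard);
      m->owner = pthread_self ();
      m->has_owner = 1;
      m->depth = 1;
    }
  pthread_mutex_unlock (&m->guard);
}

/* Returns 0 on success, -1 if the caller does not own the mutex. */
static int
sketch_rec_unlock (sketch_rec_mutex *m)
{
  int rc = -1;

  pthread_mutex_lock (&m->guard);
  if (m->has_owner && pthread_equal (m->owner, pthread_self ()))
    {
      if (--m->depth == 0)
        {
          m->has_owner = 0;
          pthread_cond_signal (&m->released);
        }
      rc = 0;
    }
  pthread_mutex_unlock (&m->guard);
  return rc;
}
```

Note that nothing here refers to any VM type, which is the property that
makes such code reusable on its own.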

This could go into a separate directory, yet I prefer to keep more
optimization opportunities by leaving them as part of the same
compilation unit.

> 
> 17) util.h, util.m4.c., util1.c, and util2.c need to be cleaned up (see 
> my comments in the file).  split
> the util functions into two categories: those for working with the C 
> language, and those for
> working with Java data structures.  furthermore, util.h could be built 
> from an m4 file and a list file it
> seems.

Patches are welcome.  The reason it is not all in a single file is
because of compile dependencies (order of #include's).

> 
> 18) all #include directives should go in libsablevm.c

except for pthread_rec*, and some other ones like in
_svmf_interpreter().

OK.  Enough for today.  As for the rest of my answers, I'll give
them to you at our meeting next week and I'll assign to you the task
of writing them down and emailing them to this list for the benefit of
all other sablevm developers. :-))

Have fun!

Etienne

-- 
Etienne M. Gagnon, Ph.D.             http://www.info.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/