Re: [Sablevm-developer] regression testing staging@r1575 and sablevm-classpath@r1575 against 1.0.9
From: Chris P. <chr...@ma...> - 2004-02-13 19:35:32
Grzegorz B. Prokopski wrote:

>> Basically:
>>
>> 1) I found no regressions
>
> I have to believe you, because I looked at the diffs and firstly:
>
> I don't know what I am comparing with what. I mean, that you used
> just diff, instead of e.g. 'diff -u', which would put the filenames
> into the header.

Each SableVM output is compared against java.out, which is supposed to
be the correct output (from Sun's 1.4.1-b21 java). If there's a problem,
the name of the SableVM output file is printed, and then the diff, e.g.:

./grande/section1/sablevm-1.0.9-switch-debug-JGFSerialBench.out
1c1,12
< Exception in thread "main" java.lang.VerifyError: (class: JGFSerialBench, method: JGFrun signature: ()V) Incompatible types for storing into array of arrays or objects
---
> *** Couldn't bind native method Java_java_lang_Class_getDeclaredFields ***
> *** or Java_java_lang_Class_getDeclaredFields__ ***
> java.lang.UnsatisfiedLinkError

If they're identical, save for timing information
(grep -i -v -E 'time|total|msec|returned value'), the result is a pass
in the report file. The `paranoid' diffs are those that have everything
(still worth scanning over, though -- one time I found negative numbers
reported for timing information with my spmt stuff).

I ran the SableVMs with '-Y'. If you want, it's very easy to get a
'diff -u'. Let me know (I don't think it will provide more useful
information, just nicer formatting).

> Second thing, partially being a result of the first one: I looked at
> http://www.sable.mcgill.ca/~cpicke/sablevm/r1575-vs-1.0.9/grande_section1_diff_feb12
> and it seems that the errors are not identical. I guess that both cases
> qualify as "doesn't work", but still - I'd love to know which errors
> we get now. Do I get it right that w/ 1.0.9 we get a link error,
> and now we get a security provider error (so it's more a CP problem)?

Without having looked at any code, yes, I think so. "Doesn't work" is
sort of abused in this case, because it "doesn't work" for Sun's Java
either.
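The pass/fail logic above could be sketched roughly like this (a
hypothetical illustration, not the actual script in ~/sablevm/regression;
the function name and file names are made up):

```shell
#!/bin/bash
# Hypothetical sketch of the report's pass/fail check.
# Timing lines are ignored on both sides, as in:
#   grep -i -v -E 'time|total|msec|returned value'
FILTER='time|total|msec|returned value'

check_output() {
    out="$1"   # one SableVM output file
    ref="$2"   # reference output (java.out, from Sun's java)
    # Compare with timing information stripped from both sides.
    if diff <(grep -i -v -E "$FILTER" "$ref") \
            <(grep -i -v -E "$FILTER" "$out") > /dev/null
    then
        echo "PASS: $out"
    else
        # On a mismatch, print the file name and then the diff.
        echo "FAIL: $out"
        diff "$ref" "$out"
    fi
}
```

A `paranoid' report would simply skip the filter and diff the raw files.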
But you're right, there is an improvement on JGFSerialBench: the error
is handled more gracefully.

>> 2) mtrt seems to have been fixed for switch-debug (no blank line)
>
> Oh, and how about switch w/o signals? It might be interesting to see.

$ diff sablevm-staging-r1575-switch-nosig.out java.out

and

$ diff sablevm-1.0.9-switch-nosig.out java.out

both give me nothing, which is interesting. N.B. I only ran it once --
the error may be intermittent. I think for now we need to focus on
micro synchronization problems first (e.g. JGFSyncBench).

>> 3) JGFSyncBench, JGFLUFactBench, and JGFRayTracerBench fail
>>    (single- and multi-threaded errors)
>> 4) JGFSerialBench fails, but it fails for java as well
>>    (verification error)
>> 5) In general things are faster, but there are a couple of exceptions.
>
> Working on it. Inlined in staging is ATM not reliable in terms of
> speed. There are good chances of getting it fixed this weekend.

Okay. I'll put the scripts for doing the regression testing in
~/sablevm/regression in my home directory -- they need some unification
to become truly reusable, but as long as you have the same directory
layout for the benchmarks as me, and run the same 4 different SableVMs,
they should be fairly easy to adjust (using some editor's
find-and-replace).

>> 1) A list of known issues (incl. the above benchmark failures and the
>>    fact that synchronized methods may crash the VM)
>> 2) A fix for the classpath configure problem (I think the easiest is
>>    to undo Mark's configure.ac patch, since apparently this was only
>>    for JamVM).
>
> I guess we can just undo the change for the moment, but it'd be
> best to ask on the GNU Classpath ML. Hmmm... I think he's on the
> SableVM list.
>
> Mark? :-)

Fixed (see other email to Mark).

> I think, Chris, that you've just pushed 1.1.0 a lot closer to
> becoming reality. Thanks a lot!
No problem :) Most of the time went into compilation (configure wasn't
seeing gcc-3.3.2 for staging, and once that was fixed I had to go back
and specify gcc-2.95.4 for 1.0.9), installing dependencies, and
adjusting benchmarking scripts. The actual benchmarks took about 6
hours total to run (on 4 different machines, I think), but that could
be sped up if the grande benchmarks were broken into section1,
section2, and section3, and creating the reports is trivial once the
scripts are set up.

I want a release personally too -- if anything, so that I have a
supposedly more stable codebase to work against if I want it.
According to the policy, we have to try to fix regressions as soon as
possible, if they appear. I think "bugfree" is a bit of a misnomer; it
should really be thought of as "regression-free", since no software is
truly bug-free.

One last thing: I think it would be helpful to include David's build
script for classpath, and if we're not putting one in for SableVM, at
least put in a note that says "the default configure options are:
--with-threading=inlined, etc. etc." We also need a note about gcc.

Cheers,

Chris
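Such a note might look something like the fragment below. Only
--with-threading=inlined and the gcc versions are taken from this
thread; everything else here is an assumption that would need to be
confirmed before going into the release notes:

```shell
# Illustrative build recipe only. --with-threading=inlined is the
# one default option mentioned in this thread; the compiler choice
# reflects the gcc-3.3.2 (staging) vs gcc-2.95.4 (1.0.9) note above.
CC=gcc-3.3.2 ./configure --with-threading=inlined
make
make install
```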