[Pyret-devel] Design comments

My name is Paul Larson and I work with James on the LTC Test team.  I've 
looked at the design document a little today and have a few 
questions/comments about it.

First, as Nigel already pointed out, NFS being called out specifically 
seemed a little odd.  Was this just an example or is there a particular 
reason why it needs to be NFS?

Is there a mechanism for specifying hardware requirements for the test 
to run correctly?  I talked to James about this and I get the impression 
now that part of defining the job includes limiting the job to run on a 
set of specified machines.  What would be very nice to see in the 
design is how a job is defined.  I assume you have some definition of 
what a job control file should look like?  It is probably short of a 
full-blown language, but it's worth designing this now and specifying 
how it should work so that you can poke holes in it and find 
shortcomings at design time.
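
To make the question concrete, here is a rough sketch of the kind of job
definition I have in mind.  Everything in it is hypothetical (the field
names, the Machine record, the can_run_on check); none of it comes from
the design document.  It only shows hardware requirements and a machine
restriction living together in one place.

```python
from dataclasses import dataclass, field


@dataclass
class MachineRequirements:
    """Hypothetical hardware constraints a job places on its host."""
    arch: str = "any"
    min_cpus: int = 1
    min_memory_mb: int = 0


@dataclass
class Machine:
    """Hypothetical description of one test machine."""
    name: str
    arch: str
    cpus: int
    memory_mb: int


@dataclass
class Job:
    """Hypothetical job definition: what to run and where it may run."""
    name: str
    build_cmd: str
    test_cmd: str
    requirements: MachineRequirements = field(default_factory=MachineRequirements)
    # An empty list means "any machine that meets the requirements".
    allowed_machines: list = field(default_factory=list)

    def can_run_on(self, machine: Machine) -> bool:
        """Return True if the machine is permitted and meets the requirements."""
        if self.allowed_machines and machine.name not in self.allowed_machines:
            return False
        req = self.requirements
        if req.arch != "any" and req.arch != machine.arch:
            return False
        return machine.cpus >= req.min_cpus and machine.memory_mb >= req.min_memory_mb


if __name__ == "__main__":
    job = Job(
        name="boot-test",
        build_cmd="make",
        test_cmd="./run-tests",
        requirements=MachineRequirements(arch="x86_64", min_cpus=2),
    )
    print(job.can_run_on(Machine("box1", "x86_64", 4, 8192)))
```

Writing even this much down raises questions worth settling at design
time, such as whether the machine list and the hardware requirements
should both exist or whether one of them is enough.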

How is the searchspace definition structured?

What is the LoadModules step and what happens at this step?

What logging is performed during the build/install/test process?  I 
would think it necessary to log both test status and patch status, for 
instance.  Let's stop here a second.  Let's say you have a set of 
changes in your searchspace: A, B, C, D, E, F, G.  You know that A-1 
passed and G+1 failed.  So you take D, and either pull the changeset or 
date, or apply the patch.  Now let's say it doesn't even build.  How do 
you treat that failure?  You probably want to either back off some or 
move forward some, because you probably pulled at a point that either 
had merge patches following it or broke the build.  You cannot say with 
certainty whether this would have passed the test or failed it, though, 
so this needs an additional state that classifies it as both unknown 
and unusable.
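
Here is a minimal sketch of what a search with that third state could
look like.  It is only an illustration, not how the tool works: the
names (Outcome, search, the run callback) are mine, and it assumes that
once a change breaks the test, every later change fails too.  When the
midpoint cannot be built, it is marked untestable and the nearest
untried neighbour is tried instead.

```python
from enum import Enum
from typing import Callable, List, NamedTuple, Optional


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNTESTABLE = "untestable"  # e.g. the tree did not build at this point


class SearchResult(NamedTuple):
    last_good: Optional[str]   # None means the known-good point before the list
    first_bad: Optional[str]   # None means the known-bad point after the list
    untestable: List[str]      # changes between the two that could not be judged


def search(changes: List[str], run: Callable[[str], Outcome]) -> SearchResult:
    """Narrow down where the failure starts, skipping untestable changes."""
    lo, hi = -1, len(changes)  # lo: last known pass, hi: first known fail
    skipped = set()
    while True:
        candidates = [i for i in range(lo + 1, hi) if i not in skipped]
        if not candidates:
            break
        mid = (lo + hi) // 2
        # Prefer the midpoint; otherwise back off or move forward to the
        # nearest change that has not been marked untestable.
        pick = min(candidates, key=lambda i: abs(i - mid))
        outcome = run(changes[pick])
        if outcome is Outcome.PASS:
            lo = pick
        elif outcome is Outcome.FAIL:
            hi = pick
        else:
            skipped.add(pick)
    return SearchResult(
        last_good=changes[lo] if lo >= 0 else None,
        first_bad=changes[hi] if hi < len(changes) else None,
        untestable=changes[lo + 1:hi],
    )


if __name__ == "__main__":
    def run(change: str) -> Outcome:
        # Pretend D does not build and E introduced the failure.
        if change == "D":
            return Outcome.UNTESTABLE
        return Outcome.PASS if change < "E" else Outcome.FAIL

    print(search(list("ABCDEFG"), run))
```

Note that the answer is no longer a single change.  If the untestable
changes sit right next to the boundary, the best the tool can report is
a small range of suspects, and the log should say so.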

It may also be necessary to log some other form of remote output.  In 
kernel testing, this often takes the form of output over a serial line, 
or from another ip:port that is redirecting serial output from the box.  
This comes with the additional baggage of needing test specifications 
that cover not only a 0 or non-zero result, but also the output 
expected over this other line.  For instance, if your test crashes the 
machine, then there's probably something in the debug output, or at 
least a loss of heartbeat to the machine, that can tell you it failed.  
Perhaps this goes beyond the scope of what you are trying to 
accomplish.  If so, then the design should state explicitly that the 
tool is only intended for hunting non-fatal errors.  Searching for 
fatal errors is certainly more complicated to support, but 
significantly more useful as well.
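
As a rough illustration of what such a test specification has to
combine, here is a sketch that folds the three signals into one
verdict.  The function, the Verdict names, and the pattern list are my
own assumptions for the example; the patterns are a few strings
commonly seen in Linux kernel console output, not a complete list.

```python
import re
from enum import Enum
from typing import Optional


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"      # test ran to completion and reported failure
    CRASH = "crash"    # machine died or logged a fatal error


# Illustrative patterns only; a real list would be configurable per test.
FATAL_PATTERNS = [re.compile(p) for p in (r"Kernel panic", r"\bOops\b", r"BUG:")]


def classify(exit_code: Optional[int], console_output: str, heartbeat_alive: bool) -> Verdict:
    """Combine exit status, remote console output, and heartbeat into a verdict.

    exit_code is None when the test never reported a result.
    """
    if not heartbeat_alive:
        return Verdict.CRASH
    if any(p.search(console_output) for p in FATAL_PATTERNS):
        return Verdict.CRASH
    if exit_code is None:
        return Verdict.CRASH
    return Verdict.PASS if exit_code == 0 else Verdict.FAIL


if __name__ == "__main__":
    print(classify(0, "boot ok\n", True))
    print(classify(None, "Kernel panic - not syncing\n", False))
```

The point is that a zero exit status alone is not enough to call a run
a pass once the serial output and heartbeat are part of the picture.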

Thanks,
Paul Larson