The Regression Server

This page contains information relevant to the ODTBX regression testing server.

Machine Information

TBD

Network Issues

Eric Grejda summarized the network issues relevant to the ODTBX regression testing server in Mantis issue 356 on Mar 15, 2011. That summary is copied here:

Several issues can each, on their own, impair the operation of the regression testing mechanism.

The first is the limited number of MATLAB licenses in general (and of Statistics Toolbox licenses in particular). Every user running MATLAB removes a license from the communal pool. A lack of licenses can delay the completion of regression testing for several hours, though this is mitigated somewhat by the time of day at which the tests are executed. Conversely, regression testing can lock other users out of MATLAB until the end of the run.
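One way to ride out a temporarily empty license pool is to poll for a license with a deadline instead of failing at the first refusal. The sketch below is only an illustration of that idea, not the server's actual mechanism: wait_for_license and its try_checkout argument are hypothetical names, and the caller must supply whatever check actually attempts to obtain a license. The default timeout and polling interval are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_for_license(try_checkout: Callable[[], bool],
                     timeout_s: float = 4 * 3600,
                     poll_s: float = 300,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll try_checkout() until it returns True or timeout_s has elapsed.

    try_checkout is a hypothetical, caller-supplied function that attempts
    to obtain a license and reports success. sleep and clock are injectable
    so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if try_checkout():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # gave up: the pool stayed empty for the whole window
        # Never sleep past the deadline.
        sleep(min(poll_s, remaining))
```

A bounded wait like this lets the nightly run report "no license available" explicitly rather than hanging for an unknown length of time.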

Another possible problem is a network connectivity failure somewhere inside the Building 11 network (as happened on the night of March 14, 2011). If the network is down or degraded, no one will be able to contact the testing server, the server will not be able to transmit its nightly status reports, and the server may not be able to obtain MATLAB licenses, which would hold up regression testing until connectivity is restored.
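A quick pre-flight check before the nightly run can tell a network outage apart from a genuine test failure. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the hosts and ports to probe (license server, mail relay, and so on) are not given on this page, so the caller has to supply them.

```python
import socket
from typing import Dict, List, Tuple


def can_connect(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name lookups.
        return False


def unreachable(targets: Dict[str, Tuple[str, int]],
                timeout_s: float = 5.0) -> List[str]:
    """Return the labels of all targets that could not be reached."""
    return [label for label, (host, port) in targets.items()
            if not can_connect(host, port, timeout_s)]
```

Calling unreachable() with a dictionary such as {"mail relay": (relay_host, 25)} returns an empty list when every dependency answers; port 25 here is only the conventional SMTP port, not a value confirmed by this page.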

The ODTBX regression testing server is not part of the FFTB network, though it is kept in the FFTB lab.

The mail transfer system of the regression testing server relies upon the FFTB collaboration server to relay e-mail (test status reports); the collaboration server then relays mail through a pair of systems run by GSFC itself (collectively referred to as mailhost.gsfc.nasa.gov). It isn't possible to send mail out of the Goddard network without going through the mailhosts. If the collaboration server is unavailable for any reason (say, the firewall is misconfigured or the FFTB network has been disconnected), status reports will not go out. This can be worked around by configuring the testing server to relay through mailhost directly.
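The workaround described above amounts to trying the usual relay first and falling back to mailhost if it is unavailable. On the real server this would be a mail transfer agent setting; the sketch below only illustrates the fallback logic with Python's smtplib. The function name is hypothetical, port 25 is an assumption, and the collaboration server's hostname is not given on this page, so the caller must pass the relay list.

```python
import smtplib
from email.message import EmailMessage
from typing import Callable, Optional, Sequence


def send_via_first_available(msg: EmailMessage,
                             relays: Sequence[str],
                             smtp_factory: Callable[..., smtplib.SMTP] = smtplib.SMTP,
                             timeout_s: float = 30.0) -> str:
    """Try each relay in order and return the hostname that accepted msg.

    smtp_factory is injectable so the fallback logic can be tested
    without a live mail server.
    """
    last_error: Optional[Exception] = None
    for host in relays:
        try:
            # Port 25 is assumed here; the page does not state the port.
            with smtp_factory(host, 25, timeout=timeout_s) as smtp:
                smtp.send_message(msg)
            return host
        except (OSError, smtplib.SMTPException) as exc:
            last_error = exc  # remember the failure and try the next relay
    raise RuntimeError(f"no relay accepted the message: {list(relays)}") from last_error
```

With a relay list ending in mailhost.gsfc.nasa.gov, a report would still be delivered when the first relay is down, at the cost of bypassing the normal mail path.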

The testing server uses the Goddard DNS (Domain Name System) infrastructure to convert hostnames into IP addresses, which are used internally. If the Goddard DNS servers are acting up (say, one or more of them are down or overloaded), resolution requests will time out, and anything that depends on a lookup, including sending status reports, will stall. Unfortunately, the Goddard DNS servers are a little flaky and their zone files need maintenance: for example, mailhost.gsfc.nasa.gov is a CNAME for three machines (mailhost1, mailhost2, and mailhost3), but mailhost3 doesn't exist. Because of how round-robin CNAME-to-A record resolution works, roughly every third attempt to resolve mailhost.gsfc.nasa.gov fails. This can be worked around somewhat by hardcoding IP address/hostname mappings in /etc/hosts on the test server, but if anything changes outside of our sphere of influence (for example, GSFC changes the IP address of one of the mailhosts), I'll have to track down what changed and fix it. Best practice is to avoid hardcoding addresses unless there is no other option.
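An alternative to hardcoding addresses is to retry a failed lookup a few times, on the assumption that a later attempt will land on one of the records that does resolve. The sketch below uses the standard socket module; the function names, attempt count, and delay are illustrative assumptions, and local resolver caching may reduce how much a retry helps in practice.

```python
import socket
import time
from typing import Callable, List, Optional


def system_resolver(hostname: str) -> List[str]:
    """Resolve hostname with the OS resolver; return sorted unique addresses."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def resolve_with_retry(hostname: str,
                       attempts: int = 3,
                       delay_s: float = 2.0,
                       resolver: Callable[[str], List[str]] = system_resolver,
                       sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Call resolver up to `attempts` times, pausing delay_s between failures."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return resolver(hostname)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay_s)
    raise RuntimeError(f"could not resolve {hostname!r} after {attempts} attempts") from last_error
```

Unlike an /etc/hosts entry, this keeps following whatever the DNS currently says, so an address change on the GSFC side would not require a manual fix on the test server.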


Navigation

[ForDevelopers] - Go back to the main development page

WikiStart - Go back to OD Toolbox Home


