From: E. S. Venkatraman <venkat@bi...> - 2006-04-19 18:06:38
We are testing OpenSSI as the platform for our computational cluster.
It is based on an HP DL140 blade server with a total of 10 nodes. Each
blade has two 3.2 GHz Xeon processors and 2 GB of RAM. Currently there
are 7 nodes, with node 10 as the root node. The home directory is on an
NFS share. Because the root node uses software RAID on its two IDE
drives, we had to edit the initrd image to include the necessary
modules and scripts before the machine would boot.
We have faced some additional issues and would appreciate any assistance
in resolving them.
1. The NFS client appears to have trouble establishing a connection
with the server. Syslog has the following, repeated 5-6 times:
Apr 19 12:13:04 plbscc04 kernel: RPC: failed to contact portmap (errno -5).
Apr 19 12:13:04 plbscc04 kernel: lockd_up: makesock failed, error=-5
Apr 19 12:13:39 plbscc04 kernel: portmap: server localhost not responding, timed out
The portmapper and lockd do not seem to be running.
2. Some of the nodes go down and we can't figure out why. Syslog also
has the following messages:
Apr 15 15:42:59 plbscc05 kernel: OpenSSI: Node 4 has gone down!!!
Apr 15 15:43:04 plbscc05 kernel: Assertion failed! vma->vm_pgoff == pgoff, cluster/ssi/vproc/as_xscribe.c, as_do_vma, line=987
Is the node going down related to the vma assertion message?
3. Is there an updated 1.9.2 version of the Debian packages that will solve
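
As a rough aid for question 2, the following is a minimal Python sketch that scans syslog lines and pairs each "has gone down" message with any "Assertion failed" message logged on the same host within a short time window. The function names (parse, correlate), the 60-second default window, and the year 2006 (syslog timestamps carry no year) are assumptions made for illustration, not part of OpenSSI. Closeness in time only suggests a link between the two events; it does not prove one.

```python
import re
from datetime import datetime

# Classic syslog kernel line: "Apr 15 15:42:59 host kernel: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) kernel: (?P<msg>.*)$"
)


def parse(line, year=2006):
    """Return (timestamp, host, message), or None if the line does not match.

    The year is an assumption: these syslog timestamps do not include one.
    """
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["host"], m["msg"]


def correlate(lines, window_seconds=60):
    """Pair node-down events with assertion failures on the same host
    that were logged within window_seconds of each other."""
    events = [e for e in map(parse, lines) if e is not None]
    downs = [e for e in events if "has gone down" in e[2]]
    asserts = [e for e in events if e[2].startswith("Assertion failed")]
    pairs = []
    for down in downs:
        for failure in asserts:
            same_host = down[1] == failure[1]
            gap = abs((failure[0] - down[0]).total_seconds())
            if same_host and gap <= window_seconds:
                pairs.append((down, failure))
    return pairs


if __name__ == "__main__":
    sample = [
        "Apr 15 15:42:59 plbscc05 kernel: OpenSSI: Node 4 has gone down!!!",
        "Apr 15 15:43:04 plbscc05 kernel: Assertion failed! vma->vm_pgoff == "
        "pgoff, cluster/ssi/vproc/as_xscribe.c, as_do_vma, line=987",
    ]
    for down, failure in correlate(sample):
        gap = (failure[0] - down[0]).total_seconds()
        print(f"{down[1]}: '{down[2]}' and assertion are {gap:.0f}s apart")
```

Running this over the full syslog of each node would show whether the assertion consistently follows a node-down message or also appears on its own.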