From: Wilton W. <ww...@ha...> - 2002-09-26 07:10:42
On Wed, 25 Sep 2002, Stanley, Matthew D. wrote:

> I have been trying to make gigabit networking work in a 4 node cluster now
> for several weeks. We have been trying both the Scyld 27Z-8 release and the
> Clustermatic March 02 release with RH 7.2. Both situations produce similar
> results. Using Dlink DGE-500T and Intel PRO/1000 gigabit ethernet cards we
> have problems booting the network boot.img from the master server. I have

I have had nothing but trouble with the DGE-500Ts. We tracked the trouble
down to a problem with the Tyan 2466 motherboards we are using and to a
BIOS-related PCI interrupt issue, and we are working with Tyan to resolve it.
It's not really high priority, since we don't have the same problem with the
Intel e1000 cards and the cost difference is now negligible.

I will see if I can scrounge up a couple of e1000 cards sometime this week
and try to duplicate your problem in our setup here. I only have a few
questions:

Which e1000 drivers are you using?
What is your motherboard?
What switches are you using?

> I realize this means more administration work, but can I just install all
> four machines identical and then use bpslave instead of the beoboot system?
> If I do this, what modifications are required to the scripts (ie. node_up) to
> provide similar functionality. These clusters are just used for NAMD
> research.

In this case I believe all node_up really has to do is exit with a "0"
exit status, i.e.:

<SNIP>
#!/bin/sh
exit 0
</SNIP>

I don't think any other modification is necessary, though the bpslave file
caching code for loading libraries on demand may be a bit screwy. I haven't
tried this configuration.
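One way to sanity-check a stub like that before putting it in place is to run it by hand and look at the exit status. This is only a sketch: it writes the two-line stub to a scratch file created with mktemp (the scratch file is an assumption for illustration, not the real node_up location) and says nothing about how BProc itself invokes node_up.

```shell
#!/bin/sh
# Sketch: confirm that a do-nothing node_up stub reports success.
# The scratch file is hypothetical; it is not where BProc keeps node_up.
stub=$(mktemp)

# Write the same two-line stub shown above.
printf '#!/bin/sh\nexit 0\n' > "$stub"

# Run it and capture the exit status; 0 means success.
sh "$stub"
status=$?
echo "stub exit status: $status"

# Clean up the scratch file.
rm -f "$stub"
```

If the status printed is anything other than 0, the stub would be reporting a failure to whatever called it.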
> I read somewhere else on the list where Erik suggested using the mcastbcast
> ethX parameter to fix these issues but it didn't seem to solve mine unless I

The mcastbcast option turns packets that would normally be multicast (file
transfers in bproc) into broadcast packets. This is to accommodate switches
and other hardware that don't handle multicast packets quickly.

I have been pretty swamped this week doing other work for Hard Data Ltd.,
but soon I will have our patched and updated version of BProc ready for
download from our website or FTP server. I'll announce it on the list when I
put it up.

- Wilton

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue      Ph : 01-780-456-9771    High Performance UNIX
Edmonton, Alberta     FAX: 01-780-456-9772    and Linux Solutions
T5X 1Y3, Canada       URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]----