From: Larry B. <ba...@us...> - 2002-06-22 01:06:06
I am a newcomer to Linux and Linux clusters, and I know just enough system admin commands to get a workstation up and running on our network. I'm more interested in computing platforms to run our Fortran and C programs faster, which is how I found Clustermatic.

I installed Clustermatic to experiment with a small Beowulf cluster. My master node is an old 233 MHz PII with 64MB. I installed a 100GB disk and partitioned it with 4GB for / (includes /boot, /usr, and /opt), 1GB for /var, 2GB for swap, and the rest (~90 GB) for /home. I have 2 slave nodes: each is a 1.8 GHz P4-A with 1GB DDR. The slaves boot from floppy and have no local hard disk (i.e., no swap); all disk access is to NFS exports from the master (/bin->/bin(ro), /home->/home(rw), /opt->/opt(ro), /sbin->/sbin(ro), /usr->/usr(ro), and /var->/var/node.#(rw) in /etc/beowulf/fstab, and I rmdir then soft-link /tmp->/var/tmp and /scratch->/home/node.# in /etc/beowulf/node_up).

I installed the Intel C++ and Fortran compilers on the master and successfully ran one of my (uniprocessor) benchmarks. I also installed Intel's Math Kernel Library. This is where I ran into trouble. Intel's BLAS test suite runs fine on the master (using the default Pentium Pro target architecture). But when I try to run the same test suite on one of the slaves, I get a compilation error:

    ld: warning: libpthread.so.0, needed by /opt/intel/mkl//lib/32/libguide.so, not found (try using -rpath or -rpath-link)

I looked for /lib/libpthread.so.0 on the master node using "ls /lib", and it's there. But when I looked for it on slave node 0 (or 1) using "bpsh 0 ls /lib", it's not there. "/usr/sbin/bplib -l" shows /lib/libpthread.so.0 in the library list, but it apparently is not accessible.

My /etc/beowulf/config says that the libraries in /lib and /usr/lib are supposed to be "automagically" made available on the slaves:

    libraries /lib /usr/lib

But it doesn't explain how.
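
For reference, the comparison between the bplib library list and what is actually present in /lib on a node can be sketched as a small shell helper. This is only a sketch under assumptions: it assumes "/usr/sbin/bplib -l" prints one library path per line, and the function name missing_libs and the temporary file names are hypothetical, not part of Clustermatic.

```shell
#!/bin/sh
# missing_libs EXPECTED ACTUAL DIR   (hypothetical helper, not part of Clustermatic)
#   EXPECTED: file with one library path per line, e.g. the saved output of
#             "/usr/sbin/bplib -l" (one-path-per-line format is an assumption)
#   ACTUAL:   file with one file name per line, e.g. the saved output of
#             "bpsh 0 ls /lib"
#   DIR:      directory to check, e.g. /lib
# Prints each path in EXPECTED that lies under DIR but whose base name
# does not appear in ACTUAL.
missing_libs() {
    expected=$1
    actual=$2
    dir=$3
    while IFS= read -r path; do
        # Skip entries that are not under the directory being checked.
        case $path in
            "$dir"/*) ;;
            *) continue ;;
        esac
        name=${path##*/}    # strip the directory part, keep the file name
        # -x: match the whole line, -F: literal string, -q: no output
        if ! grep -qxF -- "$name" "$actual"; then
            printf '%s\n' "$path"
        fi
    done < "$expected"
}

# Possible use on the master node (file names are placeholders):
#   /usr/sbin/bplib -l > /tmp/bplib.txt
#   bpsh 0 ls /lib     > /tmp/node0_lib.txt
#   missing_libs /tmp/bplib.txt /tmp/node0_lib.txt /lib
```

Any path it prints is in the library list but not visible on that node, which is the situation described for /lib/libpthread.so.0.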
I don't know why some libraries are in the /lib and /usr/lib directories on the slaves and others are not. I found /usr/lib/beoboot/bin/setup_libs, which looks like it copies libraries to the slaves, but it is commented out in /usr/lib/beoboot/bin/node_up (is it obsolete?). The beoboot kit in the Clustermatic tarballs directory has an rc.beowulf which looks like it does similar things, but I could not find an rc.beowulf anywhere else on the master node, so I don't think it is called either.

Could someone please explain how and which libraries get copied to the slave nodes, and what I have to do to get libpthread.so.0 (and, maybe more later) included? Are all the libraries in the bplib list supposed to be there? Is there a problem if the bplib list does not match the libraries on the slaves? Could I have run out of RAM disk? (Which brings up another question: how do I tell how much RAM disk each slave has and how full it is?)

Thanks in advance for your help,

Larry Baker
US Geological Survey

P.S. My benchmark ran 16% faster on my 1.8A P4 with PC2100 DDR (VIA chipset) than on my 667 MHz Tru64 Unix Alpha DS20E.