From: SourceForge.net <no...@so...> - 2003-11-26 02:28:34
Bugs item #845971, was opened at 2003-11-20 12:38
Message generated for change (Comment added) made by muglerj
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=845971&group_id=9368

Category: Installation
Group: 2.4
>Status: Closed
>Resolution: Fixed
Priority: 9
Submitted By: Jason Brechin (brechin)
Assigned to: John (muglerj)
Summary: LAM installed after initial install fails tests

Initial Comment:
I installed the cluster w/o LAM. MPICH was default MPI. I installed LAM, and selected it to be the default MPI. Now, when I run the tests, I get a failure. It seems like it's looking for GM. Unfortunately, VMWare doesn't have a virtual GM device yet ;)

Here are lamtest.out and .err:

[root@vm-rh90 lam]# cat lamtest.out
Running LAM/MPI test
--> MPI C bindings test: TEST FAILED!
Commands: mpicc cpi.c -o lam-cpi && mpirun C lam-cpi && lamclean

LAM 7.0/MPI 2 C++/ROMIO - Indiana University

[root@vm-rh90 lam]# cat lamtest.err
lamboot: error while loading shared libraries: libgm.so.0: cannot open shared object file: No such file or directory
/usr/bin/ld: cannot find -lgm
collect2: ld returned 1 exit status
mpicc: No such file or directory
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host oscarnode1.ncsa.uiuc.edu.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamhalt" command.

Please run the "lamboot" command the start the LAM/MPI runtime environment.
See the LAM/MPI documentation for how to invoke "lamboot" across multiple machines.
-----------------------------------------------------------------------------

----------------------------------------------------------------------

>Comment By: John (muglerj)
Date: 2003-11-25 21:28

Message:
Logged In: YES
user_id=505737

As per RM instructions i'm closing this out. See also PackageInUn.pm 1.20 for fixes that were made to implement this.
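
The error-reporting flow agreed for 3.0 (Jeff Squyres' comment of 2003-11-25 14:33 in this thread) can be sketched roughly as follows. This is an illustration in Python, not the Perl code in PackageInUn.pm; `ErrorLog`, `run_install`, and the step callables are hypothetical names invented for the example.

```python
import sys


class ErrorLog:
    """Collects install errors so they can be reported more than once."""

    def __init__(self, stream=sys.stderr):
        self.stream = stream
        self.messages = []

    def record(self, message):
        # 1) Print the error at the time it occurs, and remember it.
        print(f"Error: {message}", file=self.stream)
        self.messages.append(message)

    def summarize(self, heading):
        # Re-print everything accumulated so far; does nothing if no errors.
        if not self.messages:
            return
        print(f"--- {heading} ---", file=self.stream)
        for message in self.messages:
            print(f"Error: {message}", file=self.stream)


def run_install(install_steps, post_install_steps, log):
    """Run every step, reporting failures without skipping the post installs.

    Each step is a (name, callable) pair; a callable signals failure by
    raising an exception. Returns True only if nothing failed.
    """
    for name, step in install_steps:
        try:
            step()
        except Exception as exc:
            log.record(f"{name}: {exc}")

    # 2) Show accumulated errors before the post installs start.
    log.summarize("errors before post_install")

    # Still run all post-install steps so the system ends in a known state.
    for name, step in post_install_steps:
        try:
            step()
        except Exception as exc:
            log.record(f"post_install {name}: {exc}")

    # 3) Show them again once the whole process is done.
    log.summarize("errors after install")
    return not log.messages
```

With this shape, a failed image install is reported three times (when it happens, before the post installs, and at the very end), and the post-install steps for the other packages still run.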
----------------------------------------------------------------------

Comment By: Jeff Squyres (jsquyres)
Date: 2003-11-25 14:33

Message:
Logged In: YES
user_id=11722

Per discussion on the call today, for 3.0, let's do the following:

- print out an error message at the time that the error occurs
- print out any accumulated error messages before running the post installs
- print out any accumulated error messages *again* after the whole process is done

The intent here is 1) ensure that the user *knows* that an error occurred, and 2) still run all the post install scripts so that we try to leave OSCAR in a known state (rather than a printf saying "we failed; please go run all the post install scripts").

----------------------------------------------------------------------

Comment By: Jeff Squyres (jsquyres)
Date: 2003-11-23 08:20

Message:
Logged In: YES
user_id=11722

So how should we mark this bug?

I'm guessing that this specific issue will become moot as more of the OSCAR 4 architecture is implemented. The same issue may exist (what to do if install on the image fails), but probably in a very different way.

----------------------------------------------------------------------

Comment By: John (muglerj)
Date: 2003-11-20 16:57

Message:
Logged In: YES
user_id=505737

Ok, here's your current situation:

1. the install to the clients succeeded.
2. the install to the server succeeded.
3. the install to your image failed.
4. the post_install script never ran for lam.
5. the installed bit is still set to 0, and the system thinks the package is not installed. I'm not sure about the "to_be_installed" bit.
6. You will not be allowed to run the uninstall scripts from the GUI.

If we fail on an install action (we try to detect everything), we immediately fail gracefully and print a decent debug statement. There is no attempt at system cleanup, currently.
To fix this, we need to go to a system that can operate on the different chunks of the system, retain enough information to know what is installed and where, and automatically do the right thing. This will require more database mods, more GUI mods...basically just more of everything.

To fix things (get lam installed) in your situation, you have 2 options (this is after you fix the disk issue on your VM):

1. clean the cruft out of /tmp on the image, manually run the uninstall scripts for lam...and then retry thru the GUI...preferred.
2. manually finish installing to the image, set the installed bit by hand, rerun all packages' post_install scripts by hand.

----------------------------------------------------------------------

Comment By: Jason Brechin (brechin)
Date: 2003-11-20 15:09

Message:
Logged In: YES
user_id=274641

Not a VMWare problem, per se, but it is due to a rather small disk size I've allowed it. At that point, is it not doing something? I would imagine it should either totally fail, or at least leave the server and clients in a working state.

----------------------------------------------------------------------

Comment By: John (muglerj)
Date: 2003-11-20 15:03

Message:
Logged In: YES
user_id=505737

Looks like the rpm -Uvh command failed on the image. Here's the actual error message I dug out of the blob:

'installing package lam-with-gm-oscar-7.0-2 needs 8MB on the / filesystem'

Is this a vmware problem?

I tried a response via normal mail earlier...sourceforge would not let me log in until afternoon.

----------------------------------------------------------------------

Comment By: Jason Brechin (brechin)
Date: 2003-11-20 12:44

Message:
Logged In: YES
user_id=274641

Upon further investigation, installed and should_be_installed were not set properly after the whole ordeal... I guess it didn't install properly?

As a side note... I wish Neil would have fixed those "uninitialized" errors.

Xlib: extension "GLX" missing on display "localhost:10.0".
Xlib: extension "GLX" missing on display "localhost:10.0".
--> PLEASE WAIT! Bring up the Configurator...

=============================================================================
== Running step 2 of the OSCAR wizard: Configure selected OSCAR packages
=============================================================================

--> About to run /opt/oscar/packages/kernel_picker/scripts/pre_configure for kernel_picker
warning: /tftpboot/rpm/redhat-release-9-3.i386.rpm: V3 DSA signature: NOKEY, key ID db42a60e
[OSCAR::PackageBest :: Line 407] Reading package directory
[OSCAR::PackageBest :: Line 419] Reading cache file.
[OSCAR::PackageBest :: Line 432] Comparing cache to directory.
[OSCAR::PackageBest :: Line 457] Writing new cache file.
62750 blocks
65913 blocks
--> About to run /opt/oscar/packages/switcher/scripts/pre_configure for switcher
--> About to run /opt/oscar/packages/switcher/scripts/post_configure for switcher
Setting default for tag mpi ("lam-7.0")
Attribute successfully set; new attribute setting will be effective for future shells
--> Step 2: Completed successfully
executing:/opt/c3-4/cexec --pipe c3cmd-filter hostname

=============================================================================
== Running OSCAR package install
=============================================================================

Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
--> Running install on clients
RPMS that need to be installed:
 - /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm
 - /tftpboot/rpm/libaio-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm
RPMS that will be installed:
 - /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm
 - /tftpboot/rpm/libaio-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm
executing:/opt/c3-4/cexec --pipe c3cmd-filter mkdir -p /tmp/tmpinstallrpm/
executing:/opt/c3-4/cpush /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm /tmp/tmpinstallrpm/
executing:/opt/c3-4/cpush /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm /tmp/tmpinstallrpm/
executing:/opt/c3-4/cpush /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm /tmp/tmpinstallrpm/
executing:/opt/c3-4/cpush /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm /tmp/tmpinstallrpm/
executing:/opt/c3-4/cpush /tftpboot/rpm/libaio-0.3.93-4.i386.rpm /tmp/tmpinstallrpm/
executing:/opt/c3-4/cpush /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm /tmp/tmpinstallrpm/
executing:/opt/c3-4/cexec --pipe c3cmd-filter rpm -Uvh /tmp/tmpinstallrpm/libaio-devel-0.3.93-4.i386.rpm /tmp/tmpinstallrpm/lam-with-gm-oscar-module-7.0-2.i586.rpm /tmp/tmpinstallrpm/lam-oscar-module-7.0-2.i586.rpm /tmp/tmpinstallrpm/lam-with-gm-oscar-7.0-2.i586.rpm /tmp/tmpinstallrpm/libaio-0.3.93-4.i386.rpm /tmp/tmpinstallrpm/lam-oscar-7.0-2.i586.rpm
executing:/opt/c3-4/cexec --pipe c3cmd-filter rm -rf /tmp/tmpinstallrpm/
finished with rpms
--> Completed install on clients
--> Starting run_install_server
RPMS that need to be installed:
 - /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm
 - /tftpboot/rpm/libaio-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm
RPMS that will be installed:
 - /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm
 - /tftpboot/rpm/libaio-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm
executing:rpm -Uvh /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm /tftpboot/rpm/libaio-0.3.93-4.i386.rpm /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm
finished with rpms
executed post_server_install phase on server
executed post_clients phase on server
--> Completed run_install_server
--> Starting run_install_image.
RPMS that need to be installed:
 - /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm
 - /tftpboot/rpm/libaio-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm
RPMS that will be installed:
 - /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm
 - /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm
 - /tftpboot/rpm/libaio-0.3.93-4.i386.rpm
 - /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm
Use of uninitialized value in concatenation (.) or string at /opt/oscar/lib/OSCAR/PackageInUn.pm line 619.
executing:/bin/mkdir -p /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/
executing:/bin/cp /tftpboot/rpm/libaio-devel-0.3.93-4.i386.rpm /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-module-7.0-2.i586.rpm /opt/oscar/packages/lam/RPMS/lam-oscar-module-7.0-2.i586.rpm /opt/oscar/packages/lam/RPMS/lam-with-gm-oscar-7.0-2.i586.rpm /tftpboot/rpm/libaio-0.3.93-4.i386.rpm /opt/oscar/packages/lam/RPMS/lam-oscar-7.0-2.i586.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/
executing:/bin/rpm -Uvh --root /var/lib/systemimager/images/oscarimage /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/libaio-devel-0.3.93-4.i386.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/lam-with-gm-oscar-module-7.0-2.i586.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/lam-oscar-module-7.0-2.i586.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/lam-with-gm-oscar-7.0-2.i586.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/libaio-0.3.93-4.i386.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/lam-oscar-7.0-2.i586.rpm
Error executing:/bin/rpm -Uvh --root /var/lib/systemimager/images/oscarimage /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/libaio-devel-0.3.93-4.i386.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/lam-with-gm-oscar-module-7.0-2.i586.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/lam-oscar-module-7.0-2.i586.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/lam-with-gm-oscar-7.0-2.i586.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/libaio-0.3.93-4.i386.rpm /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/lam-oscar-7.0-2.i586.rpm :warning: /var/lib/systemimager/images/oscarimage/tmp/tmpinstallrpm/libaio-devel-0.3.93-4.i386.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing... ##################################################
installing package lam-with-gm-oscar-7.0-2 needs 8MB on the / filesystem
Error: cannot install to the image:1
Error: package (lam) failed to install.
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
--> About to run /opt/oscar/packages/oda/scripts/post_install for oda
generating the /etc/odaserver file on all oscar clients
. /etc/profile.d/c3.sh && cexec 'echo oscar_server > /etc/odaserver'
************************* oscar_cluster *************************
--------- oscarnode1.ncsa.uiuc.edu---------
executed post_install phase on server for package: (oda)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
--> About to run /opt/oscar/packages/switcher/scripts/post_install for switcher
Setting default for tag mpi ("lam-7.0")
Attribute successfully set; new attribute setting will be effective for future shells
building file list ... done
switcher.ini
wrote 239 bytes read 42 bytes 80.29 bytes/sec
total size is 189 speedup is 0.67
executed post_install phase on server for package: (switcher)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (sis)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (pvm)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
--> About to run /opt/oscar/packages/pfilter/scripts/post_install for pfilter
(re)starting the pfilter firewall service on this server
/etc/init.d/pfilter restart
Restarting pfilter: [ OK ]
pushing out the clients pfilter firewall configuration file
. /etc/profile.d/c3.sh && cpush /etc/pfilter.conf.clients /etc/pfilter.conf
building file list ... done
wrote 59 bytes read 20 bytes 22.57 bytes/sec
total size is 867 speedup is 10.97
(re)starting the pfilter firewall service on the clients
. /etc/profile.d/c3.sh && cexec /etc/init.d/pfilter restart
************************* oscar_cluster *************************
--------- oscarnode1.ncsa.uiuc.edu---------
Restarting pfilter:[ OK ]
executed post_install phase on server for package: (pfilter)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (perl-Qt)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
--> About to run /opt/oscar/packages/pbs/scripts/post_install for pbs
PBS mom config file updated with clienthost: vm-rh90.ncsa.uiuc.edu
Pushing config file to clients...
building file list ... done
config
wrote 79 bytes read 42 bytes 34.57 bytes/sec
total size is 123 speedup is 1.02
Sending SIGHUP to all moms...
************************* oscar_cluster *************************
--------- oscarnode1.ncsa.uiuc.edu---------
Shutting down PBS Server: [ OK ]
Starting PBS Server: [ OK ]
Updating pbs_server nodes
set node oscarnode1.ncsa.uiuc.edu np = 1
Creating pbs workq queue...
Max open servers: 4
set queue workq resources_max.ncpus = 1
set queue workq resources_max.nodect = 1
set queue workq resources_available.nodect = 1
set server resources_available.ncpus = 1
set server resources_available.nodect = 1
set server resources_available.nodes = 1
set server resources_max.ncpus = 1
set server resources_max.nodes = 1
set server scheduler_iteration = 60
Shutting down MAUI Scheduler: [ OK ]
Starting MAUI Scheduler: [ OK ]
executed post_install phase on server for package: (pbs)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
--> About to run /opt/oscar/packages/opium/scripts/post_install for opium
building file list ... done
gshadow
wrote 80 bytes read 42 bytes 34.86 bytes/sec
total size is 474 speedup is 3.89
building file list ... done
wrote 51 bytes read 20 bytes 28.40 bytes/sec
total size is 189 speedup is 2.66
building file list ... done
passwd
wrote 416 bytes read 54 bytes 188.00 bytes/sec
total size is 1505 speedup is 3.20
building file list ... done
group
wrote 383 bytes read 42 bytes 170.00 bytes/sec
total size is 578 speedup is 1.36
building file list ... done
shadow
wrote 336 bytes read 48 bytes 153.60 bytes/sec
total size is 932 speedup is 2.43
executed post_install phase on server for package: (opium)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
--> About to run /opt/oscar/packages/ntpconfig/scripts/post_install for ntpconfig
************************* oscar_cluster *************************
--------- oscarnode1.ncsa.uiuc.edu---------
Shutting down ntpd: [ OK ]
ntpd: Synchronizing with time server: [ OK ]
Starting ntpd: [ OK ]
executed post_install phase on server for package: (ntpconfig)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (networking)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (mpich)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (maui)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
--> About to run /opt/oscar/packages/loghost/scripts/post_install for loghost
************************* oscar_cluster *************************
--------- oscarnode1.ncsa.uiuc.edu---------
oscar_loghost already set
executed post_install phase on server for package: (loghost)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (kernel_picker)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (hdf5)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (disable-services)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (c3)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.
executed post_install phase on server for package: (base)
Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/oda.pm line 2930.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=845971&group_id=9368
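
The image install in the log above failed because rpm reported "needs 8MB on the / filesystem". A rough pre-flight check along the following lines might catch that before rpm is run against the image. This is only a sketch: `free_bytes`, `check_space`, and `safety_factor` are hypothetical names that do not come from OSCAR, and scaling the package file sizes by a factor is a crude assumption standing in for the real installed size, which rpm works out itself.

```python
import os
import shutil


def free_bytes(path):
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def check_space(image_root, rpm_paths, safety_factor=3.0):
    """Estimate whether `rpm_paths` will fit under `image_root`.

    Returns (ok, needed, available). `needed` is the summed size of the
    package files times `safety_factor`; the factor is a guess, since
    unpacked packages usually take more room than the .rpm files.
    """
    package_bytes = sum(os.path.getsize(p) for p in rpm_paths)
    needed = int(package_bytes * safety_factor)
    available = free_bytes(image_root)
    return available >= needed, needed, available
```

For the case in this report, `image_root` would be /var/lib/systemimager/images/oscarimage and `rpm_paths` the six packages listed under "RPMS that will be installed"; a caller could refuse to start the image install, and report why, when `ok` is false.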