On sles11.1 x86_64 with the new xCAT 2.7.1 build, genimage failed with the following error:
Retrieving package mlnx-ofa_kernel-kmp-default-1.5.3_2.6.32.12_0.7-OFED.1.5.3.3.0.0.sles11sp1.x86_64 (376/382), 10.6 MiB (60.4 MiB unpacked)
Installing: mlnx-ofa_kernel-kmp-default-1.5.3_2.6.32.12_0.7-OFED.1.5.3.3.0.0.sles11sp1 [.......error]
Installation of mlnx-ofa_kernel-kmp-default-1.5.3_2.6.32.12_0.7-OFED.1.5.3.3.0.0.sles11sp1 failed:
(with --nodeps --force) Error: Subprocess failed. Error: RPM failed:
Kernel image: /boot/vmlinuz-2.6.32.12-0.7-default
Initrd image: /boot/initrd-2.6.32.12-0.7-default
node name not found
Root device (/dev/sda2) not found
error: %post(mlnx-ofa_kernel-kmp-default-1.5.3_2.6.32.12_0.7-OFED.1.5.3.3.0.0.sles11sp1.x86_64) scriptlet failed, exit status 1
Abort, retry, ignore? [a/r/i] (a): a
Problem occured during or after installation or removal of packages:
Installation aborted by user
Please see the above error message for a hint.
TARGETS = halt dbus earlysyslog random sciv10 reboot haldaemon network boot.clock syslog splash_early rpcbind nfs splash network-remotefs gmond sshd single postfix gpfs gettyset cron xcatpostinit
installroot=/install/netboot/sles11.1/x86_64/compute/rootimg
ofeddir=/install/post/otherpkgs/sles11.1/x86_64/ofed/
NODESETSTATE=genimage
/install/postscripts/mlnxofed_ib_install
OS=`uname`
uname
++ uname
OFED_DIR=$ofeddir
if [ -z "$OFED_DIR" ]; then
    # try to default
    OFED_DIR=$INSTALL_DIR/post/otherpkgs/$OSVER/$ARCH/ofed
fi
if [ $NODESETSTATE != "genimage" ]; then
    # running as a postscript in a full-disk install or AIX diskless install
    installroot=""
fi
if [ $OS != "AIX" ]; then
    if [ $NODESETSTATE == "install" ] || [ $NODESETSTATE == "boot" ]; then
        # Being run from a stateful install postscript
        # Copy rpms directly from the xCAT management node and install
        mkdir -p /tmp/ofed
        rm -f -R /tmp/ofed/*
        cd /tmp/ofed
        download_dir=`echo $OFED_DIR | cut -d '/' -f3-`
        wget -l inf -N -r --waitretry=10 --random-wait --retry-connrefused -t 10 -T 60 -nH --cut-dirs=5 ftp://$SITEMASTER/$download_dir/ 2> /tmp/wget.log
        #rpm -Uvh --force libibverbs-devel*.rpm
        perl -x mlnxofedinstall --without-32bit --force
        rm -Rf /tmp/ofed
    fi
    if [ $NODESETSTATE == "genimage" ]; then
        # Being called from <image>.postinstall script
        # Assume we are on the same machine
        #if [[ $OS = sles* ]] || [[ $OS = suse* ]] || [[ -f /etc/SuSE-release ]]; then
        # For SLES, assume zypper is available on the system running genimage
        mkdir $installroot/tmp/ofed_install
        cp -r $OFED_DIR $installroot/tmp/ofed_install/
        chroot $installroot perl -x /tmp/ofed_install/ofed/mlnxofedinstall --without-32bit --force
        rm -rf $installroot/tmp/ofed_install
        #fi
    fi
fi
'[' Linux '!=' AIX ']'
'[' genimage == install ']'
'[' genimage == boot ']'
'[' genimage == genimage ']'
mkdir /install/netboot/sles11.1/x86_64/compute/rootimg/tmp/ofed_install
cp -r /install/post/otherpkgs/sles11.1/x86_64/ofed/ /install/netboot/sles11.1/x86_64/compute/rootimg/tmp/ofed_install/
chroot /install/netboot/sles11.1/x86_64/compute/rootimg perl -x /tmp/ofed_install/ofed/mlnxofedinstall --without-32bit --force
df: Warning: cannot read table of mounted file systems
This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, or Distribution IB packages will be removed.
Failed to uninstall ofa_kernel KMP RPMs
Hi Cao Li,
Would you please verify this using the /opt/xcat/share/xcat/ib/scripts/Mellanox/mlnxofed_ib_install script and /opt/xcat/share/xcat/ib/netboot/sles/ib.sles11.1.x86_64.pkglist from the new xCAT 2.7.1 build, instead of the old scripts and old configuration in your environment?
Please also refer to the documentation:
https://sourceforge.net/apps/mediawiki/xcat/index.php?title=Managing_the_Mellanox_Infiniband_Network#Mellanox_IB_Interface_Configuration
If you run into any problems, please let me know.
Thanks.
I used the scripts and pkglist from the new xCAT 2.7.1 build, and genimage/packimage ran successfully.
However, when I re-ran genimage on top of the last successful rootimg, the Mellanox script could not uninstall the package mlnx-ofa_kernel-kmp-default.
I will contact the IB team. In the meantime, there is a workaround: before running genimage, clean up /install/netboot/sles11.1/x86_64/compute/rootimg.
Thanks.
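The cleanup workaround above could be scripted roughly as follows. This is only a sketch: the function name clean_rootimg and its safety check are hypothetical and not part of xCAT; the rootimg path in the usage comment is the one from the log.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): remove a stale rootimg directory
# so that the next genimage run starts from an empty image tree.
clean_rootimg() {
    rootimg=$1
    # Safety check (an assumption for this sketch): only accept a path
    # whose last component is "rootimg", with no trailing slash.
    case "$rootimg" in
        */rootimg) ;;
        *)
            echo "refusing to remove '$rootimg': not a rootimg directory" >&2
            return 1
            ;;
    esac
    # Nothing to do if the directory does not exist yet.
    [ -d "$rootimg" ] || return 0
    rm -rf -- "$rootimg"
}

# Usage, followed by re-running genimage as before:
#   clean_rootimg /install/netboot/sles11.1/x86_64/compute/rootimg
```

The check on the last path component is there only to avoid removing the wrong directory when the variable holding the path is empty or mistyped.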
When genimage is run a second time on top of the last successful rootimg, the second run fails. I added a special case for sles11sp1 in mlnxofed_ib_install. The error does not occur in the rhels6.1 x86_64 environment.
I have checked the code into 2.7 revision 12085 and trunk revision 12086.
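For anyone still on an older script, a re-run on an existing rootimg can be detected before mlnxofedinstall is called. The sketch below is not the change committed in revision 12085; it only illustrates one way to check whether the KMP package from the log is already installed in the image. The function name and the RPM_CMD override are hypothetical.

```shell
#!/bin/sh
# Illustrative guard only; NOT the special case checked in for sles11sp1.
# RPM_CMD is a hypothetical override so the check can be tried without rpm.
RPM_CMD=${RPM_CMD:-rpm}

# Succeeds if mlnx-ofa_kernel-kmp-default is already installed in the
# image tree given as the first argument (an absolute path).
ofed_kmp_installed() {
    installroot=$1
    $RPM_CMD --root "$installroot" -q mlnx-ofa_kernel-kmp-default >/dev/null 2>&1
}

# Usage sketch:
#   if ofed_kmp_installed "$installroot"; then
#       echo "rootimg already contains OFED; clean it before genimage" >&2
#   fi
```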
This bug has been fixed. Closing it for xCAT 2.7.2.