We ran into a problem with our OFED kit. It was developed for PCM 4.1 (xCAT 2.8), but it does not work with PCM/PHPC 4.1.1 (xCAT 2.8.2).
Here are the technical details of the issue:
In the OFED kit buildkit.conf, we have:
preuninstall=kit_mlnx_ofed_uninstall.sh
This script relies on the OFED ISO being present in /opt/xcat/kits on the compute node. The ISO is placed there during installation of mellanox-OFED-component.
In 4.1, buildkit created a single mellanox-OFED-component RPM, so kit_mlnx_ofed_uninstall.sh was invoked before the OFED ISO was removed.
In 4.1.1, buildkit creates two RPMs:
mellanox-OFED-component <= adding/removing of the ISO file from /opt/xcat/ is here
prep_mellanox-OFED-component <= kit_mlnx_ofed_uninstall.sh is here
During uninstall, once we get down to
yum -y remove mellanox-OFED-component prep_mellanox-OFED-component
or
yum -y remove prep_mellanox-OFED-component mellanox-OFED-component
(the order does not matter)
mellanox-OFED-component is erased first, causing kit_mlnx_ofed_uninstall.sh to fail because the ISO has already been removed.
Is there a way to make sure that prep_mellanox-OFED-component is removed by yum before mellanox-OFED-component, using RPM dependencies or the like in the buildkit.conf? If there is no such way, we will have to re-implement the uninstall procedure.
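Independent of any buildkit change, one possible stopgap is to have the preuninstall script check for the ISO before doing anything else. The sketch below is an assumption and not the contents of the real kit_mlnx_ofed_uninstall.sh: the function name find_ofed_iso and the *.iso glob are hypothetical, and the default directory is taken from the description above.

```shell
#!/bin/sh
# Hypothetical helper for a preuninstall script (not the real kit script).
# Prints the first *.iso file found in the given directory (default
# /opt/xcat/kits) and returns 0; returns 1 if no ISO file is present.
find_ofed_iso() {
    dir="${1:-/opt/xcat/kits}"
    for iso in "$dir"/*.iso; do
        # When the glob matches nothing, the literal pattern is left in
        # place and the -f test rejects it.
        if [ -f "$iso" ]; then
            printf '%s\n' "$iso"
            return 0
        fi
    done
    return 1
}

# Possible use inside the script (assumed, shown as a comment only):
#   iso=$(find_ofed_iso) || { echo "OFED ISO not found" >&2; exit 0; }
```

Exiting 0 when the ISO is missing would keep the PREUN scriptlet from aborting the transaction, but OFED itself would then be left installed, so this only masks the ordering problem.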
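Fixed with 5565f in master, 322a8 in 2.8.2-pcm, and 14d49 in 2.8.
Even when we ask yum to remove only prep_mellanox-OFED-component, the issue still exists because of the package dependency: mellanox-OFED-component requires it, so yum erases mellanox-OFED-component first: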
[root@compute001 ~]# rpm -qa |grep mellanox
mellanox-OFED-component-1.0-2.noarch
prep_mellanox-OFED-component-1.0-2.noarch
[root@compute001 ~]# yum -y remove prep_mellanox-OFED-component-1.0-2.noarch
Loaded plugins: product-id, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package prep_mellanox-OFED-component.noarch 0:1.0-2 will be erased
--> Processing Dependency: prep_mellanox-OFED-component for package: mellanox-OFED-component-1.0-2.noarch
--> Running transaction check
---> Package mellanox-OFED-component.noarch 0:1.0-2 will be erased
--> Finished Dependency Resolution
Dependencies Resolved
=======================================================================================================
Package Arch Version Repository Size
=======================================================================================================
Removing:
prep_mellanox-OFED-component noarch 1.0-2 @xcat-otherpkgs0 0.0
Removing for dependencies:
mellanox-OFED-component noarch 1.0-2 @xcat-otherpkgs0 201 M
=======================================================================================================
Remove 2 Package(s)
Installed size: 201 M
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Warning: RPMDB altered outside of yum.
Erasing : mellanox-OFED-component-1.0-2.noarch 1/2
Error in PREUN scriptlet in rpm package prep_mellanox-OFED-component
prep_mellanox-OFED-component-1.0-2.noarch was supposed to be removed but is not!
Verifying : prep_mellanox-OFED-component-1.0-2.noarch 1/2
Verifying : mellanox-OFED-component-1.0-2.noarch 2/2
Dependency Removed:
mellanox-OFED-component.noarch 0:1.0-2
Failed:
prep_mellanox-OFED-component.noarch 0:1.0-2
Complete!
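The transcript shows yum pulling mellanox-OFED-component into the transaction because it requires prep_mellanox-OFED-component, and then erasing it first. The dependency and the scriptlets can be inspected on a compute node with standard rpm query options. The wrapper function names below are illustrative only.

```shell
#!/bin/sh
# List installed packages that require the given package or capability.
# In the transcript above, such a package was erased ahead of the one
# it requires. The "no package requires ..." message is filtered out.
requirers_of() {
    rpm -q --whatrequires "$1" 2>/dev/null | grep -v '^no package requires' || true
}

# Print the scriptlets (including preuninstall) carried by a package,
# to see which RPM actually holds kit_mlnx_ofed_uninstall.sh.
show_scriptlets() {
    rpm -q --scripts "$1"
}

# Example calls (commented out so the file can be sourced safely):
#   requirers_of prep_mellanox-OFED-component
#   show_scriptlets prep_mellanox-OFED-component
```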
Confirmed; reopening this bug. The preuninstall script should be put into the kitcomponent meta RPM instead of the prerequisite RPM, so that it runs first, before anything belonging to the kitcomponent is removed.
Moving the bug to 2.8.4 so that it gets fixed in the next release.
Got a fix; trying to find an OFED system to verify that it fixes the problem.
Fixed with c94454 in 2.8 and a2aee1 in master.