4.4.9. Changes to Kernel Crash Collection (Kdump)
The kernel crash collection tool, kdump, previously generated an initial ramdisk (initrd) for the kdump capture kernel with a custom mkdumprd script. In Red Hat Enterprise Linux 7 the initial ramdisk is generated with dracut, making the process of generating the initial ramdisk easier to maintain.
The doc goes on to list many changes to kdump. A couple of places in our code that I thought might be affected:
xCAT/postscripts/enablekdump: echo "net $KDIP:$KDPATH" > /etc/kdump.conf
xCAT-server/lib/xcat/plugins/anaconda.pm: # if kdump service is enbaled, add "crashkernel=" and "kdtarget="
xCAT-server/lib/xcat/plugins/anaconda.pm: my $kdump = '';
xCAT-server/lib/xcat/plugins/anaconda.pm: $kdump = $dump;
xCAT-server/lib/xcat/plugins/anaconda.pm: $kdump =~ s/(nfs:\/\/)(\/.*)/${1}${xcatmaster}${2}/;
xCAT-server/lib/xcat/plugins/anaconda.pm: $kcmdline .= " fadump=on fadump_reserve_mem=$crashkernelsize fadump_target=$fadump f
xCAT-server/lib/xcat/plugins/anaconda.pm: $kcmdline .= " fadump=on fadump_reserve_mem=512M fadump_target=$fadump fadump_de
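For reference, the substitution in anaconda.pm fills the xCAT master into a dump target written as nfs:///path. Below is a minimal shell sketch of the same rewrite, for illustration only. The helper name insert_master and the sample values are hypothetical and are not part of xCAT.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): mirrors the Perl substitution
#   s/(nfs:\/\/)(\/.*)/${1}${xcatmaster}${2}/
# by inserting the master between "nfs://" and the path.
insert_master() {
    target="$1"   # dump target, e.g. nfs:///var/crash (sample value)
    master="$2"   # xCAT master; assumed to contain no '|' or '&'
    printf '%s\n' "$target" | sed "s|\(nfs://\)\(/.*\)|\1${master}\2|"
}

insert_master 'nfs:///var/crash' '10.0.0.1'   # prints nfs://10.0.0.1/var/crash
```

A target that already names a server, such as nfs://host/path, does not match the pattern and is passed through unchanged.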
We opened Red Hat Bugzilla bug 111844 and are waiting for a response from Red Hat.
Gong Jie said he has found a way to work around the problem. Yang Song, please work with Gong Jie to document the workaround in the release notes.
The Red Hat bug is targeted at RHEL 7.1.
Currently, kdump is supported on Red Hat 7 stateless; kdump on Red Hat 7 statelite is not supported yet.
Red Hat 7 kdump might fail with "kdump: wrong kdumpnic: eth2 kdump: get_host_ip exited with non-zero status!" because of a Red Hat 7 kexec-tools bug:
https://bugzilla.linux.ibm.com/show_bug.cgi?id=111844
The details of this defect are given below:
Kdump via network issue on RHEL 7
With Red Hat Enterprise Linux 7 on IBM Power Systems, when an LHEA (Logical Host Ethernet Adapter) and a PowerVM virtual Ethernet adapter coexist in the same LPAR (Logical Partition) and Linux Kdump is performed over the network, the Kdump may fail. The error messages look like the following.
This is a bug against Red Hat Enterprise Linux 7 on IBM Power System, not xCAT.
Details and Explanation
Red Hat Enterprise Linux 7 uses new network interface naming schemes. With these schemes, network interfaces are named based on their physical slot numbers. On IBM Power Systems with PowerVM, LHEA and virtual Ethernet adapters are virtualized devices and have no physical slot numbers, so Red Hat Enterprise Linux 7 names them with the conventional Linux kernel ethX style names.
When Linux Kdump is performed over the network, the kexec() system call loads another Linux kernel and initrd image, which differ from the ordinary runtime environment. In the Kdump initrd image, the kernel modules for the LHEA and the virtual network adapter may be loaded in a different order than in the ordinary runtime environment. When that happens, the names of the network interfaces change.
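Because ethX names depend on module load order, the MAC address is a more stable way to identify an adapter. The sketch below lists MAC address and interface name pairs from sysfs, so the output can be compared between the normal runtime and another environment. The function name list_mac_to_name is hypothetical, and the optional directory argument exists only to make the function easy to try against a test directory.

```shell
#!/bin/sh
# Hypothetical helper: print "<mac> <ifname>" for every interface found
# under a sysfs-style directory (defaults to /sys/class/net), sorted by MAC.
list_mac_to_name() {
    sysdir="${1:-/sys/class/net}"
    for dev in "$sysdir"/*; do
        # Skip entries without a readable address file.
        [ -r "$dev/address" ] || continue
        printf '%s %s\n' "$(cat "$dev/address")" "$(basename "$dev")"
    done | sort
}

# Example: list_mac_to_name > /tmp/runtime-names.txt
```

If the same MAC address maps to a different ethX name in two listings, the interface has been renamed in the way described above.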
Workaround Method
Rename the network adapter that is used for Linux Kdump to a fixed new name. The new name should begin with "en" so that it is compatible with the network interface naming schemes of Red Hat Enterprise Linux 7. Otherwise, you may run into compatibility problems with some xCAT postscripts.
In the following example, network interface eth2 is connected to the management network and will be used for Linux Kdump. We will rename network interface eth2 to enx998.
Then reboot the operating system. After the reboot, verify that the renaming succeeded.
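The commands of the original example are not reproduced here. One common way to pin an interface name on Red Hat Enterprise Linux 7 is a udev rule that matches the adapter's MAC address. The sketch below only generates such a rule line. It is an assumption for illustration and not necessarily the method used in the original workaround. The function name make_rename_rule and the rules file path in the comments are likewise assumptions.

```shell
#!/bin/sh
# Hypothetical helper: emit a udev rule that renames the interface with
# the given MAC address to the given name (e.g. enx998).
make_rename_rule() {
    # sysfs reports MAC addresses in lower case, so normalize the input.
    mac=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    newname="$2"
    printf 'SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="%s", NAME="%s"\n' \
        "$mac" "$newname"
}

# Possible use (run as root; the file path is an assumption):
#   make_rename_rule "$(cat /sys/class/net/eth2/address)" enx998 \
#       >> /etc/udev/rules.d/70-persistent-net.rules
# After the reboot, check that the new name exists:
#   ip link show enx998
```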
Patch and Fixed RPM Package
A patch is available on the Internet. This patch will be integrated into the Errata of Red Hat Enterprise Linux 7 and will be released soon. For advanced users, the patched kexec-tools rpm package is attached. It can be applied with the following steps:
(1) after "genimage", run
to update the kexec-tools
(2) run packimage/liteimg, then nodeset...
Last edit: yangsong 2014-09-03
kdump support on nfs-based statelite is finished
checked in to 2.9:
commit 1a5de449ef866845e4d89123cb90193d10ccd72a
Author: immarvin yangsbj@cn.ibm.com
Date: Thu Sep 11 20:52:23 2014 -0700
diff --git a/xCAT/postscripts/enablekdump b/xCAT/postscripts/enablekdump
index 1a17d2e..3e711af 100755
--- a/xCAT/postscripts/enablekdump
+++ b/xCAT/postscripts/enablekdump
@@ -99,8 +99,10 @@ if [ ! -z "$DUMP" ]; then
# workaround for RHEL6
# the $KDIP:$KDPATH directory will be used to generate the initrd for kdump service
MOUNTPATH=""
MOUNTPATH="/tmp"
else
MOUNTPATH="/var/tmp"
fi
@@ -214,7 +216,8 @@ EOF
fi
else
if (pmatch $OSVER "rhel7") || (pmatch $OSVER "rhels7");then
/bin/mount -o nolock $KDIP:$KDPATH $MOUNTPATH
RHEL 7 kdump is working now; the only exception is ramdisk-based statelite. We are trying to limit the effort spent on statelite support, so I documented this restriction in the 2.9 release notes. Per the discussion in the team meeting, we agreed not to fix this kdump problem with rhels 7 ramdisk-based statelite until we hear customer requirements. Closing this bug.