For those using SATA or SATA RAID: if one of the SATA drives runs
into problems, the array will hang. This is not an OpenSSI issue, as it
is inherent in Linux, but I've heard it does not happen if your SATA
controller supports hotplug.
Do take note that in OpenSSI the SATA-related hang can affect CFS
clients in the cluster, which means a hang at any CFS server can
possibly hang the whole cluster. I've seen this in SSI-1.9.3-pre.
To maintain cluster availability, the workaround is to run a watchdog
that takes down the local node whenever it hangs because a SATA drive
has failed. If you use a software watchdog (softdog.ko), you can choose to
use, among others, the watchdog daemon maintained at
However, this watchdog is not OpenSSI-friendly: it will kill init and
all the processes across the whole OpenSSI cluster, essentially
rebooting the whole cluster. I've made some minor modifications for
OpenSSI so that this watchdog reboots only the local node. It has been
tested on OpenSSI-1.9.3-pre FC3.
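For anyone who has not set up softdog with the watchdog daemon before, here is a rough sketch of a minimal configuration. The helper name write_watchdog_conf, the config path and the 10 second interval are my own assumptions for illustration, not something the patch requires. Adjust them for your nodes.

```shell
# Sketch only: write a minimal config for the watchdog daemon.
# write_watchdog_conf is a hypothetical helper; the path is passed in.
write_watchdog_conf() {
    conf=$1
    {
        # Device node provided by the softdog module once it is loaded.
        echo "watchdog-device = /dev/watchdog"
        # Seconds between keep-alive writes (assumed value, tune as needed).
        echo "interval = 10"
    } > "$conf"
}

# Typical use as root on each node (commented out so nothing runs by accident):
# modprobe softdog
# write_watchdog_conf /etc/watchdog.conf
# watchdog -c /etc/watchdog.conf
```

The interval should stay well below the softdog timeout, otherwise the node may reboot even when it is healthy.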
You can find my patch for watchdog-5.2.6 attached here and at
Untar the original watchdog-5.2.6.tar.gz and apply the patch.
# patch -p1 -d watchdog-5.2.6 < watchdog-5.2.6-ssi.diff
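If you prefer to script the whole thing, the steps above can be sketched roughly as follows. The function name apply_ssi_patch is just something I made up for this example, and the dry run is an extra precaution I added (it assumes GNU patch). It is not part of the original instructions.

```shell
# Sketch only: unpack a source tarball and apply a patch inside it.
# apply_ssi_patch is a hypothetical helper name.
apply_ssi_patch() {
    tarball=$1   # e.g. watchdog-5.2.6.tar.gz
    srcdir=$2    # e.g. watchdog-5.2.6 (directory the tarball unpacks to)
    diff=$3      # e.g. watchdog-5.2.6-ssi.diff

    tar xzf "$tarball" &&
    # Check that the patch applies cleanly before changing any file.
    patch -p1 -d "$srcdir" --dry-run < "$diff" > /dev/null &&
    patch -p1 -d "$srcdir" < "$diff"
}

# Usage, assuming both files are in the current directory:
# apply_ssi_patch watchdog-5.2.6.tar.gz watchdog-5.2.6 watchdog-5.2.6-ssi.diff
```

If the dry run fails, nothing in the source tree is modified, so you can check that you have the right watchdog version before trying again.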