From: Marcio T. <mar...@nu...> - 2013-12-10 01:11:36
Hi,

I am experimenting with an OpenSharedRoot 5.0 cluster running RHEL 6.3. I've noticed that sometimes, when a node is booting, it fails to mount the cluster root and drops into a root shell for troubleshooting. I would like to disable this.

I have done some digging around in the root shell and I think I've determined where in the scripts the failure occurs. The troubleshooting root shell appears to be there by design. There also appear to be ways to configure the NFS timeout and retry values, which might reduce the likelihood of the problem, and even to change the shell that gets executed. I imagine I can configure those things somewhere, but I do not know where. Before I try to just hack the scripts in the initrd, I thought I would ask here for recommendations.

Here is what the screen looks like just prior to the root shell:

    Osr(notice): Starting service rpcbind
    NET: Registered protocol family 10
    lo: Disabled Privacy Extensions
    [OK]
    Osr(notice): Starting service rpc.statd [OK]
    Osr(notice): Mounting ginseng-nfs:/export/cluster-root on /mnt/newroot…
    Mount.nfs: Connection timed out
    [FAILED]
    bash: cannot set terminal process group (-1): Inappropriate ioctl for device
    ;@node30:~[root@node30/]#

The problem is that at this point I can view all the local disks on the system as root. This is problematic in our environment. If I type "exit", I get the following:

    Osr(notice): Back to work..

This led me to believe the behavior is by design. I started grepping for strings in the root shell and found a couple of promising scripts. I believe the error originally happens in "clusterfs_mount" in "/etc/clusterfs-lib.sh". In there, there appears to be a timeout value and a number of retries. I might be able to mitigate the problem by setting the number of retries and the timeout to extremely large values: either the system would eventually boot, or it would simply appear to be hung.
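
To make the retry idea concrete, here is a minimal sketch of the kind of loop I have in mind. This is not the code from clusterfs-lib.sh; the function name "retry_cmd" and its arguments are hypothetical, and the count and delay in the usage comment are made-up values.

```shell
#!/bin/bash
# Hypothetical helper, NOT taken from clusterfs-lib.sh.
# Usage: retry_cmd <max_attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or max_attempts is reached.
retry_cmd() {
    local max_attempts=$1 delay=$2 attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Example with the export and mount point from the boot log above
# (1000 attempts and a 30 second delay are assumed values):
# retry_cmd 1000 30 mount -t nfs ginseng-nfs:/export/cluster-root /mnt/newroot
```

With a very large attempt count the node would simply keep retrying the mount rather than falling through to the shell, which is the behavior I am after.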

So one solution would be to do that, but I don't know which config file might adjust those values.

On to the second option. I found that the "Back to work" message is printed by "breakp" in /etc/boot-lib.sh. There is a $shell variable in there. If I could change that to either 1) something that prompts for a root password, or 2) something that hangs or reboots the system, that would be good too. Again, the problem is that I do not know which config file might adjust those parameters.

To summarize: I would like to disable the troubleshooting shell. Ideally the system would just hang or prompt for the root password.

In a similar vein, it would be nice to be able to disable the option to boot interactively, namely the functionality that says:

    Osr(notice): Press ‘I’ to enter interactive startup

In GRUB, this is a feature that can be disabled, and it would be nice to be able to disable it in OSR as well.

Any guidance on these questions would be very much appreciated.

Thank you,
Marcio