From: Marc G. <gr...@at...> - 2012-11-12 14:12:21
This looks perfectly OK. The failure after activation of the VGs is there because clustered VGs are present (which again is perfectly OK). The bootup then continues as expected, as can be seen in the logs. I think I don't understand the problem you are talking about. Perhaps you could try to explain your problem in more detail.

Thanks
Marc.

----- Original Message -----
From: "Jorge Silva" <me...@je...>
To: "Marc Grimme" <gr...@at...>
Sent: Monday, November 12, 2012 3:04:03 PM
Subject: Re: Problem with VG activation clvmd runs at 100%

Marc

Hi, thanks for your help. I got rid of some of the clustered volumes for clarity; apologies for the unorthodox screen log, I must get console logging done via serial... I booted into emergency mode, and ls -l /etc/rc3.d/S* gives:

  lrwxrwxrwx 1 root root 11 Nov  7 17:21 /etc/rc3.d/S99local -> ../rc.local.comoonics

I edited rc.sysinit (the line on mine is line 205) and continued; attached is the output from set -x.

Thanks
Jorge

On Mon, Nov 12, 2012 at 6:01 AM, Marc Grimme <gr...@at...> wrote:

Hi Jorge,
try to boot the cluster into emergency mode by adding a "1" to the boot prompt. With this you should end up in a console. Then issue the following command and send me the output:

  ls -l /etc/rc3.d/S*

Also add the following line before lvm is started (rc.sysinit line 199):

  set -x

Then we should see more at the next bootup.

Thanks
Marc.

----- Original Message -----
From: "Jorge Silva" <me...@je...>
To: "Marc Grimme" <gr...@at...>
Sent: Monday, November 12, 2012 4:43:22 AM
Subject: Problem with VG activation clvmd runs at 100%

Marc

Hi, I apologise for not getting back to you; it has been some time since we communicated. I am an equity derivatives trader, and at the time I was helping a friend set up an equity trading platform as a proof of concept. It was pretty low priority and more of a learning tool for me, so I didn't spend too much time on it. I was forced to upgrade recently, as this has moved from proof of concept to the next step.

I apologise for bothering you, but I have spent the last few days trying to get an OSR cluster running on CentOS 6.3 + GFS2, and I believe I am almost there, but I am stuck and unsure what is going on. The cluster seems to be working OK, but clvmd is running at 100%, and if I restart it I still get the same result. Attached is a screenshot of the final phase of booting showing the error. The cluster is quorate and shutdown also works OK.

Thanks
Jorge

Here is the output of vgscan:

  vgscan
    connect() failed on local socket: No such file or directory
    Internal cluster locking initialisation failed.
    WARNING: Falling back to local file-based locking.
    Volume Groups with the clustered attribute will be inaccessible.
    Reading all physical volumes. This may take a while...
    Skipping clustered volume group VG_OSROOT
    Found volume group "VG_DBDISKS" using metadata type lvm2
    Skipping clustered volume group VG_SDATA
    Found volume group "vg_osroot" using metadata type lvm2

These are the lvm2 packages:

  lvm2-2.02.95-10.el6_3.2.x86_64
  lvm2-cluster-2.02.95-10.el6_3.2.x86_64

I think these are what is causing the problem, but I'm not sure...
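The "connect() failed on local socket" messages above are what LVM prints when it is configured for clustered locking but cannot reach clvmd. A minimal check sketch, assuming a standard CentOS 6.3 layout with clvmd running in the regular root rather than in the comoonics chroot:

  # Confirm LVM's configured locking type (3 = clustered locking via clvmd)
  grep -E '^[[:space:]]*locking_type' /etc/lvm/lvm.conf

  # Check whether clvmd is running and how much CPU it is consuming
  service clvmd status
  ps -o pid,pcpu,args -C clvmd

  # clvmd needs a quorate cluster; verify membership and quorum
  cman_tool status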
  lrwxrwxrwx 1 root root 41 Nov 11 21:54 /var/run/cman_admin -> /var/comoonics/chroot//var/run/cman_admin
  lrwxrwxrwx 1 root root 42 Nov 11 21:54 /var/run/cman_client -> /var/comoonics/chroot//var/run/cman_client

I have tried re-ordering /etc/cdsltab; it currently is:

  bind /.cluster/cdsl/%(nodeid)s/var/run /var/run __initrd
  bind /.cluster/cdsl/%(nodeid)s/var/lock /var/lock __initrd

I have tried:

  rm -fr /var/cache/comoonics-bootimage/* ; rm -fr /var/cache/comoonics-repository/* ; mkinitrd -V /boot/initrd_sr-$(uname -r).img $(uname -r)

My cluster.conf for the nodes looks like:

  <clusternode name="smc01b" nodeid="2" votes="1">
    <multicast addr="229.192.0.2" interface="bond0.1762"/>
    <fence>
      <method name="single">
        <device ipaddr="172.17.50.16" name="ipmi"/>
      </method>
    </fence>
    <com_info>
      <eth mac="00:30:48:F0:10:54" master="bond0" name="eth0" slave="yes"/>
      <eth mac="00:30:48:F0:10:55" master="bond0" name="eth1" slave="yes"/>
      <eth name="bond0">
        <properties>
          <property name="BONDING_OPTS">BONDING_OPTS="miimon=100 mode=4 xmit_hash_policy=2 "</property>
        </properties>
      </eth>
      <eth name="bond0.1762" ip="172.17.60.12" mask="255.255.255.0" gateway="">
        <properties>
          <property name="VLAN">VLAN="yes"</property>
        </properties>
      </eth>
    </com_info>
  </clusternode>

I have tried re-installing the packages below:

  comoonics-base-py-5.0-2_rhel6.noarch
  comoonics-bootimage-listfiles-rhel6-fencelib-5.0-1_rhel6.noarch
  comoonics-cdsl-py-5.0-3_rhel6.noarch
  comoonics-imsd-py-5.0-1_rhel6.noarch
  comoonics-cluster-py-5.0-2_rhel6.noarch
  comoonics-bootimage-listfiles-rhel6-5.0-4_rhel6.noarch
  comoonics-bootimage-imsd-5.0-5_rhel6.noarch
  comoonics-bootimage-listfiles-firmware-5.0-2_rhel6.noarch
  comoonics-release-5.0-2_rhel6.noarch
  comoonics-tools-py-5.0-2_rhel6.noarch
  comoonics-bootimage-initscripts-5.0-10_rhel6.noarch
  comoonics-imsd-plugins-py-5.0-1_rhel6.noarch
  comoonics-bootimage-extras-network-5.0-2_rhel6.noarch
  comoonics-cluster-tools-py-5.0-3_rhel6.noarch
  comoonics-bootimage-5.0-19_rhel6.noarch
  comoonics-bootimage-listfiles-rhel6-gfs2-5.0-3_rhel6.noarch
  comoonics-bootimage-extras-localconfigs-5.0-9_rhel6.noarch
  comoonics-bootimage-listfiles-all-5.0-4_rhel6.noarch
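The cman symlinks above only work if /var/run and /var/lock are actually bind-mounted onto this node's CDSL copies and the comoonics chroot is populated. A quick sanity-check sketch, using only the paths taken from the cdsltab and symlinks above (the node id in the CDSL path is whatever this node's id resolves to):

  # Verify the hostdependent bind mounts are in place
  mount | grep -E ' /var/(run|lock) '

  # Verify the cman sockets resolve through the chroot symlinks
  ls -l /var/run/cman_client /var/run/cman_admin
  ls -l /var/comoonics/chroot/var/run/

  # Verify the clvmd socket is visible where the LVM tools look for it
  # (typically /var/run/lvm/clvmd.sock on RHEL6/CentOS 6)
  ls -l /var/run/lvm/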