From: Marc G. <gr...@at...> - 2012-11-19 08:02:24
Hi Jorge,

sorry for the delay, but I was quite busy over the last few days. Nevertheless, I still don't understand the problem, so let's start at the point I think could lead to problems during shutdown. Are the control files in /var/run/cman* being created by the bootsr initscript, or do you still have to create them manually? If they are not created, I would still be very interested in the output of

$ bash -x /etc/init.d/bootsr start

after a node has been started. If they are created, we need to dig deeper into the problems during shutdown. I would then also change the clustered flag for the other volume group; again, as long as you don't change the size it won't hurt, and it's only for a better understanding of the problem. Another command I'd like to see is

$ cman_tool services

on the other node (say node 2) while the node being shut down (say node 1) is stuck.

Thanks,
Marc
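P.S.: If you want to check the control files yourself before sending the trace, a rough sketch along these lines should do (it assumes the comoonics chroot lives under /var/comoonics/chroot, as it does elsewhere in this thread; adjust the path if yours differs):

# check whether bootsr created the cman control files and
# recreate the symlinks into the comoonics chroot if not
for f in /var/run/cman_admin /var/run/cman_client; do
    [ -e "$f" ] && ls -l "$f" && continue
    ln -sf "/var/comoonics/chroot$f" "$f"
done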
Am 15.11.2012 19:08, schrieb Jorge Silva:
> Marc
>
> Hi, I believe the problem is related to the cluster services not shutting down: init 0 will not work with one or more nodes, and init 6 will only work when exactly one node is present. When more than one node is present, the node given the init 6 has to be fenced, as it will not shut down. I believe the cluster components aren't shutting down (this also happens with init 6 when more than one node is present) - I still see periodic cluster traffic on the network:
>
> 12:42:00.547615 IP 172.17.62.12.hpoms-dps-lstn > 229.192.0.2.netsupport: UDP, length 119
>
> At the point where the system will not shut down, it is still a cluster member and there is still cluster traffic.
>
> 1 node:
> [root@bwccs302 ~]# init 0
>
> Can't connect to default. Skipping.
> Shutting down Cluster Module - cluster monitor: [ OK ]
> Shutting down ricci: [ OK ]
> Shutting down Avahi daemon: [ OK ]
> Shutting down oddjobd: [ OK ]
> Stopping saslauthd: [ OK ]
> Stopping sshd: [ OK ]
> Shutting down sm-client: [ OK ]
> Shutting down sendmail: [ OK ]
> Stopping imsd via sshd: [ OK ]
> Stopping snmpd: [ OK ]
> Stopping crond: [ OK ]
> Stopping HAL daemon: [ OK ]
> Shutting down ntpd: [ OK ]
> Deactivating clustered VG(s): 0 logical volume(s) in volume group "VG_SDATA" now active
> [ OK ]
> Signaling clvmd to exit [ OK ]
> clvmd terminated [ OK ]
> Stopping lldpad: [ OK ]
> Stopping system message bus: [ OK ]
> Stopping multipathd daemon: [ OK ]
> Stopping rpcbind: [ OK ]
> Stopping auditd: [ OK ]
> Stopping nslcd: [ OK ]
> Shutting down system logger: [ OK ]
> Stopping sssd: [ OK ]
> Stopping gfs dependent services osr(notice) ..bindmounts.. [ OK ]
> Stopping gfs2 dependent services Starting clvmd:
> Activating VG(s): 2 logical volume(s) in volume group "VG_SDATA" now active
> 1 logical volume(s) in volume group "vg_osroot" now active
> [ OK ]
> osr(notice) ..bindmounts.. [ OK ]
> Stopping monitoring for VG vg_osroot: 1 logical volume(s) in volume group "vg_osroot" unmonitored
> [ OK ]
> Sending all processes the TERM signal... [ OK ]
> Sending all processes the KILL signal... [ OK ]
> Saving random seed: [ OK ]
> Syncing hardware clock to system time [ OK ]
> Turning off quotas: quotaoff: Cannot change state of GFS2 quota.
> quotaoff: Cannot change state of GFS2 quota.
> [FAILED]
> Unmounting file systems: [ OK ]
> init: Re-executing /sbin/init
> Halting system...
> osr(notice) Scanning for Bootparameters...
> osr(notice) Starting ATIX exitrd
> osr(notice) Comoonics-Release
> osr(notice) comoonics Community Release 5.0 (Gumpn)
> osr(notice) Internal Version $Revision: 1.18 $ $Date: 2011-02-11 15:09:53 $
> osr(debug) Calling cmd /sbin/halt -d -p
> osr(notice) Preparing chrootcp: cannot stat `/mnt/newroot/dev/initctl': No such file or directory
> [ OK ]
> osr(notice) com-realhalt: detected distribution: rhel6, clutype: gfs, rootfs: gfs2
> osr(notice) Restarting init process in chroot[ OK ]
> osr(notice) Moving dev filesystem[ OK ]
> osr(notice) Umounting filesystems in oldroot ( /mnt/newroot/sys /mnt/newroot/proc)
> osr(notice) Umounting /mnt/newroot/sys[ OK ]
> osr(notice) Umounting /mnt/newroot/proc[ OK ]
> osr(notice) Umounting filesystems in oldroot (/mnt/newroot/var/run /mnt/newroot/var/lock /mnt/newroot/.cdsl.local)
> osr(notice) Umounting /mnt/newroot/var/runinit: Re-executing /sbin/init
> [ OK ]
> osr(notice) Umounting /mnt/newroot/var/lock[ OK ]
> osr(notice) Umounting /mnt/newroot/.cdsl.local[ OK ]
> osr(notice) Umounting oldroot /mnt/newroot[ OK ]
> osr(notice) Breakpoint "halt_umountoldroot" detected forking a shell
> bash: no job control in this shell
>
> Type help to get more information..
> Type exit to continue work..
> -------------------------------------------------------------
>
> comoonics 1
> cman_tool: unknown option cman_tool
> comoonics 2
> Version: 6.2.0
> Config Version: 1
> Cluster Name: ProdCluster01
> Cluster Id: 11454
> Cluster Member: Yes
> Cluster Generation: 4
> Membership state: Cluster-Member
> Nodes: 1
> Expected votes: 4
> Quorum device votes: 3
> Total votes: 4
> Node votes: 1
> Quorum: 3
> Active subsystems: 10
> Flags:
> Ports Bound: 0 11 178
> Node name: smc01b
> Node ID: 2
> Multicast addresses: 229.192.0.2
> Node addresses: 172.17.62.12
> comoonics 3
> fence domain
> member count 1
> victim count 0
> victim now 0
> master nodeid 2
> wait state none
> members 2
>
> dlm lockspaces
> name clvmd
> id 0x4104eefa
> flags 0x00000000
> change member 1 joined 1 remove 0 failed 0 seq 1,1
> members 2
>
> comoonics 4
> bash: exitt: command not found
> comoonics 5
> exit
> osr(notice) Back to work..
> Deactivating clustered VG(s): 0 logical volume(s) in volume group "VG_SDATA" now active
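> One thing I notice in the cman_tool status above, if I read it correctly: the node stays quorate even on its own, because the node itself has 1 vote and the quorum device contributes 3, so the 4 total votes are still at or above the quorum of 3 with only one member left. That would explain why a single node can keep the cluster infrastructure alive during shutdown.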
> It hung at the point above - so I re-ran with the edit set -x in line 207.
>
> 1 node:
> [root@bwccs302 ~]# init 0
>
> Can't connect to default. Skipping.
> Shutting down Cluster Module - cluster monitor: [ OK ]
> Shutting down ricci: [ OK ]
> Shutting down Avahi daemon: [ OK ]
> Shutting down oddjobd: [ OK ]
> Stopping saslauthd: [ OK ]
> Stopping sshd: [ OK ]
> Shutting down sm-client: [ OK ]
> Shutting down sendmail: [ OK ]
> Stopping imsd via sshd: [ OK ]
> Stopping snmpd: [ OK ]
> Stopping crond: [ OK ]
> Stopping HAL daemon: [ OK ]
> Shutting down ntpd: [ OK ]
> Deactivating clustered VG(s): 0 logical volume(s) in volume group "VG_SDATA" now active
> [ OK ]
> Signaling clvmd to exit [ OK ]
> clvmd terminated [ OK ]
> Stopping lldpad: [ OK ]
> Stopping system message bus: [ OK ]
> Stopping multipathd daemon: [ OK ]
> Stopping rpcbind: [ OK ]
> Stopping auditd: [ OK ]
> Stopping nslcd: [ OK ]
> Shutting down system logger: [ OK ]
> Stopping sssd: [ OK ]
> Stopping gfs dependent services osr(notice) ..bindmounts.. [ OK ]
> Stopping gfs2 dependent services Starting clvmd:
> Activating VG(s): 1 logical volume(s) in volume group "vg_osroot" now active
> 2 logical volume(s) in volume group "VG_SDATA" now active
> [ OK ]
> osr(notice) ..bindmounts.. [ OK ]
> Stopping monitoring for VG vg_osroot: 1 logical volume(s) in volume group "vg_osroot" unmonitored
> [ OK ]
> Sending all processes the TERM signal... [ OK ]
> Sending all processes the KILL signal... [ OK ]
> Saving random seed: [ OK ]
> Syncing hardware clock to system time [ OK ]
> Turning off quotas: quotaoff: Cannot change state of GFS2 quota.
> quotaoff: Cannot change state of GFS2 quota.
> [FAILED]
> Unmounting file systems: [ OK ]
> init: Re-executing /sbin/init
> Halting system...
> osr(notice) Scanning for Bootparameters...
> osr(notice) Starting ATIX exitrd
> osr(notice) Comoonics-Release
> osr(notice) comoonics Community Release 5.0 (Gumpn)
> osr(notice) Internal Version $Revision: 1.18 $ $Date: 2011-02-11 15:09:53 $
> osr(notice) Preparing chrootcp: cannot stat `/mnt/newroot/dev/initctl': No such file or directory [ OK ]
> osr(notice) com-realhalt: detected distribution: rhel6, clutype: gfs, rootfs: gfs2
> osr(notice) Restarting init process in chroot[ OK ]
> osr(notice) Moving dev filesystem[ OK ]
> osr(notice) Umounting filesystems in oldroot ( /mnt/newroot/sys /mnt/newroot/proc)
> osr(notice) Umounting /mnt/newroot/sys[ OK ]
> osr(notice) Umounting /mnt/newroot/proc[ OK ]
> osr(notice) Umounting filesystems in oldroot (/mnt/newroot/var/run /mnt/newroot/var/lock /mnt/newroot/.cdsl.local)
> osr(notice) Umounting /mnt/newroot/var/runinit: Re-executing /sbin/init [ OK ]
> osr(notice) Umounting /mnt/newroot/var/lock[ OK ]
> osr(notice) Umounting /mnt/newroot/.cdsl.local[ OK ]
> osr(notice) Umounting oldroot /mnt/newroot[ OK ]
> + clusterfs_services_stop '' '' 0
> ++ repository_get_value rootfs
> +++ repository_normalize_value rootfs
> ++ local key=rootfs
> ++ local default=
> ++ local repository=
> ++ '[' -z '' ']'
> ++ repository=comoonics
> ++ local value=
> ++ '[' -f /var/cache/comoonics-repository/comoonics.rootfs ']'
> +++ cat /var/cache/comoonics-repository/comoonics.rootfs
> ++ value=gfs2
> ++ echo gfs2
> ++ return 0
> + local rootfs=gfs2
> + gfs2_services_stop '' '' 0
> + local chroot_path=
> + local lock_method=
> + local lvm_sup=0
> + '[' -n 0 ']'
> + '[' 0 -eq 0 ']'
> + /etc/init.d/clvmd stop
> Deactivating clustered VG(s): 0 logical volume(s) in volume group "VG_SDATA" now active
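> Stripped of the repository plumbing, the trace seems to boil down to roughly this (my paraphrase of the functions involved, not the actual script):
>
> # what clusterfs_services_stop effectively does here
> rootfs=$(cat /var/cache/comoonics-repository/comoonics.rootfs)  # -> gfs2
> /etc/init.d/clvmd stop   # called from gfs2_services_stop
>
> so it is the clvmd stop inside the exitrd that never returns.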
> With 2 nodes and quorate, when init 6 is issued:
>
> [root@bwccs304 ~]# init 6
>
> Can't connect to default. Skipping.
> Shutting down Cluster Module - cluster monitor: [ OK ]
> Shutting down ricci: [ OK ]
> Shutting down Avahi daemon: [ OK ]
> Shutting down oddjobd: [ OK ]
> Stopping saslauthd: [ OK ]
> Stopping sshd: [ OK ]
> Shutting down sm-client: [ OK ]
> Shutting down sendmail: [ OK ]
> Stopping imsd via sshd: [ OK ]
> Stopping snmpd: [ OK ]
> Stopping crond: [ OK ]
> Stopping HAL daemon: [ OK ]
> Shutting down ntpd: [ OK ]
> Deactivating clustered VG(s): 0 logical volume(s) in volume group "VG_SDATA" now active
> [ OK ]
> Signaling clvmd to exit [ OK ]
> clvmd terminated [ OK ]
> Stopping lldpad: [ OK ]
> Stopping system message bus: [ OK ]
> Stopping multipathd daemon: [ OK ]
> Stopping rpcbind: [ OK ]
> Stopping auditd: [ OK ]
> Stopping nslcd: [ OK ]
> Shutting down system logger: [ OK ]
> Stopping sssd: [ OK ]
> Stopping gfs dependent services osr(notice) ..bindmounts.. [ OK ]
> Stopping gfs2 dependent services Starting clvmd:
> Activating VG(s): 1 logical volume(s) in volume group "vg_osroot" now active
> 2 logical volume(s) in volume group "VG_SDATA" now active
> [ OK ]
> osr(notice) ..bindmounts.. [ OK ]
> Stopping monitoring for VG vg_osroot: 1 logical volume(s) in volume group "vg_osroot" unmonitored
> [ OK ]
> Sending all processes the TERM signal... [ OK ]
> qdiskd[15713]: Unregistering quorum device.
>
> Sending all processes the KILL signal... dlm: clvmd: no userland control daemon, stopping lockspace
> dlm: OSRoot: no userland control daemon, stopping lockspace
> [ OK ]
> - it stops here and will not die... I still have full cluster communications.
>
> Thanks
> Jorge
>
> On Tue, Nov 13, 2012 at 9:32 AM, Marc Grimme <gr...@at...> wrote:
>
> Hi Jorge,
> about the "init 0":
> please issue the following commands prior to the init 0.
> # Make it a little more chatty
> $ com-chroot setparameter debug
> # Break before the cluster is stopped
> $ com-chroot setparameter step halt_umountoldroot
>
> Then issue an init 0.
> This should lead you to a breakpoint during shutdown (hopefully, because sometimes the console gets confused).
> Inside the breakpoint, issue:
> $ cman_tool status
> $ cman_tool services
> # Continue shutdown
> $ exit
> Then send me the output.
>
> If this fails, do the following instead:
> $ com-chroot vi com-realhalt.sh
> # go to line 207 (before clusterfs_services_stop is called) and add a set -x
> $ init 0
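> In case set -x is unfamiliar: it simply makes bash print each command before executing it, so we will see exactly where the script stops. A quick example:
>
> $ bash -c 'set -x; echo hello'
> + echo hello
> hello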
> Send the output.
> Thanks Marc.
>
> ----- Original Message -----
> From: "Jorge Silva" <me...@je...>
> To: "Marc Grimme" <gr...@at...>
> Cc: ope...@li...
> Sent: Tuesday, November 13, 2012 3:22:37 PM
> Subject: Re: Problem with VG activation clvmd runs at 100%
>
> Marc
>
> Hi, thanks for the info, it helps. I have also noticed that gfs2 entries in the fstab get ignored on boot, so I have added them in rc.local. I have done a bit more digging into the issue I described below:
>
> "I am still a bit stuck when nodes with gfs2 mounted don't restart if instructed to do so, but I will read some more."
>
> If I issue an init 6 on a node it will restart. If I issue init 0, the node starts to shut down but stays in the cluster; it will not shut down and I have to power it off. This is the log:
>
> [root@bwccs304 ~]# init 0
>
> Can't connect to default. Skipping.
> Shutting down Cluster Module - cluster monitor: [ OK ]
> Shutting down ricci: [ OK ]
> Shutting down oddjobd: [ OK ]
> Stopping saslauthd: [ OK ]
> Stopping sshd: [ OK ]
> Shutting down sm-client: [ OK ]
> Shutting down sendmail: [ OK ]
> Stopping imsd via sshd: [ OK ]
> Stopping snmpd: [ OK ]
> Stopping crond: [ OK ]
> Stopping HAL daemon: [ OK ]
> Stopping nscd: [ OK ]
> Shutting down ntpd: [ OK ]
> Deactivating clustered VG(s): 0 logical volume(s) in volume group "VG_SDATA" now active
> [ OK ]
> Signaling clvmd to exit [ OK ]
> clvmd terminated [ OK ]
> Stopping lldpad: [ OK ]
> Stopping system message bus: [ OK ]
> Stopping multipathd daemon: [ OK ]
> Stopping rpcbind: [ OK ]
> Stopping auditd: [ OK ]
> Stopping nslcd: [ OK ]
> Shutting down system logger: [ OK ]
> Stopping sssd: [ OK ]
> Stopping gfs dependent services osr(notice) ..bindmounts.. [ OK ]
> Stopping gfs2 dependent services Starting clvmd:
> Activating VG(s): 2 logical volume(s) in volume group "VG_SDATA" now active
> 1 logical volume(s) in volume group "vg_osroot" now active
> [ OK ]
> osr(notice) ..bindmounts.. [ OK ]
> Stopping monitoring for VG VG_SDATA: 1 logical volume(s) in volume group "VG_SDATA" unmonitored
> [ OK ]
> Stopping monitoring for VG vg_osroot: 1 logical volume(s) in volume group "vg_osroot" unmonitored
> [ OK ]
> Sending all processes the TERM signal... [ OK ]
> Sending all processes the KILL signal... [ OK ]
> Saving random seed: [ OK ]
> Syncing hardware clock to system time [ OK ]
> Turning off quotas: quotaoff: Cannot change state of GFS2 quota.
> quotaoff: Cannot change state of GFS2 quota.
> [FAILED]
> Unmounting file systems: [ OK ]
> init: Re-executing /sbin/init
> Halting system...
> osr(notice) Scanning for Bootparameters...
> osr(notice) Starting ATIX exitrd
> osr(notice) Comoonics-Release
> osr(notice) comoonics Community Release 5.0 (Gumpn)
> osr(notice) Internal Version $Revision: 1.18 $ $Date: 2011-02-11 15:09:53 $
> osr(notice) Preparing chrootcp: cannot stat `/mnt/newroot/dev/initctl': No such file or directory
> [ OK ]
> osr(notice) com-realhalt: detected distribution: rhel6, clutype: gfs, rootfs: gfs2
> osr(notice) Restarting init process in chroot[ OK ]
> osr(notice) Moving dev filesystem[ OK ]
> osr(notice) Umounting filesystems in oldroot ( /mnt/newroot/sys /mnt/newroot/proc)
> osr(notice) Umounting /mnt/newroot/sys[ OK ]
> osr(notice) Umounting /mnt/newroot/proc[ OK ]
> osr(notice) Umounting filesystems in oldroot (/mnt/newroot/var/run /mnt/newroot/var/lock /mnt/newroot/.cdsl.local)
> osr(notice) Umounting /mnt/newroot/var/runinit: Re-executing /sbin/init
> [ OK ]
> osr(notice) Umounting /mnt/newroot/var/lock[ OK ]
> osr(notice) Umounting /mnt/newroot/.cdsl.local[ OK ]
> osr(notice) Umounting oldroot /mnt/newroot[ OK ]
> Deactivating clustered VG(s): 0 logical volume(s) in volume group "VG_SDATA" now active
>
> On Tue, Nov 13, 2012 at 2:43 AM, Marc Grimme <gr...@at...> wrote:
>
> Jorge,
> you don't need to be doubtful about the fact that the volume group for the root file system is not flagged as clustered. This has no implications whatsoever on the gfs2 file system.
>
> It will only be a problem whenever the lvm settings of vg_osroot change (size, number of lvs etc.).
>
> Nevertheless, while thinking about your problem I had an idea on how to fix it so that the root vg can be clustered as well. I will provide new packages in the next few days that should deal with the problem.
>
> Keep in mind that there is a difference between cman_tool services and the lvm usage: clvmd only uses the locktable "clvmd" shown by cman_tool services; the other locktables are relevant to the file systems and other services (fenced, rgmanager, ...). This is a completely different use case.
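> For reference, toggling the clustered flag is a one-liner in each direction - untested in your setup, and only safe as long as the vg layout does not change at the same time, as said above:
>
> $ vgchange -cn vg_osroot    # clear the clustered flag
> $ vgchange -cy vg_osroot    # set it again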
> Try to elaborate a bit more on this:
>
> "I am still a bit stuck when nodes with gfs2 mounted don't restart if instructed to do so, but I will read some more."
>
> What do you mean by it? How does this happen? This sounds like something you should have a look at.
>
> "One thing that I can confirm is
> osr(notice): Detecting nodeid & nodename
> This does not always display the correct info, but it doesn't seem to be a problem either?"
>
> You should always look at the nodeid; the nodename is (more or less) only descriptive and might not be set as expected, but the nodeid should always be consistent. Does this help?
>
> About your notes (I'll only take up the relevant ones):
>
> 1. osr(notice): Creating clusterfiles /var/run/cman_admin /var/run/cman_client.. [OK]
> This message should not mislead you: it only says that these control files are being created inside the ramdisk. It has nothing to do with those files on your root file system. Nevertheless, /etc/init.d/bootsr should take over this part and create the files. Please send me another
> bash -x /etc/init.d/bootsr start
> output, taken while those files do not exist.
>
> 2. vgs
>
> VG         #PV #LV #SN Attr   VSize    VFree
> VG_SDATA     1   2   0 wz--nc 1000.00g    0
> vg_osroot    1   1   0 wz--n-   60.00g    0
>
> This is perfectly ok. It only means the vg is not clustered - but the file system IS. The two are not connected.
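> (You can read this directly off the Attr column: the sixth attribute character is 'c' for a clustered vg, as in "wz--nc" for VG_SDATA, and '-' for a non-clustered one, as in "wz--n-" for vg_osroot.)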
> Hope this helps.
> Let me know about the open issues.
>
> Regards,
> Marc.
>
> ----- Original Message -----
> From: "Jorge Silva" <me...@je...>
> To: "Marc Grimme" <gr...@at...>
> Sent: Tuesday, November 13, 2012 2:15:23 AM
> Subject: Re: Problem with VG activation clvmd runs at 100%
>
> Marc
>
> Hi - I believe I have solved my problem, with your help, thank you. Yet I'm not sure how I caused it: the root volume group, as you pointed out, had the clustered attribute set (I must have done something silly along the way). I re-installed from scratch, see the notes below, and then, just to prove that it is a problem, I set the attribute on the rootfs vg again (vgchange -cy) and rebooted, and I ran into trouble; I changed it back and it was fine. So that does cause problems on start-up. I'm not sure I understand why, as there is an active quorum for clvmd to join and take part in.
>
> Despite it not being marked as a clustered volume, cman_tool services shows it as one, but clvmd status doesn't? Is it safe to write to it with multiple nodes mounted?
>
> I am still a bit stuck when nodes with gfs2 mounted don't restart if instructed to do so, but I will read some more.
>
> One thing that I can confirm is
> osr(notice): Detecting nodeid & nodename
> This does not always display the correct info, but it doesn't seem to be a problem either?
>
> Thanks
> Jorge
>
> Notes:
> I decided to start from scratch: I blew away the rootfs and reinstalled as per the website. My assumption is that I had edited something and messed it up (I did look at a lot of the scripts to try to "figure out and fix" the problem; I can send the history if you want, or I can edit and contribute).
>
> I rebooted the server and had an issue - I hadn't disabled selinux, so I had to intervene in the boot stage. That completed, but I noticed that
>
> osr(notice): Starting network configuration for lo0 [OK]
> osr(notice): Detecting nodeid & nodename
>
> is blank; somehow the correct nodeid and name were still deduced.
>
> I had to rebuild the ram disk to get selinux disabled. I also added the following:
> yum install pciutils (mkinitrd warned about this, so I installed it)
> yum install cluster-snmp
> yum install rgmanager
>
> On this reboot I noticed that, despite this message:
>
> osr(notice): Creating clusterfiles /var/run/cman_admin /var/run/cman_client.. [OK]
>
> Starting clvmd: dlm: Using TCP for communications
>
> Activating VG(s): File descriptor 3 (/dev/console) leaked on vgchange invocation. Parent PID 15995: /bin/bash
> File descriptor 4 (/dev/console) leaked on vgchange invocation. Parent PID 15995: /bin/bash
> Skipping clustered volume group VG_SDATA
> 1 logical volume(s) in volume group "vg_osroot" now active
>
> the links weren't created, so I created them manually:
>
> ln -sf /var/comoonics/chroot//var/run/cman_admin /var/run/cman_admin
> ln -sf /var/comoonics/chroot//var/run/cman_client /var/run/cman_client
>
> I could then get clusterstatus etc., and clvmd was running ok.
>
> I looked in /etc/lvm/lvm.conf and locking_type = 4?
>
> I then issued
>
> lvmconf --enable-cluster
>
> and this changed /etc/lvm/lvm.conf to locking_type = 3.
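> (As far as I understand the lvm.conf documentation, locking_type = 1 is plain local locking, 3 is the built-in clustered locking via clvmd, and 4 is read-only locking that forbids any metadata changes - which would explain why the ramdisk, where the vgs must not be changed, ships with 4 while the running system needs 3.)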
> vgscan correctly showed the clustered volumes and everything was working ok.
>
> I did not rebuild the ramdisk (I can confirm that the lvm.conf in the ramdisk has locking_type = 4). I have rebooted and everything is working:
>
> Starting clvmd: dlm: Using TCP for communications
>
> Activating VG(s): File descriptor 3 (/dev/console) leaked on vgchange invocation. Parent PID 15983: /bin/bash
> File descriptor 4 (/dev/console) leaked on vgchange invocation. Parent PID 15983: /bin/bash
> Skipping clustered volume group VG_SDATA
> 1 logical volume(s) in volume group "vg_osroot" now active
>
> I have rebooted a number of times and am confident that things are ok.
>
> I decided to add two other nodes to the mix, and I can confirm that every time a new node is added these files are missing:
>
> /var/run/cman_admin
> /var/run/cman_client
>
> even though I can see from the logs:
>
> osr(notice): Creating clusterfiles /var/run/cman_admin /var/run/cman_client.. [OK]
>
> Besides that message, the information below is also not always detected, but the nodeid etc. is still correct...
>
> osr(notice): Detecting nodeid & nodename
>
> So now I have 3 nodes in the cluster and things look ok:
>
> [root@bwccs302 ~]# cman_tool services
> fence domain
> member count 3
> victim count 0
> victim now 0
> master nodeid 2
> wait state none
> members 2 3 4
>
> dlm lockspaces
> name home
> id 0xf8ee17aa
> flags 0x00000008 fs_reg
> change member 3 joined 1 remove 0 failed 0 seq 3,3
> members 2 3 4
>
> name clvmd
> id 0x4104eefa
> flags 0x00000000
> change member 3 joined 1 remove 0 failed 0 seq 15,15
> members 2 3 4
>
> name OSRoot
> id 0xab5404ad
> flags 0x00000008 fs_reg
> change member 3 joined 1 remove 0 failed 0 seq 7,7
> members 2 3 4
>
> gfs mountgroups
> name home
> id 0x686e3fc4
> flags 0x00000048 mounted
> change member 3 joined 1 remove 0 failed 0 seq 3,3
> members 2 3 4
>
> name OSRoot
> id 0x659f7afe
> flags 0x00000048 mounted
> change member 3 joined 1 remove 0 failed 0 seq 7,7
> members 2 3 4
>
> service clvmd status
> clvmd (pid 25771) is running...
> Clustered Volume Groups: VG_SDATA
> Active clustered Logical Volumes: LV_HOME LV_DEVDB
>
> clvmd doesn't believe that the root file system is clustered, despite the output from the above.
>
> [root@bwccs302 ~]# vgs
> VG         #PV #LV #SN Attr   VSize    VFree
> VG_SDATA     1   2   0 wz--nc 1000.00g    0
> vg_osroot    1   1   0 wz--n-   60.00g    0
>
> The above got me thinking about what you wanted me to do to disable the clustered flag on the root volume - with it left on I was having problems (not sure how it got turned on).
>
> With everything working ok, I remade the ramdisk, and lvm.conf now has locking_type = 3.
>
> The systems start up and things look ok.

--
Marc Grimme
Tel: +49 (0)89 452 35 38-140
Fax: +49 (0)89 452 35 38-290
E-Mail: gr...@at...
ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 | 85716 Unterschleissheim | www.atix.de | www.comoonics.org
Registergericht: Amtsgericht Muenchen, Registernummer: HRB 168930, USt.-Id.: DE209485962 | Vorstand: Marc Grimme, Mark Hlawatschek, Thomas Merz (Vors.) | Vorsitzender des Aufsichtsrats: Dr. Martin Buss