From: Stefano E. <ste...@so...> - 2009-08-31 13:23:28
Hi Mark,

sorry for answering only now, but I was on vacation.
I am writing to confirm that the problem was that I had not changed the file /etc/hosts.
As we say in Italy, I got lost in a glass of water!

Thanks,
Stefano

Date: Wed, 1 Jul 2009 17:33:48 +0200
From: Mark Hlawatschek <hla...@at...>
Subject: Re: [OSR-users] Problem with rgmanager
To: ope...@li...
Message-ID: <200...@at...>
Content-Type: text/plain; charset="utf-8"

Stefano,

could you please give us an overview of your network setup?

# cat /etc/hosts
# ip addr

Could you also send me the output of the following commands:

# cman_tool status
# cman_tool nodes

Thanks,
Mark

Ing. Stefano Elmopi
Gruppo Darco - Area ICT Sistemi
Via Ostiense 131/L Corpo B, 00154 Roma

cell. 3466147165
tel. 0657060500
email:ste...@so...

On 01 Jul 2009, at 14:16, Stefano Elmopi wrote:

> Hi,
>
> Something strange is happening. I created a cluster with two nodes,
> clu01 and clu02, with the Shared-Root on a SAN.
> The node clu01 has the IP address 10.43.100.203:
>
> <clusternode name="clu01" votes="1" nodeid="1">
>   <com_info>
>     <syslog name="clu01"/>
>     <rootvolume name="/dev/sda2" fstype="ocfs2"/>
>     <eth name="eth0" ip="10.43.100.203" mac="00:15:60:56:75:FD"/>
>     <fenceackserver user="root" passwd="test123"/>
>   </com_info>
> </clusternode>
>
> I also configured the Httpd service on the cluster and everything
> worked well.
> I then had to change the IP address of node 1 (to 10.43.105.10), so I
> preferred to do the procedure again, formatting the Shared-Root but
> not the server clu01.
> The cluster starts with the new IP address, and when I start
> rgmanager:
>
> /etc/init.d/rgmanager start
>
> everything seems OK, but in the log file I read:
>
> Jun 27 10:13:14 clu01 kernel: dlm: Using TCP for communications
> Jun 27 10:13:14 clu01 kernel: dlm: Can't create listening comms socket
> Jun 27 10:13:14 clu01 kernel: dlm: cannot start dlm lowcomms -98
>
> and the output of the command clustat is:
>
> Cluster Status for cluOCFS2 @ Wed Jul 1 13:35:10 2009
> Member Status: Quorate
>
>  Member Name          ID   Status
>  ------ ----          ---- ------
>  clu01                   1 Online, Local
>  clu02                   2 Offline
>
> The part about the service is missing.
> If I try to restart rgmanager, the log is:
>
> Jun 28 04:02:08 clu01 syslogd 1.4.1: restart.
> Jul 1 13:37:31 clu01 kernel: dlm: Using TCP for communications
> Jul 1 13:37:31 clu01 kernel: dlm: Can't create listening comms socket
> Jul 1 13:37:41 clu01 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [clurgmgrd:13230]
> Jul 1 13:37:41 clu01 kernel:
> Jul 1 13:37:41 clu01 kernel: Pid: 13230, comm: clurgmgrd
> Jul 1 13:37:41 clu01 kernel: EIP: 0060:[<c0608d90>] CPU: 0
> Jul 1 13:37:41 clu01 kernel: EIP is at _spin_lock+0x7/0xf
> Jul 1 13:37:41 clu01 kernel: EFLAGS: 00000286 Tainted: G (2.6.18-92.1.22.el5PAE #1)
> Jul 1 13:37:41 clu01 kernel: EAX: f1d93a98 EBX: f1d93a94 ECX: 00000000 EDX: e1958000
> Jul 1 13:37:41 clu01 kernel: ESI: f1d93a94 EDI: f1e31000 EBP: e1958ebc DS: 007b ES: 007b
> Jul 1 13:37:41 clu01 kernel: CR0: 8005003b CR2: b7f48000 CR3: 37caef00 CR4: 000006f0
> Jul 1 13:37:41 clu01 kernel: [<c06080ef>] __mutex_lock_slowpath+0x19/0x7c
> Jul 1 13:37:41 clu01 kernel: [<c0608161>] .text.lock.mutex+0xf/0x14
> Jul 1 13:37:41 clu01 kernel: [<f8c2ff6b>] close_connection+0x11/0x5a [dlm]
> Jul 1 13:37:41 clu01 kernel: [<f8c308fd>] dlm_lowcomms_start+0x53e/0x59c [dlm]
> Jul 1 13:37:41 clu01 kernel: [<c06076a4>] schedule+0x920/0x9cd
> Jul 1 13:37:41 clu01 kernel: [<f8c2e879>] dlm_new_lockspace+0x87/0x742 [dlm]
> Jul 1 13:37:41 clu01 kernel: [<f8c33d38>] device_write+0x310/0x4b6 [dlm]
> Jul 1 13:37:41 clu01 kernel: [<f8c33a28>] device_write+0x0/0x4b6 [dlm]
> Jul 1 13:37:41 clu01 kernel: [<c0470283>] vfs_write+0xa1/0x143
> Jul 1 13:37:41 clu01 kernel: [<c0470875>] sys_write+0x3c/0x63
> Jul 1 13:37:41 clu01 kernel: [<c0404eff>] syscall_call+0x7/0xb
> Jul 1 13:37:41 clu01 kernel: =======================
>
> If I change the file cluster.conf, put back the old IP
> (10.43.100.203) and create a new initrd, rgmanager works well.
> This happens even with another address in the same subnet
> (10.43.100); in practice it seems to work only with the single IP
> address with which the cluster was originally created!
>
> Thanks.
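
For anyone hitting the same symptom: the cause confirmed in this thread was an /etc/hosts file that was not updated when the node's IP address changed in cluster.conf. The following is only a rough sketch of how the two files could be cross-checked. The function names (`parse_hosts`, `cluster_node_ips`, `find_mismatches`) are made up for illustration, and the sample /etc/hosts content is an assumption, since the real file was never posted; only the `<clusternode>` fragment layout comes from the message above.

```python
import xml.etree.ElementTree as ET


def parse_hosts(text):
    """Map each hostname or alias in /etc/hosts-style text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


def cluster_node_ips(xml_text):
    """Map each clusternode name to the ip attribute of its <eth> element."""
    root = ET.fromstring(xml_text)
    result = {}
    for node in root.iter("clusternode"):
        eth = node.find("./com_info/eth")
        if eth is not None and eth.get("ip"):
            result[node.get("name")] = eth.get("ip")
    return result


def find_mismatches(hosts_text, xml_text):
    """Return (node, ip_in_cluster_conf, ip_in_hosts) for every disagreement."""
    hosts = parse_hosts(hosts_text)
    return [
        (name, ip, hosts.get(name))
        for name, ip in cluster_node_ips(xml_text).items()
        if hosts.get(name) != ip
    ]


if __name__ == "__main__":
    # Hypothetical /etc/hosts that still carries the old address.
    sample_hosts = "127.0.0.1 localhost\n10.43.100.203 clu01\n"
    # Fragment modelled on the one in the thread, with the new address.
    sample_conf = (
        '<clusternode name="clu01" votes="1" nodeid="1"><com_info>'
        '<eth name="eth0" ip="10.43.105.10" mac="00:15:60:56:75:FD"/>'
        "</com_info></clusternode>"
    )
    for name, conf_ip, hosts_ip in find_mismatches(sample_hosts, sample_conf):
        print(f"{name}: cluster.conf has {conf_ip}, /etc/hosts has {hosts_ip}")
```

On a real node the two strings would be read from /etc/hosts and from the cluster.conf in use; a node name missing from /etc/hosts is reported with `None` as its hosts address.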