From: Mark H. <hla...@at...> - 2009-08-31 13:56:10
Stefano,

is your problem solved now?

-Mark

On Monday 31 August 2009 15:00:06 Stefano Elmopi wrote:
> Hi Mark,
>
> sorry for answering only now, but I was on vacation.
> I am writing to confirm that the problem was that I didn't change the
> file /etc/hosts.......
> ...... as we say in Italy...... I'm lost in a glass of water !!
>
>
> Thanks,
>
> Stefano
>
>
>
> Date: Wed, 1 Jul 2009 17:33:48 +0200
> From: Mark Hlawatschek <hla...@at...>
> Subject: Re: [OSR-users] Problem with rgmanager
> To: ope...@li...
> Message-ID: <200...@at...>
> Content-Type: text/plain; charset="utf-8"
>
> Stefano,
>
> could you please give us an overview of your network setup?
> # cat /etc/hosts
> # ip addr
>
> Could you also send me the output of the following commands:
> # cman_tool status
> # cman_tool nodes
>
> Thanks,
>
> Mark
>
>
>
> Ing. Stefano Elmopi
> Gruppo Darco - Area ICT Sistemi
> Via Ostiense 131/L Corpo B, 00154 Roma
>
> cell. 3466147165
> tel. 0657060500
> email:ste...@so...
>
> On 01 Jul 2009, at 14:16, Stefano Elmopi wrote:
> > Hi,
> >
> > something strange is happening to me. I created a cluster with two nodes,
> > clu01 and clu02,
> > with the Shared-Root on a SAN. The node clu01 has the IP address
> > 10.43.100.203
> >
> > <clusternode name="clu01" votes="1" nodeid="1">
> >   <com_info>
> >     <syslog name="clu01"/>
> >     <rootvolume name="/dev/sda2" fstype="ocfs2"/>
> >     <eth name="eth0" ip="10.43.100.203" mac="00:15:60:56:75:FD"/>
> >     <fenceackserver user="root" passwd="test123"/>
> >   </com_info>
> > </clusternode>
> >
> > I also configured the httpd service on the cluster and everything
> > worked well.
> > I then had to change the IP address of node 1 (to 10.43.105.10), so I
> > preferred to redo the procedure,
> > formatting the Shared-Root but not the server clu01.
> > The cluster starts with the new IP address, and when I start
> > rgmanager:
> >
> > /etc/init.d/rgmanager start
> >
> > everything seems ok
> > but in the log file I read:
> >
> > Jun 27 10:13:14 clu01 kernel: dlm: Using TCP for communications
> > Jun 27 10:13:14 clu01 kernel: dlm: Can't create listening comms socket
> > Jun 27 10:13:14 clu01 kernel: dlm: cannot start dlm lowcomms -98
> >
> > and the output of the command:
> >
> > clustat
> > Cluster Status for cluOCFS2 @ Wed Jul 1 13:35:10 2009
> > Member Status: Quorate
> >
> > Member Name                  ID   Status
> > ------ ----                  ---- ------
> > clu01                        1    Online, Local
> > clu02                        2    Offline
> >
> >
> > The part about the service is missing.
> > If I try to restart rgmanager, the log shows:
> >
> > Jun 28 04:02:08 clu01 syslogd 1.4.1: restart.
> > Jul 1 13:37:31 clu01 kernel: dlm: Using TCP for communications
> > Jul 1 13:37:31 clu01 kernel: dlm: Can't create listening comms socket
> > Jul 1 13:37:41 clu01 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [clurgmgrd:13230]
> > Jul 1 13:37:41 clu01 kernel:
> > Jul 1 13:37:41 clu01 kernel: Pid: 13230, comm: clurgmgrd
> > Jul 1 13:37:41 clu01 kernel: EIP: 0060:[<c0608d90>] CPU: 0
> > Jul 1 13:37:41 clu01 kernel: EIP is at _spin_lock+0x7/0xf
> > Jul 1 13:37:41 clu01 kernel: EFLAGS: 00000286 Tainted: G (2.6.18-92.1.22.el5PAE #1)
> > Jul 1 13:37:41 clu01 kernel: EAX: f1d93a98 EBX: f1d93a94 ECX: 00000000 EDX: e1958000
> > Jul 1 13:37:41 clu01 kernel: ESI: f1d93a94 EDI: f1e31000 EBP: e1958ebc DS: 007b ES: 007b
> > Jul 1 13:37:41 clu01 kernel: CR0: 8005003b CR2: b7f48000 CR3: 37caef00 CR4: 000006f0
> > Jul 1 13:37:41 clu01 kernel: [<c06080ef>] __mutex_lock_slowpath+0x19/0x7c
> > Jul 1 13:37:41 clu01 kernel: [<c0608161>] .text.lock.mutex+0xf/0x14
> > Jul 1 13:37:41 clu01 kernel: [<f8c2ff6b>] close_connection+0x11/0x5a [dlm]
> > Jul 1 13:37:41 clu01 kernel: [<f8c308fd>] dlm_lowcomms_start+0x53e/0x59c [dlm]
> > Jul 1 13:37:41 clu01 kernel: [<c06076a4>] schedule+0x920/0x9cd
> > Jul 1 13:37:41 clu01 kernel: [<f8c2e879>] dlm_new_lockspace+0x87/0x742 [dlm]
> > Jul 1 13:37:41 clu01 kernel: [<f8c33d38>] device_write+0x310/0x4b6 [dlm]
> > Jul 1 13:37:41 clu01 kernel: [<f8c33a28>] device_write+0x0/0x4b6 [dlm]
> > Jul 1 13:37:41 clu01 kernel: [<c0470283>] vfs_write+0xa1/0x143
> > Jul 1 13:37:41 clu01 kernel: [<c0470875>] sys_write+0x3c/0x63
> > Jul 1 13:37:41 clu01 kernel: [<c0404eff>] syscall_call+0x7/0xb
> > Jul 1 13:37:41 clu01 kernel: =======================
> >
> >
> >
> > If I edit the file cluster.conf to put back the old IP
> > (10.43.100.203) and create a new initrd,
> > rgmanager works well.
> > This happens even with another address in the same 10.43.100 subnet;
> > in practice it seems to work only
> > with the single IP address with which the cluster was originally
> > created!
> >
> >
> > Thanks.
> >
> >
> >
> > Ing. Stefano Elmopi
> > Gruppo Darco - Area ICT Sistemi
> > Via Ostiense 131/L Corpo B, 00154 Roma
> >
> > cell. 3466147165
> > tel. 0657060500
> > email:ste...@so...
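
The cause confirmed in this thread was an /etc/hosts entry that still carried the old address after the node's IP was changed in cluster.conf. A rough way to catch that kind of drift is to compare the two files before starting rgmanager. The Python sketch below is only an illustration under assumptions: the function names (hosts_entries, find_mismatches) are hypothetical, the /etc/hosts content is made up (the thread never shows it), and the clusternode fragment is wrapped in a bare <cluster> root purely so that it parses. It is not part of Open-Sharedroot or rgmanager.

```python
import xml.etree.ElementTree as ET


def hosts_entries(hosts_text):
    """Map every hostname/alias in /etc/hosts-style text to its first listed IP."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name, ip)  # first match wins, as in resolver lookups
    return mapping


def find_mismatches(cluster_conf_xml, hosts_text):
    """Return (node, ip_in_cluster_conf, ip_in_hosts) for every node that disagrees."""
    root = ET.fromstring(cluster_conf_xml)
    hosts = hosts_entries(hosts_text)
    mismatches = []
    for node in root.iter("clusternode"):
        eth = node.find("./com_info/eth")
        if eth is None:
            continue  # node without an <eth> entry: nothing to compare
        name, conf_ip = node.get("name"), eth.get("ip")
        hosts_ip = hosts.get(name)  # None if the node is not listed at all
        if hosts_ip != conf_ip:
            mismatches.append((name, conf_ip, hosts_ip))
    return mismatches


if __name__ == "__main__":
    # Hypothetical inputs: cluster.conf already has the new IP,
    # while /etc/hosts still has the old one.
    conf = """
    <cluster>
      <clusternode name="clu01" votes="1" nodeid="1">
        <com_info>
          <eth name="eth0" ip="10.43.105.10" mac="00:15:60:56:75:FD"/>
        </com_info>
      </clusternode>
    </cluster>
    """
    hosts = "127.0.0.1 localhost\n10.43.100.203 clu01\n"
    for name, conf_ip, hosts_ip in find_mismatches(conf, hosts):
        print(f"{name}: cluster.conf has {conf_ip}, /etc/hosts has {hosts_ip}")
```

On a real node the two strings would be read from the actual cluster.conf and /etc/hosts. An empty result only means the two files agree with each other; it says nothing about what `ip addr` reports.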