From: Stefano E. <ste...@so...> - 2009-12-17 11:40:19
Hi Mark,

Yes, now the problem is solved.
I had forgotten to change the file /etc/hosts.

Thanks.
Bye
Stefano

Stefano,

is your problem solved now?

-Mark

On Monday 31 August 2009 15:00:06 Stefano Elmopi wrote:
> Hi Mark,
>
> excuse me for answering only now, but I was on vacation.
> I am writing to confirm that the problem was that I didn't change the
> file /etc/hosts...
> ...as we say in Italy... I'm lost in a glass of water!!
>
> Thanks,
>
> Stefano
>
>
> Date: Wed, 1 Jul 2009 17:33:48 +0200
> From: Mark Hlawatschek <hla...@at...>
> Subject: Re: [OSR-users] Problem with rgmanager
> To: ope...@li...
> Message-ID: <200...@at...>
> Content-Type: text/plain; charset="utf-8"
>
> Stefano,
>
> could you please give us an overview of your network setup?
> # cat /etc/hosts
> # ip addr
>
> Could you also send me the output of the following commands:
> # cman_tool status
> # cman_tool nodes
>
> Thanks,
>
> Mark
>
>
> Ing. Stefano Elmopi
> Gruppo Darco - Area ICT Sistemi
> Via Ostiense 131/L Corpo B, 00154 Roma
>
> cell. 3466147165
> tel. 0657060500
> email:ste...@so...
>
> On 01 Jul 09, at 14:16, Stefano Elmopi wrote:
>> Hi,
>>
>> Something strange is happening to me. I created a cluster with two nodes,
>> clu01 and clu02, with the Shared-Root on a SAN.
>> The node clu01 has the IP address 10.43.100.203:
>>
>> <clusternode name="clu01" votes="1" nodeid="1">
>>   <com_info>
>>     <syslog name="clu01"/>
>>     <rootvolume name="/dev/sda2" fstype="ocfs2"/>
>>     <eth name="eth0" ip="10.43.100.203" mac="00:15:60:56:75:FD"/>
>>     <fenceackserver user="root" passwd="test123"/>
>>   </com_info>
>> </clusternode>
>>
>> I also configured the Httpd service on the cluster and everything
>> worked well.
>> I then had to change the IP address of node 1 (to 10.43.105.10), so I
>> preferred to run the procedure again,
>> formatting the Shared-Root but not the server clu01.
>> The cluster starts with the new IP address, and when I start
>> rgmanager:
>>
>> /etc/init.d/rgmanager start
>>
>> everything seems OK,
>> but in the log file I read:
>>
>> Jun 27 10:13:14 clu01 kernel: dlm: Using TCP for communications
>> Jun 27 10:13:14 clu01 kernel: dlm: Can't create listening comms socket
>> Jun 27 10:13:14 clu01 kernel: dlm: cannot start dlm lowcomms -98
>>
>> and the output of the command:
>>
>> clustat
>> Cluster Status for cluOCFS2 @ Wed Jul 1 13:35:10 2009
>> Member Status: Quorate
>>
>>  Member Name        ID   Status
>>  ------ ----        ---- ------
>>  clu01                 1 Online, Local
>>  clu02                 2 Offline
>>
>>
>> The part about the service is missing.
>> If I try to restart rgmanager, the log shows:
>>
>> Jun 28 04:02:08 clu01 syslogd 1.4.1: restart.
>> Jul 1 13:37:31 clu01 kernel: dlm: Using TCP for communications
>> Jul 1 13:37:31 clu01 kernel: dlm: Can't create listening comms socket
>> Jul 1 13:37:41 clu01 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [clurgmgrd:13230]
>> Jul 1 13:37:41 clu01 kernel:
>> Jul 1 13:37:41 clu01 kernel: Pid: 13230, comm: clurgmgrd
>> Jul 1 13:37:41 clu01 kernel: EIP: 0060:[<c0608d90>] CPU: 0
>> Jul 1 13:37:41 clu01 kernel: EIP is at _spin_lock+0x7/0xf
>> Jul 1 13:37:41 clu01 kernel: EFLAGS: 00000286 Tainted: G (2.6.18-92.1.22.el5PAE #1)
>> Jul 1 13:37:41 clu01 kernel: EAX: f1d93a98 EBX: f1d93a94 ECX: 00000000 EDX: e1958000
>> Jul 1 13:37:41 clu01 kernel: ESI: f1d93a94 EDI: f1e31000 EBP: e1958ebc DS: 007b ES: 007b
>> Jul 1 13:37:41 clu01 kernel: CR0: 8005003b CR2: b7f48000 CR3: 37caef00 CR4: 000006f0
>> Jul 1 13:37:41 clu01 kernel: [<c06080ef>] __mutex_lock_slowpath+0x19/0x7c
>> Jul 1 13:37:41 clu01 kernel: [<c0608161>] .text.lock.mutex+0xf/0x14
>> Jul 1 13:37:41 clu01 kernel: [<f8c2ff6b>] close_connection+0x11/0x5a [dlm]
>> Jul 1 13:37:41 clu01 kernel: [<f8c308fd>] dlm_lowcomms_start+0x53e/0x59c [dlm]
>> Jul 1 13:37:41 clu01 kernel: [<c06076a4>] schedule+0x920/0x9cd
>> Jul 1 13:37:41 clu01 kernel: [<f8c2e879>] dlm_new_lockspace+0x87/0x742 [dlm]
>> Jul 1 13:37:41 clu01 kernel: [<f8c33d38>] device_write+0x310/0x4b6 [dlm]
>> Jul 1 13:37:41 clu01 kernel: [<f8c33a28>] device_write+0x0/0x4b6 [dlm]
>> Jul 1 13:37:41 clu01 kernel: [<c0470283>] vfs_write+0xa1/0x143
>> Jul 1 13:37:41 clu01 kernel: [<c0470875>] sys_write+0x3c/0x63
>> Jul 1 13:37:41 clu01 kernel: [<c0404eff>] syscall_call+0x7/0xb
>> Jul 1 13:37:41 clu01 kernel: =======================
>>
>>
>> If I change the file cluster.conf, put back the old IP
>> (10.43.100.203) and create a new initrd,
>> rgmanager works well.
>> This happens even with another address in the same subnet (10.43.100);
>> in practice it seems to work only with the single IP address with
>> which the cluster was originally created!
>>
>>
>> Thanks.
>>
>>
>> Ing. Stefano Elmopi
>> Gruppo Darco - Area ICT Sistemi
>> Via Ostiense 131/L Corpo B, 00154 Roma
>>
>> cell. 3466147165
>> tel. 0657060500
>> email:ste...@so...

Ing. Stefano Elmopi
Gruppo Darco - Resp. ICT Sistemi
Via Ostiense 131/L Corpo B, 00154 Roma

cell. 3466147165
tel. 0657060500
email:ste...@so...
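
For anyone who finds this thread later: the cause reported above was an /etc/hosts entry that still carried the old address after the node's IP was changed in cluster.conf. The following is a minimal sketch, not part of the original exchange, of how the two files could be compared before rebuilding the initrd. The function names (`parse_hosts`, `find_mismatches`) and the sample hosts content are hypothetical, and the sketch assumes every `<eth>` entry of a node carries a literal IP that the node name should resolve to, which will not hold for nodes with several interfaces.

```python
import xml.etree.ElementTree as ET


def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # keep the first entry for a name
    return table


def find_mismatches(cluster_xml, hosts_text):
    """Return (node, ip in cluster.conf, ip in hosts) for each disagreement."""
    hosts = parse_hosts(hosts_text)
    root = ET.fromstring(cluster_xml)
    mismatches = []
    for node in root.iter("clusternode"):
        name = node.get("name")
        for eth in node.iter("eth"):
            conf_ip = eth.get("ip")
            hosts_ip = hosts.get(name)  # None if the name is not listed
            if conf_ip and conf_ip != hosts_ip:
                mismatches.append((name, conf_ip, hosts_ip))
    return mismatches


# Sample data: the node entry from the thread with the new address, and a
# hypothetical /etc/hosts that still lists the old one.
CLUSTER_XML = """
<clusternodes>
  <clusternode name="clu01" votes="1" nodeid="1">
    <com_info>
      <eth name="eth0" ip="10.43.105.10" mac="00:15:60:56:75:FD"/>
    </com_info>
  </clusternode>
</clusternodes>
"""

STALE_HOSTS = """
127.0.0.1      localhost
10.43.100.203  clu01
"""

if __name__ == "__main__":
    for name, conf_ip, hosts_ip in find_mismatches(CLUSTER_XML, STALE_HOSTS):
        print(f"{name}: cluster.conf has {conf_ip}, /etc/hosts has {hosts_ip}")
```

With the sample data the script reports clu01, because cluster.conf says 10.43.105.10 while the hosts text still says 10.43.100.203. On a real node the two strings would be read from the actual cluster.conf and /etc/hosts files instead.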