Re: [OSR-users] Relocate a service

Hi Stefano,

Your changes break the logic of the ip.sh resource agent.

For example, with your changes:

#CHANGED !!!
	/sbin/ip -o -f inet addr | awk '{print $1,$2,$4}' | while read idx dev ifaddr; do
		isSlave $dev
		if [ $? -ne 2 ]; then
			continue
		fi

		idx=${idx/:/}
		echo $dev ${ifaddr/\/*/} ${ifaddr/*\//}
	#done < <(/sbin/ip -o -f inet addr | awk '{print $1,$2,$4}')
	done

In the original while loop, the redirection < <(cmd) (bash process substitution)
provides the stdin for the read command; your version replaces it with a pipe.
Please note that this redirection requires the /dev/fd/XX files (see my
previous mail).
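
A quick check for the /dev/fd requirement could look like the sketch below.
The function name check_fd_dir is made up for illustration, and the remark
about /proc/self/fd is an assumption about a typical Linux setup, not
something taken from ip.sh.

```shell
#!/bin/bash
# Hypothetical helper, not part of ip.sh.
# check_fd_dir [DIR]: report whether DIR (default /dev/fd) exposes
# file descriptor entries, which bash needs for < <(cmd).
check_fd_dir() {
	local dir=${1:-/dev/fd}
	if [ -e "$dir/0" ]; then
		echo "ok"
	else
		# Assumption: on many Linux systems /dev/fd is a symlink to
		# /proc/self/fd, so a missing link could be recreated with
		#   ln -s /proc/self/fd /dev/fd
		echo "missing"
	fi
}
```

Calling check_fd_dir without arguments tests /dev/fd itself; a "missing"
result would match the "No such file or directory" errors from rg_test.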

To verify the redirection mechanism try something like this:

# cat < <(ls -l /etc/)
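
One reason the pipe form is not a drop-in replacement: in bash, a
"cmd | while read" loop runs in a subshell, so variables set inside the loop
are lost when it ends, while the "< <(cmd)" form keeps the loop in the current
shell. Whether ip.sh depends on this at the changed lines is an assumption;
the function names below are hypothetical and only illustrate the difference.

```shell
#!/bin/bash
# Hypothetical demo functions, not taken from ip.sh.

# Pipe form: the while loop runs in a subshell, so the increments
# never reach the "count" variable of the calling function.
count_with_pipe() {
	local count=0
	printf 'a\nb\nc\n' | while read -r _line; do
		count=$((count + 1))
	done
	echo "$count"
}

# Process substitution form: the loop runs in the current shell and
# reads from /dev/fd/NN, so "count" keeps its final value.
count_with_procsub() {
	local count=0
	while read -r _line; do
		count=$((count + 1))
	done < <(printf 'a\nb\nc\n')
	echo "$count"
}
```

With bash defaults (the lastpipe option off), count_with_pipe prints 0 and
count_with_procsub prints 3.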

-Mark

On Wednesday 03 June 2009 14:48:59 Stefano Elmopi wrote:
> Hi Mark,
>
> I changed two lines of the script /usr/share/cluster/ip.sh.
> I have attached the script; the lines that I changed are
> immediately below the comment CHANGED.
> Now the httpd service starts on the new IP address (10.43.100.204), and
> if nodo_1 goes down,
> the service is relocated to nodo_2.
> When I start the service, I get these log messages:
>
> Jun  3 13:56:58 clu01 clurgmgrd[14899]: <notice> Starting disabled service service:RHTTPD
> Jun  3 13:56:59 clu01 in.rdiscd[15391]: setsockopt (IP_ADD_MEMBERSHIP): Address already in use
> Jun  3 13:56:59 clu01 in.rdiscd[15391]: Failed joining addresses
> Jun  3 13:57:00 clu01 clurgmgrd[14899]: <notice> Service service:RHTTPD started
>
> Despite this, the httpd service works.
> I hope this information is useful to you.
>
>
> Bye.
>
>
>
>
>
> Ing. Stefano Elmopi
> Gruppo Darco - Area ICT Sistemi
> Via Ostiense 131/L Corpo B, 00154 Roma
>
> cell. 3466147165
> tel.  0657060500
> email:ste...@so...
>
> On 01 Jun 2009, at 11:40, Stefano Elmopi wrote:
> > Hi Mark,
> >
> > my cluster.conf is:
> >
> > <?xml version="1.0"?>
> > <cluster config_version="5" name="cluOCFS2" type="ocfs2">
> >
> >   <cman expected_votes="1" two_node="1"/>
> >
> >     <clusternodes>
> >
> >        <clusternode name="clu01" votes="1" nodeid="1">
> >            <com_info>
> >              <syslog name="clu01"/>
> >              <rootvolume name="/dev/sda2" fstype="ocfs2"/>
> >              <eth name="eth0" ip="10.43.100.203" mac="00:15:60:56:75:FD"/>
> >              <fenceackserver user="root" passwd="test123"/>
> >            </com_info>
> >        </clusternode>
> >
> >        <clusternode name="clu02" votes="1" nodeid="2">
> >            <com_info>
> >              <syslog name="clu01"/>
> >              <rootvolume name="/dev/sda2" fstype="ocfs2"/>
> >              <eth name="eth0" ip="10.43.105.15" mac="00:15:60:56:77:11"/>
> >              <fenceackserver user="root" passwd="test123"/>
> >            </com_info>
> >         </clusternode>
> >
> >        <rm log_level="7" log_facility="local4">
> >                 <failoverdomains>
> >                         <failoverdomain name="failover" ordered="1">
> >                                 <failoverdomainnode name="clu01" priority="1"/>
> >                                 <failoverdomainnode name="clu02" priority="2"/>
> >                         </failoverdomain>
> >                  </failoverdomains>
> >                  <resources>
> >                         <ip address="10.43.100.204" monitor_link="1"/>
> >                         <script file="/etc/init.d/httpd" name="rhttpd"/>
> >                  </resources>
> >                 <service autostart="0" domain="failover" name="RHTTPD">
> >                         <ip ref="10.43.100.204"/>
> >                         <script ref="rhttpd"/>
> >                 </service>
> >        </rm>
> >
> >      </clusternodes>
> >
> > </cluster>
> >
> > and I added the line from your email to /etc/syslog.conf:
> >
> > local4.debug      /var/log/rgmanager.log
> >
> > Then I restarted syslog, but rgmanager.log only receives entries
> > when CMAN starts,
> > while rgmanager logs only to /var/log/messages, and
> > there is no additional information there.
> > Perhaps this tool gives some additional information:
> >
> > rg_test test /etc/cluster/cluster.conf start service RHTTPD
> > Running in test mode.
> > Starting RHTTPD...
> > /usr/share/cluster/ip.sh: line 583: /dev/fd/62: No such file or directory
> > /usr/share/cluster/ip.sh: line 673: /dev/fd/62: No such file or directory
> > Failed to start RHTTPD
> > /usr/share/cluster/ip.sh: line 583: /dev/fd/61: No such file or directory
> > +++ Memory table dump +++
> >   0xb77306e4 (8 bytes) allocation trace:
> >   0xb7734e74 (8 bytes) allocation trace:
> >   0xb774aa6c (16 bytes) allocation trace:
> >   0xb774b8d0 (16 bytes) allocation trace:
> >   0xb77357f0 (16 bytes) allocation trace:
> >   0xb774a9f4 (52 bytes) allocation trace:
> >   0xb7741194 (912 bytes) allocation trace:
> > --- End Memory table dump ---
> >
> >
> >
> >
> > Bye
> >
> >
> >
> > On 28 May 2009, at 21:00, Marc Grimme wrote:
> >> On Thursday 28 May 2009 17:14:49 Stefano Elmopi wrote:
> >>> Hi Mark,
> >>>
> >>> I have changed the service element from:
> >>>
> >>> <service autostart="0" domain="failover" name="RHTTPD">
> >>>    <ip ref="10.43.100.204"/>
> >>>    <script ref="/etc/init.d/httpd"/>
> >>> </service>
> >>>
> >>> to:
> >>>
> >>> <service autostart="0" domain="failover" name="RHTTPD">
> >>>    <ip ref="10.43.100.204"/>
> >>>    <script ref="httpd"/>
> >>> </service>
> >>>
> >>> but the result does not change: if I run clusvcadm -e RHTTPD the
> >>> service fails and the messages log shows:
> >>>
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Starting disabled service service:RHTTPD
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> start on ip "10.43.100.204" returned 1 (generic error)
> >>
> >> Hmm, you could extend logging by catching debug messages from
> >> rgmanager: add the line
> >> local4.debug      /var/log/rgmanager.log
> >> to /etc/syslog.conf, then restart syslog.
> >> See if you can get more information from this file.
> >>
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <warning> #68: Failed to start service:RHTTPD; return value: 1
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Stopping service service:RHTTPD
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Service service:RHTTPD is recovering
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <warning> #71: Relocating failed service service:RHTTPD
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Service service:RHTTPD is stopped
> >>>
> >>> One question: when rgmanager starts, shouldn't I be able to ping
> >>> the IP address 10.43.100.204?
> >>>
> >>> The output of the rg_test tool is:
> >>>
> >>> [root@clu01 ~]# rg_test test /etc/cluster/cluster.conf
> >>> Running in test mode.
> >>> Loaded 22 resource rules
> >>> === Resources List ===
> >>> Resource type: script
> >>> Agent: script.sh
> >>> Attributes:
> >>>   name = httpd [ primary unique ]
> >>>   file = /etc/init.d/httpd [ unique required ]
> >>>   service_name [ inherit("service%name") ]
> >>>
> >>> Resource type: ip
> >>> Instances: 1/1
> >>> Agent: ip.sh
> >>> Attributes:
> >>>   address = 10.43.100.204 [ primary unique ]
> >>>   monitor_link = 1
> >>>   nfslock [ inherit("service%nfslock") ]
> >>>
> >>> Resource type: service [INLINE]
> >>> Instances: 1/1
> >>> Agent: service.sh
> >>> Attributes:
> >>>   name = RHTTPD [ primary unique required ]
> >>>   domain = failover [ reconfig ]
> >>>   autostart = 0 [ reconfig ]
> >>>   hardrecovery = 0 [ reconfig ]
> >>>   exclusive = 0 [ reconfig ]
> >>>   nfslock = 0
> >>>   recovery = restart [ reconfig ]
> >>>   depend_mode = hard
> >>>   max_restarts = 0
> >>>   restart_expire_time = 0
> >>>
> >>> === Resource Tree ===
> >>> service {
> >>>   name = "RHTTPD";
> >>>   domain = "failover";
> >>>   autostart = "0";
> >>>   hardrecovery = "0";
> >>>   exclusive = "0";
> >>>   nfslock = "0";
> >>>   recovery = "restart";
> >>>   depend_mode = "hard";
> >>>   max_restarts = "0";
> >>>   restart_expire_time = "0";
> >>>   ip {
> >>>     address = "10.43.100.204";
> >>>     monitor_link = "1";
> >>>     nfslock = "0";
> >>>   }
> >>>   script {
> >>>     name = "httpd";
> >>>     file = "/etc/init.d/httpd";
> >>>     service_name = "RHTTPD";
> >>>   }
> >>> }
> >>> === Failover Domains ===
> >>> Failover domain: failover
> >>> Flags: Ordered
> >>>   Node clu01 (id 1, priority 1)
> >>>   Node clu02 (id 2, priority 2)
> >>> === Event Triggers ===
> >>> Event Priority Level 100:
> >>>   Name: Default
> >>>     (Any event)
> >>>     File: /usr/share/cluster/default_event_script.sl
> >>> +++ Memory table dump +++
> >>>   0xb77756e4 (8 bytes) allocation trace:
> >>>   0xb7779e74 (8 bytes) allocation trace:
> >>>   0xb778fce4 (52 bytes) allocation trace:
> >>> --- End Memory table dump ---
> >>>
> >>>
> >>> If I add the line:
> >>>
> >>> <eth name="eth1" ip="10.43.100.204" mac="00:15:60:56:75:FC"/>
> >>>
> >>> to the <com_info> section of clu01, the service starts:
> >>>
> >>> /etc/init.d/rgmanager start
> >>> Starting Cluster Service Manager:                          [  OK  ]
> >>>
> >>> the log is:
> >>>
> >>> May 28 16:59:21 clu01 kernel: dlm: Using TCP for communications
> >>> May 28 16:59:30 clu01 clurgmgrd[15209]: <notice> Resource Group Manager Starting
> >>> May 28 16:59:31 clu01 clurgmgrd: [15209]: <err> Failed to remove 10.43.100.204
> >>> May 28 16:59:31 clu01 clurgmgrd[15209]: <notice> stop on ip "10.43.100.204" returned 1 (generic error)
> >>
> >> That's clear. This IP is already set up by the boot process, so it
> >> cannot be set up again.
> >>
> >>> clustat
> >>> Cluster Status for cluOCFS2 @ Thu May 28 17:00:22 2009
> >>> Member Status: Quorate
> >>>
> >>>  Member Name                  ID   Status
> >>>  ------ ----                  ---- ------
> >>>  clu01                           1 Online, Local, rgmanager
> >>>  clu02                           2 Offline
> >>>
> >>>  Service Name                 Owner (Last)                 State
> >>>  ------- ----                 ----- ------                 -----
> >>>  service:RHTTPD               (none)                       disabled
> >>>
> >>>  and:
> >>>
> >>>  clusvcadm -e RHTTPD
> >>> Local machine trying to enable service:RHTTPD...Success
> >>> service:RHTTPD is now running on clu01
> >>>
> >>> but in this case the service does not relocate with the same IP!
> >>>
> >>>
> >>>
> >>> Bye
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >> --
> >> Gruss / Regards,
> >>
> >> Marc Grimme
> >> http://www.atix.de/               http://www.open-sharedroot.org/