[OSR-users] Relocate a service

Hi Mark,

I changed two lines of the script /usr/share/cluster/ip.sh.
I have attached the script; the lines I changed are immediately
below the word CHANGED.
Now the httpd service starts on the new IP address (10.43.100.204), and
if nodo_1 goes down, the service is relocated to nodo_2.
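
To illustrate the relocation behaviour described above, here is a minimal sketch. It assumes that an ordered failover domain simply prefers the online node with the lowest priority number. The function name pick_owner and the "node:priority" input format are hypothetical; this is a simplified model, not rgmanager's actual code.

```shell
#!/bin/sh
# pick_owner: hypothetical helper, a simplified model of an ordered
# failover domain (NOT rgmanager's implementation).
#   $1 = space-separated "node:priority" pairs
#   $2 = space-separated list of nodes that are currently online
# Prints the online node with the lowest priority number, or nothing
# if no domain member is online.
pick_owner() {
    domain=$1
    online=$2
    best=""
    best_prio=""
    for entry in $domain; do
        node=${entry%%:*}
        prio=${entry##*:}
        # Skip nodes that are not currently online.
        case " $online " in
            *" $node "*) ;;
            *) continue ;;
        esac
        if [ -z "$best" ] || [ "$prio" -lt "$best_prio" ]; then
            best=$node
            best_prio=$prio
        fi
    done
    if [ -n "$best" ]; then
        printf '%s\n' "$best"
    fi
}

pick_owner "clu01:1 clu02:2" "clu01 clu02"   # both nodes up: prints clu01
pick_owner "clu01:1 clu02:2" "clu02"         # clu01 down: prints clu02
```

With the priorities from the failover domain in this thread (clu01 = 1, clu02 = 2), the sketch picks clu01 while it is online and falls back to clu02 otherwise.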
When I start the service, the log shows these messages:

Jun  3 13:56:58 clu01 clurgmgrd[14899]: <notice> Starting disabled service service:RHTTPD
Jun  3 13:56:59 clu01 in.rdiscd[15391]: setsockopt (IP_ADD_MEMBERSHIP): Address already in use
Jun  3 13:56:59 clu01 in.rdiscd[15391]: Failed joining addresses
Jun  3 13:57:00 clu01 clurgmgrd[14899]: <notice> Service service:RHTTPD started

Despite these messages, the httpd service works.
I hope this information is useful to you.
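
For reference, the log check described above can be sketched as a small shell helper. The function names service_started and log_sample are hypothetical, and the match pattern is an assumption based only on the clurgmgrd message format shown in this mail.

```shell
#!/bin/sh
# service_started: hypothetical helper. Reads syslog lines on stdin and
# succeeds if clurgmgrd logged that the given service was started.
# The pattern is assumed from the log excerpt in this thread.
service_started() {
    grep -q "clurgmgrd.*Service service:$1 started"
}

# log_sample: the four log lines quoted in this mail, used as test input.
log_sample() {
cat <<'EOF'
Jun  3 13:56:58 clu01 clurgmgrd[14899]: <notice> Starting disabled service service:RHTTPD
Jun  3 13:56:59 clu01 in.rdiscd[15391]: setsockopt (IP_ADD_MEMBERSHIP): Address already in use
Jun  3 13:56:59 clu01 in.rdiscd[15391]: Failed joining addresses
Jun  3 13:57:00 clu01 clurgmgrd[14899]: <notice> Service service:RHTTPD started
EOF
}

if log_sample | service_started RHTTPD; then
    echo "RHTTPD started"
else
    echo "RHTTPD not started"
fi
```

On a real node the same function could read /var/log/messages instead of the embedded sample, e.g. service_started RHTTPD < /var/log/messages.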

Bye.

Ing. Stefano Elmopi
Gruppo Darco - Area ICT Sistemi
Via Ostiense 131/L Corpo B, 00154 Roma

cell. 3466147165
tel.  0657060500
email:ste...@so...

On 01 Jun 2009, at 11:40, Stefano Elmopi wrote:

>
>
> Hi Mark,
>
> my cluster.conf is:
>
> <?xml version="1.0"?>
> <cluster config_version="5" name="cluOCFS2" type="ocfs2">
>
>   <cman expected_votes="1" two_node="1"/>
>
>     <clusternodes>
>
>        <clusternode name="clu01" votes="1" nodeid="1">
>            <com_info>
>              <syslog name="clu01"/>
>              <rootvolume name="/dev/sda2" fstype="ocfs2"/>
>              <eth name="eth0" ip="10.43.100.203" mac="00:15:60:56:75:FD"/>
>              <fenceackserver user="root" passwd="test123"/>
>            </com_info>
>        </clusternode>
>
>        <clusternode name="clu02" votes="1" nodeid="2">
>            <com_info>
>              <syslog name="clu01"/>
>              <rootvolume name="/dev/sda2" fstype="ocfs2"/>
>              <eth name="eth0" ip="10.43.105.15" mac="00:15:60:56:77:11"/>
>              <fenceackserver user="root" passwd="test123"/>
>            </com_info>
>         </clusternode>
>
>        <rm log_level="7" log_facility="local4">
>                 <failoverdomains>
>                         <failoverdomain name="failover" ordered="1">
>                                 <failoverdomainnode name="clu01" priority="1"/>
>                                 <failoverdomainnode name="clu02" priority="2"/>
>                         </failoverdomain>
>                  </failoverdomains>
>                  <resources>
>                         <ip address="10.43.100.204" monitor_link="1"/>
>                         <script file="/etc/init.d/httpd" name="rhttpd"/>
>                  </resources>
>                 <service autostart="0" domain="failover" name="RHTTPD">
>                         <ip ref="10.43.100.204"/>
>                         <script ref="rhttpd"/>
>                 </service>
>        </rm>
>
>      </clusternodes>
>
> </cluster>
>
> and I added the line from your email:
>
> local4.debug      /var/log/rgmanager.log
>
> to /etc/syslog.conf, then restarted syslog. However, rgmanager.log only
> receives entries when CMAN starts, while rgmanager messages go only to
> /var/log/messages, with no additional information.
> Perhaps additional information can come from this tool, I hope:
>
> rg_test test /etc/cluster/cluster.conf start service RHTTPD
> Running in test mode.
> Starting RHTTPD...
> /usr/share/cluster/ip.sh: line 583: /dev/fd/62: No such file or directory
> /usr/share/cluster/ip.sh: line 673: /dev/fd/62: No such file or directory
> Failed to start RHTTPD
> /usr/share/cluster/ip.sh: line 583: /dev/fd/61: No such file or directory
> +++ Memory table dump +++
>   0xb77306e4 (8 bytes) allocation trace:
>   0xb7734e74 (8 bytes) allocation trace:
>   0xb774aa6c (16 bytes) allocation trace:
>   0xb774b8d0 (16 bytes) allocation trace:
>   0xb77357f0 (16 bytes) allocation trace:
>   0xb774a9f4 (52 bytes) allocation trace:
>   0xb7741194 (912 bytes) allocation trace:
> --- End Memory table dump ---
>
>
>
>
> Bye
>
>
>
> On 28 May 2009, at 21:00, Marc Grimme wrote:
>
>> On Thursday 28 May 2009 17:14:49 Stefano Elmopi wrote:
>>> Hi Mark,
>>>
>>> I have changed the service element from:
>>>
>>> <service autostart="0" domain="failover" name="RHTTPD">
>>>    <ip ref="10.43.100.204"/>
>>>    <script ref="/etc/init.d/httpd"/>
>>> </service>
>>>
>>> to:
>>>
>>> <service autostart="0" domain="failover" name="RHTTPD">
>>>    <ip ref="10.43.100.204"/>
>>>    <script ref="httpd"/>
>>> </service>
>>>
>>> but the result does not change: if I type clusvcadm -e RHTTPD the
>>> service fails and the messages log shows:
>>>
>>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Starting disabled service service:RHTTPD
>>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> start on ip "10.43.100.204" returned 1 (generic error)
>> Hmm, you could extend logging by catching debug messages from rgmanager.
>> Add the line
>> local4.debug      /var/log/rgmanager.log
>> to /etc/syslog.conf, then restart syslog.
>> See if you can get more information from this file.
>>
>>> May 28 15:09:00 clu01 clurgmgrd[15046]: <warning> #68: Failed to start service:RHTTPD; return value: 1
>>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Stopping service service:RHTTPD
>>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Service service:RHTTPD is recovering
>>> May 28 15:09:00 clu01 clurgmgrd[15046]: <warning> #71: Relocating failed service service:RHTTPD
>>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Service service:RHTTPD is stopped
>>>
>>> One question: once rgmanager has started, shouldn't I be able to
>>> ping the IP address 10.43.100.204?
>>>
>>> The output of the rg_test tool is:
>>>
>>> [root@clu01 ~]# rg_test test /etc/cluster/cluster.conf
>>> Running in test mode.
>>> Loaded 22 resource rules
>>> === Resources List ===
>>> Resource type: script
>>> Agent: script.sh
>>> Attributes:
>>>   name = httpd [ primary unique ]
>>>   file = /etc/init.d/httpd [ unique required ]
>>>   service_name [ inherit("service%name") ]
>>>
>>> Resource type: ip
>>> Instances: 1/1
>>> Agent: ip.sh
>>> Attributes:
>>>   address = 10.43.100.204 [ primary unique ]
>>>   monitor_link = 1
>>>   nfslock [ inherit("service%nfslock") ]
>>>
>>> Resource type: service [INLINE]
>>> Instances: 1/1
>>> Agent: service.sh
>>> Attributes:
>>>   name = RHTTPD [ primary unique required ]
>>>   domain = failover [ reconfig ]
>>>   autostart = 0 [ reconfig ]
>>>   hardrecovery = 0 [ reconfig ]
>>>   exclusive = 0 [ reconfig ]
>>>   nfslock = 0
>>>   recovery = restart [ reconfig ]
>>>   depend_mode = hard
>>>   max_restarts = 0
>>>   restart_expire_time = 0
>>>
>>> === Resource Tree ===
>>> service {
>>>   name = "RHTTPD";
>>>   domain = "failover";
>>>   autostart = "0";
>>>   hardrecovery = "0";
>>>   exclusive = "0";
>>>   nfslock = "0";
>>>   recovery = "restart";
>>>   depend_mode = "hard";
>>>   max_restarts = "0";
>>>   restart_expire_time = "0";
>>>   ip {
>>>     address = "10.43.100.204";
>>>     monitor_link = "1";
>>>     nfslock = "0";
>>>   }
>>>   script {
>>>     name = "httpd";
>>>     file = "/etc/init.d/httpd";
>>>     service_name = "RHTTPD";
>>>   }
>>> }
>>> === Failover Domains ===
>>> Failover domain: failover
>>> Flags: Ordered
>>>   Node clu01 (id 1, priority 1)
>>>   Node clu02 (id 2, priority 2)
>>> === Event Triggers ===
>>> Event Priority Level 100:
>>>   Name: Default
>>>     (Any event)
>>>     File: /usr/share/cluster/default_event_script.sl
>>> +++ Memory table dump +++
>>>   0xb77756e4 (8 bytes) allocation trace:
>>>   0xb7779e74 (8 bytes) allocation trace:
>>>   0xb778fce4 (52 bytes) allocation trace:
>>> --- End Memory table dump ---
>>>
>>>
>>> if I add the line:
>>>
>>> <eth name="eth1" ip="10.43.100.204" mac="00:15:60:56:75:FC"/>
>>>
>>> to the <com_info> section of clu01, the service starts:
>>>
>>> /etc/init.d/rgmanager start
>>> Starting Cluster Service Manager:                          [  OK  ]
>>>
>>> the log is:
>>>
>>> May 28 16:59:21 clu01 kernel: dlm: Using TCP for communications
>>> May 28 16:59:30 clu01 clurgmgrd[15209]: <notice> Resource Group Manager Starting
>>> May 28 16:59:31 clu01 clurgmgrd: [15209]: <err> Failed to remove 10.43.100.204
>>> May 28 16:59:31 clu01 clurgmgrd[15209]: <notice> stop on ip "10.43.100.204" returned 1 (generic error)
>> That's clear. This IP is already set up by the boot process, so it
>> cannot be set up again.
>>>
>>> clustat
>>> Cluster Status for cluOCFS2 @ Thu May 28 17:00:22 2009
>>> Member Status: Quorate
>>>
>>>  Member Name                        ID   Status
>>>  ------ ----                        ---- ------
>>>  clu01                                 1 Online, Local, rgmanager
>>>  clu02                                 2 Offline
>>>
>>>  Service Name          Owner (Last)          State
>>>  ------- ----          ----- ------          -----
>>>  service:RHTTPD        (none)                disabled
>>>
>>>  and:
>>>
>>>  clusvcadm -e RHTTPD
>>> Local machine trying to enable service:RHTTPD...Success
>>> service:RHTTPD is now running on clu01
>>>
>>> but in this case the service does not relocate with the same IP!
>>>
>>>
>>>
>>> Bye
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> -- 
>> Gruss / Regards,
>>
>> Marc Grimme
>> http://www.atix.de/               http://www.open-sharedroot.org/
>>
>