From: Mark H. <hla...@at...> - 2009-06-04 10:11:15
Hi Stefano,

your changes are breaking the logic of the ip.sh resource agent.
For example, your changes:

    #CHANGED !!!
    /sbin/ip -o -f inet addr | awk '{print $1,$2,$4}' | while read idx dev ifaddr; do
        isSlave $dev
        if [ $? -ne 2 ]; then
            continue
        fi
        idx=${idx/:/}
        echo $dev ${ifaddr/\/*/} ${ifaddr/*\//}
    #done < <(/sbin/ip -o -f inet addr | awk '{print $1,$2,$4}')
    done

In the while loop, the redirection operator < <(cmd) provides the stdin
for the read command.
Please note that the redirection requires the /dev/fd/XX files
(see my previous mail).

To verify the redirection mechanism, try something like this:

    # cat < <(ls -l /etc/)

-Mark

On Wednesday 03 June 2009 14:48:59 Stefano Elmopi wrote:
> Hi Mark,
>
> I changed two lines of the script /usr/share/cluster/ip.sh.
> I have attached the script; the lines that I changed are
> immediately below the word CHANGED.
> Now the httpd service starts on the new IP address (10.43.100.204), and
> if nodo_1 goes down, the service is relocated to nodo_2.
> When I start the service, I see this in the log messages:
>
> Jun 3 13:56:58 clu01 clurgmgrd[14899]: <notice> Starting disabled service service:RHTTPD
> Jun 3 13:56:59 clu01 in.rdiscd[15391]: setsockopt (IP_ADD_MEMBERSHIP): Address already in use
> Jun 3 13:56:59 clu01 in.rdiscd[15391]: Failed joining addresses
> Jun 3 13:57:00 clu01 clurgmgrd[14899]: <notice> Service service:RHTTPD started
>
> but despite this, the httpd service works.
> I hope the information I am sending you is useful.
>
> Bye.
>
> Ing. Stefano Elmopi
> Gruppo Darco - Area ICT Sistemi
> Via Ostiense 131/L Corpo B, 00154 Roma
>
> cell. 3466147165
> tel. 0657060500
> email:ste...@so...
>
> On 01 Jun 2009, at 11:40, Stefano Elmopi wrote:
> > Hi Mark,
> >
> > my cluster.conf is:
> >
> > <?xml version="1.0"?>
> > <cluster config_version="5" name="cluOCFS2" type="ocfs2">
> >
> >   <cman expected_votes="1" two_node="1"/>
> >
> >   <clusternodes>
> >
> >     <clusternode name="clu01" votes="1" nodeid="1">
> >       <com_info>
> >         <syslog name="clu01"/>
> >         <rootvolume name="/dev/sda2" fstype="ocfs2"/>
> >         <eth name="eth0" ip="10.43.100.203" mac="00:15:60:56:75:FD"/>
> >         <fenceackserver user="root" passwd="test123"/>
> >       </com_info>
> >     </clusternode>
> >
> >     <clusternode name="clu02" votes="1" nodeid="2">
> >       <com_info>
> >         <syslog name="clu01"/>
> >         <rootvolume name="/dev/sda2" fstype="ocfs2"/>
> >         <eth name="eth0" ip="10.43.105.15" mac="00:15:60:56:77:11"/>
> >         <fenceackserver user="root" passwd="test123"/>
> >       </com_info>
> >     </clusternode>
> >
> >     <rm log_level="7" log_facility="local4">
> >       <failoverdomains>
> >         <failoverdomain name="failover" ordered="1">
> >           <failoverdomainnode name="clu01" priority="1"/>
> >           <failoverdomainnode name="clu02" priority="2"/>
> >         </failoverdomain>
> >       </failoverdomains>
> >       <resources>
> >         <ip address="10.43.100.204" monitor_link="1"/>
> >         <script file="/etc/init.d/httpd" name="rhttpd"/>
> >       </resources>
> >       <service autostart="0" domain="failover" name="RHTTPD">
> >         <ip ref="10.43.100.204"/>
> >         <script ref="rhttpd"/>
> >       </service>
> >     </rm>
> >
> >   </clusternodes>
> >
> > </cluster>
> >
> > and I added the line from your email:
> >
> > local4.debug /var/log/rgmanager.log
> >
> > to /etc/syslog.conf.
> >
> > Then I restarted syslog, but rgmanager.log only receives entries
> > when CMAN starts, while rgmanager logs only to /var/log/messages,
> > and there is no additional information there.
> > Perhaps additional information can come from this tool, I hope:
> >
> > rg_test test /etc/cluster/cluster.conf start service RHTTPD
> > Running in test mode.
> > Starting RHTTPD...
> > /usr/share/cluster/ip.sh: line 583: /dev/fd/62: No such file or directory
> > /usr/share/cluster/ip.sh: line 673: /dev/fd/62: No such file or directory
> > Failed to start RHTTPD
> > /usr/share/cluster/ip.sh: line 583: /dev/fd/61: No such file or directory
> > +++ Memory table dump +++
> > 0xb77306e4 (8 bytes) allocation trace:
> > 0xb7734e74 (8 bytes) allocation trace:
> > 0xb774aa6c (16 bytes) allocation trace:
> > 0xb774b8d0 (16 bytes) allocation trace:
> > 0xb77357f0 (16 bytes) allocation trace:
> > 0xb774a9f4 (52 bytes) allocation trace:
> > 0xb7741194 (912 bytes) allocation trace:
> > --- End Memory table dump ---
> >
> > Bye
> >
> > Ing. Stefano Elmopi
> > Gruppo Darco - Area ICT Sistemi
> > Via Ostiense 131/L Corpo B, 00154 Roma
> >
> > cell. 3466147165
> > tel. 0657060500
> > email:ste...@so...
> >
> > On 28 May 2009, at 21:00, Marc Grimme wrote:
> >> On Thursday 28 May 2009 17:14:49 Stefano Elmopi wrote:
> >>> Hi Mark,
> >>>
> >>> I have changed the service element from:
> >>>
> >>> <service autostart="0" domain="failover" name="RHTTPD">
> >>>   <ip ref="10.43.100.204"/>
> >>>   <script ref="/etc/init.d/httpd"/>
> >>> </service>
> >>>
> >>> to:
> >>>
> >>> <service autostart="0" domain="failover" name="RHTTPD">
> >>>   <ip ref="10.43.100.204"/>
> >>>   <script ref="httpd"/>
> >>> </service>
> >>>
> >>> but the result does not change: if I type clusvcadm -e RHTTPD the
> >>> service fails, and the messages log shows:
> >>>
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Starting disabled service service:RHTTPD
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> start on ip "10.43.100.204" returned 1 (generic error)
> >>
> >> Hmm, you could extend logging by catching debug messages from
> >> rgmanager by adding the line
> >> local4.debug /var/log/rgmanager.log
> >> to /etc/syslog.conf, then restarting syslog.
> >> See if you can get more information from this file.
> >>
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <warning> #68: Failed to start service:RHTTPD; return value: 1
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Stopping service service:RHTTPD
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Service service:RHTTPD is recovering
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <warning> #71: Relocating failed service service:RHTTPD
> >>> May 28 15:09:00 clu01 clurgmgrd[15046]: <notice> Service service:RHTTPD is stopped
> >>>
> >>> One question: when rgmanager starts, shouldn't I be able to ping
> >>> the IP address 10.43.100.204?
> >>>
> >>> The result of the rg_test tool is:
> >>>
> >>> [root@clu01 ~]# rg_test test /etc/cluster/cluster.conf
> >>> Running in test mode.
> >>> Loaded 22 resource rules
> >>> === Resources List ===
> >>> Resource type: script
> >>> Agent: script.sh
> >>> Attributes:
> >>>   name = httpd [ primary unique ]
> >>>   file = /etc/init.d/httpd [ unique required ]
> >>>   service_name [ inherit("service%name") ]
> >>>
> >>> Resource type: ip
> >>> Instances: 1/1
> >>> Agent: ip.sh
> >>> Attributes:
> >>>   address = 10.43.100.204 [ primary unique ]
> >>>   monitor_link = 1
> >>>   nfslock [ inherit("service%nfslock") ]
> >>>
> >>> Resource type: service [INLINE]
> >>> Instances: 1/1
> >>> Agent: service.sh
> >>> Attributes:
> >>>   name = RHTTPD [ primary unique required ]
> >>>   domain = failover [ reconfig ]
> >>>   autostart = 0 [ reconfig ]
> >>>   hardrecovery = 0 [ reconfig ]
> >>>   exclusive = 0 [ reconfig ]
> >>>   nfslock = 0
> >>>   recovery = restart [ reconfig ]
> >>>   depend_mode = hard
> >>>   max_restarts = 0
> >>>   restart_expire_time = 0
> >>>
> >>> === Resource Tree ===
> >>> service {
> >>>   name = "RHTTPD";
> >>>   domain = "failover";
> >>>   autostart = "0";
> >>>   hardrecovery = "0";
> >>>   exclusive = "0";
> >>>   nfslock = "0";
> >>>   recovery = "restart";
> >>>   depend_mode = "hard";
> >>>   max_restarts = "0";
> >>>   restart_expire_time = "0";
> >>>   ip {
> >>>     address = "10.43.100.204";
> >>>     monitor_link = "1";
> >>>     nfslock = "0";
> >>>   }
> >>>   script {
> >>>     name = "httpd";
> >>>     file = "/etc/init.d/httpd";
> >>>     service_name = "RHTTPD";
> >>>   }
> >>> }
> >>> === Failover Domains ===
> >>> Failover domain: failover
> >>> Flags: Ordered
> >>>   Node clu01 (id 1, priority 1)
> >>>   Node clu02 (id 2, priority 2)
> >>> === Event Triggers ===
> >>> Event Priority Level 100:
> >>>   Name: Default
> >>>     (Any event)
> >>>     File: /usr/share/cluster/default_event_script.sl
> >>> +++ Memory table dump +++
> >>> 0xb77756e4 (8 bytes) allocation trace:
> >>> 0xb7779e74 (8 bytes) allocation trace:
> >>> 0xb778fce4 (52 bytes) allocation trace:
> >>> --- End Memory table dump ---
> >>>
> >>> If I add the line:
> >>>
> >>> <eth name="eth1" ip="10.43.100.204" mac="00:15:60:56:75:FC"/>
> >>>
> >>> to the <com_info> section of clu01, the service starts:
> >>>
> >>> /etc/init.d/rgmanager start
> >>> Starting Cluster Service Manager: [ OK ]
> >>>
> >>> The log is:
> >>>
> >>> May 28 16:59:21 clu01 kernel: dlm: Using TCP for communications
> >>> May 28 16:59:30 clu01 clurgmgrd[15209]: <notice> Resource Group Manager Starting
> >>> May 28 16:59:31 clu01 clurgmgrd: [15209]: <err> Failed to remove 10.43.100.204
> >>> May 28 16:59:31 clu01 clurgmgrd[15209]: <notice> stop on ip "10.43.100.204" returned 1 (generic error)
> >>
> >> That's clear. This IP is already set up by the boot process, so it
> >> cannot be set up.
> >>
> >>> clustat
> >>> Cluster Status for cluOCFS2 @ Thu May 28 17:00:22 2009
> >>> Member Status: Quorate
> >>>
> >>>  Member Name        ID   Status
> >>>  ------ ----        ---- ------
> >>>  clu01              1    Online, Local, rgmanager
> >>>  clu02              2    Offline
> >>>
> >>>  Service Name       Owner (Last)   State
> >>>  ------- ----       ----- ------   -----
> >>>  service:RHTTPD     (none)         disabled
> >>>
> >>> and:
> >>>
> >>> clusvcadm -e RHTTPD
> >>> Local machine trying to enable service:RHTTPD...Success
> >>> service:RHTTPD is now running on clu01
> >>>
> >>> but in this case the service does not relocate with the same ip !!
> >>>
> >>> Bye
> >>>
> >>> Ing. Stefano Elmopi
> >>> Gruppo Darco - Area ICT Sistemi
> >>> Via Ostiense 131/L Corpo B, 00154 Roma
> >>>
> >>> cell. 3466147165
> >>> tel. 0657060500
> >>> email:ste...@so...
> >>
> >> --
> >> Gruss / Regards,
> >>
> >> Marc Grimme
> >> http://www.atix.de/ http://www.open-sharedroot.org/
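
The two loop forms discussed at the top of this mail (piping into the while loop versus feeding it with < <(cmd)) differ in one well-known way in bash, which may or may not be the exact breakage meant here: the body of a piped loop runs in a subshell, so variable assignments made inside it are lost when the loop ends. The sketch below illustrates only that general behaviour. The list_addrs function and the two counter variables are hypothetical stand-ins and are not taken from ip.sh.

```shell
#!/bin/bash
# Illustrative sketch only; list_addrs and the counters are hypothetical,
# not part of ip.sh.

# Emits lines shaped like the output of
# `/sbin/ip -o -f inet addr | awk '{print $1,$2,$4}'`.
list_addrs() {
    printf '%s\n' '1: lo 127.0.0.1/8' '2: eth0 10.43.100.203/24'
}

# Process substitution relies on /dev/fd entries on systems like the one
# in this thread, so warn early if they are missing.
if [ ! -e /dev/fd/0 ]; then
    echo "warning: /dev/fd is missing; < <(cmd) will likely fail" >&2
fi

# Variant 1: pipeline. With bash defaults the loop body runs in a
# subshell, so the increments are not visible after the loop.
count_piped=0
list_addrs | while read -r idx dev ifaddr; do
    count_piped=$((count_piped + 1))
done

# Variant 2: process substitution. The loop runs in the current shell,
# so the counter keeps its value after the loop.
count_procsub=0
while read -r idx dev ifaddr; do
    count_procsub=$((count_procsub + 1))
done < <(list_addrs)

echo "piped=$count_piped procsub=$count_procsub"
```

With bash's default settings this prints piped=0 procsub=2. The same scoping applies to return and exit inside the loop body: in the piped form they only leave the subshell, not the enclosing function.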