From: SourceForge.net <no...@so...> - 2012-05-18 02:17:44
Bugs item #3527139, was opened at 2012-05-15 19:34
Message generated for change (Settings changed) made by hanyf
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1006945&aid=3527139&group_id=208749

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
>Category: HA-SN
Group: 2.7.2
Status: Open
Resolution: None
>Priority: 8
Private: No
Submitted By: yan feng han (hanyf)
Assigned to: Norm Nott (nott)
Summary: HASN:AIX multi-path changes can't be saved after rebooting

Initial Comment:
The cluster is a P7 IH with 12 drawers; the EMS is c250mgrs27-pvt (frame 12). It is an AIX 71D SP3 cluster, and we set up HA SN on it. xCAT is the 2.7.2 05/04 build:

[c250mgrs27-pvt][/]> rpm -qa|grep -i xcat
perl-xCAT-2.7.2-snap201205030303
openslp-xcat-1.2.1-1
xCAT-dfm-2.7.0-13
xCAT-IBMhpc-2.7.2-snap201205030304
xCAT-2.7.2-snap201205030304
xCAT-client-2.7.2-snap201205030303
xCAT-rmc-2.7.2-snap201205011649
xCAT-server-2.7.2-snap201205030303

[c250mgrs27-pvt][/]> xdsh service "rpm -qa|grep -i xcat"|xcoll
====================================
service
====================================
openslp-xcat-1.2.1-1
xCATsn-2.7.2-snap201205030304
xCAT-rmc-2.7.2-snap201205011649
perl-xCAT-2.7.2-snap201205030303
xCAT-client-2.7.2-snap201205030303
xCAT-server-2.7.2-snap201205030303

The SNs are c250f12c10ap01 and c250f12c12ap01. We split the CNs into two groups according to the primary SN they use:

SN10group: primary SN is c250f12c10ap01, backup is c250f12c12ap01
SN12group: primary SN is c250f12c12ap01, backup is c250f12c10ap01

For the two storage nodes, the primary SN is c250f12c12ap01 and the backup is c250f12c10ap01:

[c250mgrs27-pvt][/]> lsdef -l storage
Object name: c250f12c04ap29-hf0
    cons=fsp
    conserver=10.0.0.137
    groups=storage,lpar,all,mc04,SN12storage
    hcp=f12cec04
    hidden=0
    hwtype=lpar
    id=29
    mac=0200001f0004|0200001f0005|0200001f0006
    mgt=fsp
    monserver=c250f12c10ap01,c250f12c10ap01-hf0
    nodetype=ppc,osi
    os=AIX
    parent=f12cec04
    postbootscripts=otherpkgs
    postscripts=syslog,aixremoteshell,syncfiles,configrmcnode,setupntp,confighfi,percs_basic_set_up,add_sec_ids,add_sec_groups,add_sec_ids.CR,paging_on_HASN,setupnfsv4replication
    profile=71Ddskls_CSP5_1_IO
    provmethod=71Ddskls_CSP5_1_IO
    servicenode=c250f12c12ap01,c250f12c10ap01
    status=booted
    statustime=05-15-2012 13:35:07
    xcatmaster=20.12.12.1
Object name: c250f12c06ap29-hf0
    cons=fsp
    conserver=10.0.0.137
    groups=storage,lpar,all,mc04,SN12storage
    hcp=f12cec06
    hidden=0
    hwtype=lpar
    id=29
    mac=0200002f0004|0200002f0005|0200002f0006
    mgt=fsp
    monserver=c250f12c10ap01,c250f12c10ap01-hf0
    nodetype=ppc,osi
    os=AIX
    parent=f12cec06
    postbootscripts=otherpkgs
    postscripts=syslog,aixremoteshell,syncfiles,configrmcnode,setupntp,confighfi,percs_basic_set_up,add_sec_ids,add_sec_groups,add_sec_ids.CR,paging_on_HASN,setupnfsv4replication
    profile=71Ddskls_CSP5_1_IO
    provmethod=71Ddskls_CSP5_1_IO
    servicenode=c250f12c12ap01,c250f12c10ap01
    status=booted
    statustime=05-15-2012 06:41:02
    xcatmaster=20.12.12.1

The litefile and statelite tables:

[c250mgrs27-pvt][/]> tabdump litefile
#image,file,options,comments,disable
"ALL","/etc/microcode/","rw",,
"ALL","/gpfslog/","persistent",,
"ALL","/var/adm/ras/gpfslog/","persistent",,
"ALL","/var/adm/ras/errlog","persistent",,
"ALL","/var/mmfs/","persistent",,
"ALL","/var/spool/cron/","persistent",,
"GOLD_71Ddskls_SP3_IO","/etc/basecust","persistent",,
"71Ddskls_SP41_IO","/etc/basecust","persistent",,
"71Ddskls_SP41_IO_1","/etc/basecust","persistent",,
"71Ddskls_CSP5_1_IO","/etc/basecust","persistent",,

[c250mgrs27-pvt][/]> tabdump statelite|grep storage
"storage",,"$noderes.xcatmaster:/install/statelite_data","vers=4",,

The problem: after I booted the storage nodes and ran "updatenode storage disableMP_gpfs", AIX multi-path was disabled (lspv showed only the local hdisks). But after I rebooted the storage nodes, AIX multi-path was enabled again (lspv showed all the hdisks).

[c250mgrs27-pvt][/]> cat /install/postscripts/disableMP_gpfs
/usr/sbin/lsdev -t 001072001410ea0 -F name | xargs -n1 rmdev -Rdl
/usr/bin/manage_disk_drivers -d SAS_SCSD -o AIX_non_MPIO
/usr/sbin/cfgmgr

On 03/23 Norm worked on the issue where AIX multi-path could not be disabled. At that time the rc.dd_boot in the osimage was not correct, so I compared the rc.dd_boot of 71Ddskls_CSP5_1_IO with that of GOLD_71Ddskls_SP41_IO, which had kept the ODM correctly after multi-path was disabled. The two files are the same:

[c250mgrs27-pvt][/]> ls -l /install/nim/spot/GOLD_71Ddskls_SP41_IO/usr/lib/boot/network/rc.dd_boot
-r-xr-xr-x 1 root system 21277 Apr 23 13:17 /install/nim/spot/GOLD_71Ddskls_SP41_IO/usr/lib/boot/network/rc.dd_boot
[c250mgrs27-pvt][/]> ls -l /install/nim/spot/71Ddskls_CSP5_1_IO/usr/lib/boot/network/rc.dd_boot
-r-xr-xr-x 1 root system 21277 May 15 01:53 /install/nim/spot/71Ddskls_CSP5_1_IO/usr/lib/boot/network/rc.dd_boot
[c250mgrs27-pvt][/]> sum install/nim/spot/GOLD_71Ddskls_SP41_IO/usr/lib/boot/network/rc.dd_boot
50629 21 install/nim/spot/GOLD_71Ddskls_SP41_IO/usr/lib/boot/network/rc.dd_boot
[c250mgrs27-pvt][/]> sum /install/nim/spot/71Ddskls_CSP5_1_IO/usr/lib/boot/network/rc.dd_boot
50629 21 /install/nim/spot/71Ddskls_CSP5_1_IO/usr/lib/boot/network/rc.dd_boot

So it may be related to HA SN? Please check it.
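
For anyone reproducing this, the before/after lspv comparison described above can be made repeatable with a small helper. This is only a sketch under stated assumptions: the function name reappeared_disks and both file arguments are hypothetical, each file is assumed to hold the saved lspv output of a single node (plain, or with an xdsh-style "node: " prefix), and the captures must be stored somewhere that survives the node reboot, for example on the EMS.

```shell
#!/bin/sh
# Sketch only: reappeared_disks and its two file arguments are hypothetical names.
# Each file holds saved `lspv` output for ONE node, optionally prefixed by
# "node: " as xdsh does. The disk name is the first field of the form hdiskN.
# Prints every disk found in the "after" capture but not in the "before" one.
reappeared_disks() {
    before_file=$1
    after_file=$2
    awk '
        {
            # Locate the hdiskN field; skip lines that do not contain one.
            name = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^hdisk[0-9]+$/) { name = $i; break }
            }
            if (name == "") next
        }
        # First file: remember the disks that were visible before the reboot.
        FILENAME == ARGV[1] { seen[name] = 1; next }
        # Second file: report disks that were not visible before.
        !(name in seen)     { print name }
    ' "$before_file" "$after_file"
}
```

With one capture taken right after disableMP_gpfs has run and a second taken after the reboot, an empty result would mean the multi-path change persisted; any hdisk names printed are the ones that came back.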