#2638 updatenode servicenode make cfgloc lost on sn

2.7
closed
5
2012-09-19
2012-03-01
BaiYuan
No

After I run updatenode p7hv16s32p15 servicenode -V, there is no cfgloc file in /etc/xcat/ on the SN.
I manually created a cfgloc file in /etc/xcat/ on the SN and then ran updatenode p7hv16s32p15 servicenode -V on the MN again; the cfgloc file on the SN was lost again.
[root@p7hv16s32p03 postscripts]# lsxcatd -v
Version 2.7 (svn r, built Wed Feb 22 01:35:37 EST 2012)

The workaround on my MN is to change this code in xcatserver:
if [ $? -ne 0 ]; then
  sed s/host=[^\|]*/host=$MASTER/ /etc/xcat/cfgloc > /etc/xcat/cfgloc.new
  mv /etc/xcat/cfgloc.new /etc/xcat/cfgloc
else
  mv /etc/xcat/cfgloc /etc/xcat/cfgloc.db2
fi

to:

if [ $? -ne 0 ]; then
  sed s/host=[^\|]*/host=$MASTER/ /etc/xcat/cfgloc > /etc/xcat/cfgloc.new
  mv /etc/xcat/cfgloc.new /etc/xcat/cfgloc
else
  cp /etc/xcat/cfgloc /etc/xcat/cfgloc.db2
fi

If you want to look into it, you can log in to my MN at 9.114.34.125.

Discussion

  • Guang Cheng Li

    Guang Cheng Li - 2012-03-01

    I worked with Bai Yuan to debug this problem. I found that the servicenode postscript calls "xcatserver -d", and the following code in xcatserver deletes /etc/xcat/cfgloc:

    getcredentials.awk xcat_cfgloc | grep -v '<'|sed -e 's/<//' -e 's/&/&/' -e 's/&quot/"/' -e "s/'/'/" > /etc/xcat/cfgloc
    # if not DB2
    grep "DB2" /etc/xcat/cfgloc 2>&1 1> /dev/null
    if [ $? -ne 0 ]; then
      sed s/host=[^\|]*/host=$MASTER/ /etc/xcat/cfgloc > /etc/xcat/cfgloc.new
      mv /etc/xcat/cfgloc.new /etc/xcat/cfgloc
    else
      mv /etc/xcat/cfgloc /etc/xcat/cfgloc.db2
    fi
    chmod 600 /etc/xcat/cfgloc

    In a DB2 environment, the line "mv /etc/xcat/cfgloc /etc/xcat/cfgloc.db2" moves the file to cfgloc.db2. The only two places that use cfgloc.db2 are the postscripts db2install and db2sqlsetup; however, db2install and db2sqlsetup were run before the servicenode postscript.

    Bai Yuan will open a bug to address this problem.

    A straightforward way to fix this problem is to change "mv /etc/xcat/cfgloc /etc/xcat/cfgloc.db2" to "cp /etc/xcat/cfgloc /etc/xcat/cfgloc.db2". I do not know if this is what we want.
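
The difference between the two commands can be reproduced outside of xCAT. The following is a minimal sketch, not the xcatserver code itself: it works in a scratch directory instead of /etc/xcat, the cfgloc content is a made-up placeholder, and the helper name db2_branch is hypothetical.

```shell
#!/bin/sh
# Sketch only: reproduces the mv-vs-cp difference in a scratch directory
# instead of /etc/xcat. The cfgloc content below is a made-up placeholder.
demo_dir=$(mktemp -d)

# Hypothetical helper: runs the DB2 branch with the given command (mv or cp)
# and reports whether cfgloc is still there afterwards.
db2_branch() {
    cmd=$1
    echo "DB2:placeholder" > "$demo_dir/cfgloc"
    rm -f "$demo_dir/cfgloc.db2"
    if grep "DB2" "$demo_dir/cfgloc" > /dev/null 2>&1; then
        "$cmd" "$demo_dir/cfgloc" "$demo_dir/cfgloc.db2"
    fi
    if [ -f "$demo_dir/cfgloc" ]; then
        echo "$cmd: cfgloc still present"
    else
        echo "$cmd: cfgloc is gone"
    fi
}

db2_branch mv   # prints "mv: cfgloc is gone"
db2_branch cp   # prints "cp: cfgloc still present"
```

With mv, nothing is left at the cfgloc path unless a later postscript copies a new file down; with cp, both files exist afterwards. The scratch directory in $demo_dir can be removed once the sketch has been run.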

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    Full installation should not fail if the postscripts are set up correctly for the servicenode. db2install is run after servicenode. You are correct, you cannot just run the servicenode postscript by itself in the DB2 environment. The db2install script must run after it, and it will copy down a new cfgloc file to replace it. The idea was that we could pick up any cfgloc file changes at this time. There is a change password command in db2sqlsetup.
    The problem with not moving it is that during install, if it is copied down, we have not set up DB2 yet and everything will fail. We cannot use cfgloc until after db2install is run and the database is set up, so cfgloc cannot exist.

    I can see one option: you could add a check for the UPDATENODE=1 env variable, which tells us that it is updatenode and not install. Then I think we could do a copy in that case and servicenode might work. The only thing is that if there has not been a successful install and installation of DB2, and they are just trying to use servicenode to fix that up, it won't work.
    If you change the logic, please test the following, because this was a difficult setup of postscripts to make work:
    Do a new install of a Servicenode on AIX and Linux with DB2.
    Run your updatenode -P servicenode
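
The UPDATENODE option described in this comment could look roughly like the sketch below. This is an illustration of the idea only, not the change that was later checked in. The function name handle_db2_cfgloc and the CFGLOC variable are hypothetical; CFGLOC exists only so the logic can be tried against a scratch file instead of /etc/xcat/cfgloc.

```shell
#!/bin/sh
# Sketch of the suggested UPDATENODE check; not the checked-in fix.
# CFGLOC is a hypothetical variable so the logic can be tried outside /etc/xcat.
CFGLOC=${CFGLOC:-/etc/xcat/cfgloc}

# Hypothetical replacement for the DB2 ("else") branch in xcatserver.
handle_db2_cfgloc() {
    if [ "$UPDATENODE" = "1" ]; then
        # updatenode run: DB2 is assumed to be installed and set up already,
        # so keep cfgloc in place and only refresh the .db2 copy.
        cp "$CFGLOC" "$CFGLOC.db2"
    else
        # install: cfgloc must not exist until db2install has set up the database.
        mv "$CFGLOC" "$CFGLOC.db2"
    fi
}
```

As the comment points out, the copy branch only helps when DB2 was installed successfully beforehand; on a node where the install never completed, leaving cfgloc in place would point xCAT at a database that does not exist yet.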

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    The question is why you would run just updatenode -P servicenode. I think the proper thing to do would be to run updatenode <SN> -P to run all the postscripts. If you did that, it would not happen.

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    Also, your postscripts are set up incorrectly for Linux. From the DB2 doc:
    node,postscripts,postbootscripts,comments,disable
    "xcatdefaults","syslog,remoteshell,syncfiles","otherpkgs",,
    "service","servicenode,xcatserver,xcatclient","db2install,odbcsetup",,

    You have
    "service","servicenode","db2install,servicenode,odbcsetup",,
    "xcatdefaults","syslog,remoteshell,syncfiles","otherpkgs",,
    "p7hv16s32p14",,,,
    "p7hv16s32p15",,"db2install,servicenode,odbcsetup,llserver.sh",,

    You have changed the order, and the order is critical. servicenode should not be in postbootscripts.
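
A check for this misconfiguration can be sketched in shell. The function below assumes the CSV layout shown above (node, postscripts, postbootscripts, comments, disable) arrives on standard input, for example from tabdump postscripts. The function name check_postbootscripts is hypothetical, and the sketch checks only this one rule, not the full ordering.

```shell
#!/bin/sh
# Sketch only: flags rows of the postscripts table whose postbootscripts
# field lists servicenode. Reads the CSV rows on stdin.
check_postbootscripts() {
    awk '
    {
        field = 1; inq = 0
        val[1] = val[2] = val[3] = ""
        # Walk the row character by character so that commas inside
        # quoted fields are not treated as field separators.
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "\"") { inq = !inq; continue }
            if (c == "," && !inq) { field++; continue }
            if (field <= 3) val[field] = val[field] c
        }
        # Skip the header row, with or without a leading "#".
        if (val[1] == "node" || val[1] ~ /^#/) next
        n = split(val[3], scripts, ",")
        for (j = 1; j <= n; j++)
            if (scripts[j] == "servicenode")
                print val[1] ": servicenode is listed in postbootscripts"
    }'
}
```

Fed the four rows quoted under "You have", this reports the service and p7hv16s32p15 entries; fed the rows from the DB2 doc, it prints nothing.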

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    One more test case once the postscripts are correct, so that it runs all the postscripts:
    updatenode <servicenode> -P

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    So I put a fix out on your server in /install/postscripts/xcatserver.lissa.
    Running updatenode p7hv16s32p15 -P servicenode worked OK.
    You need to try an install.
    I put the original back because when I ran updatenode p7hv16s32p15 -P with either the original xcatserver or my new one, I got this error:

    p7hv16s32p15: Running postscript: db2install
    Error: A fatal error was encountered, the following information may help identify a bug: Not a subroutine reference at /usr/lib64/perl5/IO/Select.pm line 105, <$pfd> line 153.
    ERROR/WARNING: communication with the xCAT server seems to have been ended prematurely

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    We can check in the xcatserver.lissa code if it installs. We need to get rid of that error though. As I said, it also occurs with the old xcatserver.

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    One other thing I fixed: notice the comment was put in the wrong field of the service node table.
    Object name: p7hv16s32p15
    arch=ppc64
    cons=hmc
    .
    .
    .
    setupipforward=Starts all services on all service nodes

    .
    .
    .

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    I also got rid of a lot of entries; every service was set, like ldap.

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    The setupipforward=Starts all services on all service nodes was causing the error:
    Error: A fatal error was encountered, the following information may help identify a bug: Not a subroutine reference at /usr/lib64/perl5/IO/Select.pm line 105, <$pfd> line 153.
    ERROR/WARNING: communication with the xCAT server seems to have been ended prematurely

    probably out of AAsn.pm

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-01

    Ran install, updatenode -P servicenode and updatenode -P
    Code seems to work
    Checked in xcatserver revision 11724 to trunk

     
  • Lissa Valletta

    Lissa Valletta - 2012-03-02

    Fixed 2.6 revision 11728

     
  • BaiYuan

    BaiYuan - 2012-03-09

    Lissa,
    I verified it on xcat2.7 svn r11798 and xcat2.6.11 svn r11798; I no longer hit it.
    I closed it.
    Thanks.

     
