
#1157 IBMhpc on AIX diskfull: Requisite Failures on PE,ESSL,HPC

v2.4.1
closed
5
2012-09-19
2010-05-28
No

The cluster is c711, xcat is 2.4.1 0518 build.

While testing HPC integration for ESSL, some pessl lpps could not be installed because of "Requisite Failures" on ppe.poe, and one compiler message lpp (xlfrte.msg.EN_US) could not be installed because it requires bos.loc.utf.EN_US.

updatenode c711f1ec03 -S installp_bundle=compilers,essl installp_flags="-agXY" -V

...etc...
c711f1ec03: FAILURES
c711f1ec03: --------
c711f1ec03: Filesets listed in this section failed pre-installation verification
c711f1ec03: and will not be installed.
c711f1ec03:
c711f1ec03: Requisite Failures
c711f1ec03: ------------------
c711f1ec03: SELECTED FILESETS: The following is a list of filesets that you asked to
c711f1ec03: install. They cannot be installed until all of their requisite filesets
c711f1ec03: are also installed. See subsequent lists for details of requisites.
c711f1ec03: NOTE: One or more fileset updates in this list are flagged with "*".
c711f1ec03: These updates supersede (i.e., replace) updates which you
c711f1ec03: selected. Newer level fileset updates will always be
c711f1ec03: automatically chosen instead of the fileset updates they
c711f1ec03: supersede when the "auto-install" option (i.e., "AUTOMATICALLY
c711f1ec03: install requisite software" or -g flag) is specified.
c711f1ec03: pessl.rte.mp 3.3.0.0 # mp - Parallel ESSL SMP Libra
...
c711f1ec03: pessl.rte.rs1 3.3.0.0 # rs1 - Parallel ESSL Serial L
...
c711f1ec03: pessl.rte.rs1 3.3.0.2 * # rs1 - Parallel ESSL Serial L
...
c711f1ec03: (supersedes: 3.3.0.1)
c711f1ec03: pessl.rte.rs2 3.3.0.0 # rs2 - Control files to link
...
c711f1ec03: pessl.rte.smp 3.3.0.0 # smp - Parallel ESSL SMP Libr
...
c711f1ec03: pessl.rte.smp 3.3.0.2 * # smp - Parallel ESSL SMP Libr
...
c711f1ec03: (supersedes: 3.3.0.1)
c711f1ec03: pessl.rte.up 3.3.0.0 # up - Parallel ESSL Serial Li
...
c711f1ec03: xlfrte.msg.EN_US 13.1.0.0 # XL Fortran Runtime Environme
...
c711f1ec03:
c711f1ec03: MISSING REQUISITES: The following filesets are required by one or more
c711f1ec03: of the selected filesets listed above. They are not currently installed
c711f1ec03: and could not be found on the installation media.
c711f1ec03:
c711f1ec03: ppe.poe 4.3.0.0 # Base Level Fileset
c711f1ec03: MISCELLANEOUS FAILING REQUISITES: The following filesets are requisites
c711f1ec03: of one or more of the selected filesets listed above. Various problems
c711f1ec03: associated with these requisites are preventing the selected filesets
c711f1ec03: from installing. See the "Requisite Failure Key" for failure reasons and
c711f1ec03: possible recovery hints.
c711f1ec03:
c711f1ec03: I bos.loc.utf.EN_US 6.1.0.0 # Base System Locale UTF Code
...
c711f1ec03: Requisite Failure Key:
c711f1ec03: "I" fileset must be installed prior to or at the same time as its
c711f1ec03: dependent filesets. This requisite cannot be installed automatically
c711f1ec03: (using -g option). It must be explicitly selected for installation.
c711f1ec03:
c711f1ec03: AVAILABLE REQUISITES: You specified that requisite software should be
c711f1ec03: automatically installed. Additional filesets (1) would have been
c711f1ec03: installed automatically had the selected filesets passed all requisite
c711f1ec03: checks.

...etc...

c711f1ec03: Pre-installation Failure/Warning Summary
c711f1ec03: ----------------------------------------
c711f1ec03: Name Level Pre-installation Failure/Warning
c711f1ec03: ----------------------------------------------------------------------------


c711f1ec03: xlfrte.msg.EN_US 13.1.0.0 Requisite failure
c711f1ec03: pessl.rte.up 3.3.0.0 Requisite failure
c711f1ec03: pessl.rte.smp 3.3.0.2 Requisite failure
c711f1ec03: pessl.rte.smp 3.3.0.0 Requisite failure
c711f1ec03: pessl.rte.rs2 3.3.0.0 Requisite failure
c711f1ec03: pessl.rte.rs1 3.3.0.2 Requisite failure
c711f1ec03: pessl.rte.rs1 3.3.0.0 Requisite failure
c711f1ec03: pessl.rte.mp 3.3.0.0 Requisite failure
c711f1ec03: pessl.rte.smp 3.3.0.1 To be superseded by 3.3.0.2
c711f1ec03: pessl.rte.rs1 3.3.0.1 To be superseded by 3.3.0.2
c711f1ec03: xlC.rte 10.1.0.3 Already superseded by 11.1.0.0
c711f1ec03: pessl.rte.hv 3.3.0.1 Already superseded by 3.3.0.2
c711f1ec03: pessl.rte.common 3.3.0.1 Already superseded by 3.3.0.2
c711f1ec03: xlC.rte 11.1.0.0 Already installed
c711f1ec03: xlfrte.msg.en_US 13.1.0.0 Already installed
c711f1ec03: xlfrte.aix53 13.1.0.0 Already installed
c711f1ec03: xlfrte 13.1.0.0 Already installed
c711f1ec03: xlsmp.aix53.rte 2.1.0.0 Already installed
c711f1ec03: xlsmp.msg.en_US.rte 2.1.0.0 Already installed
c711f1ec03: xlsmp.rte 2.1.0.0 Already installed
c711f1ec03: essl.rte.up 5.1.0.0 Already installed
c711f1ec03: essl.rte.mp 5.1.0.0 Already installed
c711f1ec03: essl.rte.common 5.1.0.0 Already installed
c711f1ec03: essl.rte.rs1 5.1.0.0 Already installed
c711f1ec03: essl.rte.rs2 5.1.0.0 Already installed
c711f1ec03: essl.rte.smp 5.1.0.0 Already installed
c711f1ec03: essl.msg.en_US 5.1.0.0 Already installed
c711f1ec03: essl.loc.license 5.1.0.0 Already installed
c711f1ec03: pessl.rte.hv 3.3.0.0 Already installed
c711f1ec03: pessl.rte.hv 3.3.0.2 Already installed
c711f1ec03: pessl.rte.common 3.3.0.0 Already installed
c711f1ec03: pessl.rte.common 3.3.0.2 Already installed
c711f1ec03: pessl.msg.en_US 3.3.0.0 Already installed
c711f1ec03: pessl.loc.license 3.3.0.0 Already installed

So ESSL needs ppe.poe and bos.loc.utf.EN_US too. When updatenode is run with "-agXY" or "-agQXY", NIM searches the lpp_source for the requisite lpps, so one way to resolve this issue is to modify the guide, in the section "Add the packages to the lpp_source used to build your image", to add ppe.poe and bos.loc.utf.EN_US (from the AIX CD) to the lpp_source as well.
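
To make the two kinds of failure in the log above easier to spot, here is a minimal Python sketch that pulls the MISSING REQUISITES and MISCELLANEOUS FAILING REQUISITES rows out of saved `updatenode -V` output. The function name and the regular expressions are illustrative assumptions based only on the layout quoted in this ticket; other installp versions may format these sections differently.

```python
import re

# Leading "nodename: " prefix, as seen on every line of the quoted output.
PREFIX = re.compile(r"^\S+:\s*")
# Any other all-caps "SECTION NAME:" header, or the failure key, ends a section.
OTHER_HEADER = re.compile(r"^(?:[A-Z][A-Z ]+:|Requisite Failure Key:)")
# A fileset row: optional one-letter failure code, name, dotted level, then "#".
ROW = re.compile(r"^(?:([A-Z])\s+)?([\w.]+)\s+(\d+(?:\.\d+){3})\s+#")


def parse_requisite_failures(log_text):
    """Return (missing, failing) parsed from the requisite-failure sections.

    missing is a list of (fileset, level); failing is a list of
    (code, fileset, level), where code is the failure key letter such as "I".
    """
    missing, failing = [], []
    section = None
    for raw in log_text.splitlines():
        line = PREFIX.sub("", raw.strip(), count=1)
        if line.startswith("MISSING REQUISITES:"):
            section = "missing"
        elif line.startswith("MISCELLANEOUS FAILING REQUISITES:"):
            section = "failing"
        elif OTHER_HEADER.match(line):
            section = None
        elif section:
            row = ROW.match(line)
            if row:
                code, name, level = row.groups()
                if section == "missing":
                    missing.append((name, level))
                else:
                    failing.append((code, name, level))
    return missing, failing
```

Run against the output in this report, the "missing" list points at what has to be copied into the lpp_source, while entries with code "I" have to be selected explicitly because `-g` will not pull them in.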

Discussion

  • yan feng han

    yan feng han - 2010-05-28

    This issue occurred when I installed ESSL following the "Setting up ESSL and PESSL in a Stateful Cluster" guide. A similar issue occurs when installing PE following the "Setting up PE in a Stateful Cluster" guide: bos.loc.utf.EN_US is missing for the xlfrte.msg.EN_US lpp. So please fix this for PE as well.

     
  • yan feng han

    yan feng han - 2010-05-28

    I tried this "updatenode" for PE and ESSL on a clean node (installed with the OS only, without any HPC bundles) and found that the compiler lpps xlfrte.msg.en_US and xlsmp.msg.en_US.rte require bos.loc.iso.en_US. So neither bos.loc.iso.en_US nor bos.loc.utf.EN_US is installed automatically with the OS, and when I ran "updatenode" for PE or ESSL, which include the compilers bundle, "-g" could not install the two bos lpps either. The reason is that the two bos lpps are flagged "I" instead of "U":

    c711f1ec01: Requisite Failure Key:
    c711f1ec01: "I" fileset must be installed prior to or at the same time as its
    c711f1ec01: dependent filesets. This requisite cannot be installed automatically
    c711f1ec01: (using -g option). It must be explicitly selected for installation.

    So I had to add them to the lpp_source and install them with "nim -o cust". This is not convenient for users, since HPC integration aims to install everything related to HPC automatically, so I think we need a better way. I suggest that we create a bos_msg bundle that includes them and add this bundle to osimage.installp_bundle before we begin installing the OS, so that these lpps are installed with the OS. I tried this on my cluster and it worked well. An advantage of this approach is that we can easily add other bos prereqs later as needed.

    The same failures happened when I installed PE (compilers,pe bundles), ESSL (compilers,essl bundles), or the IBMhpc_all bundle together with the OS, so we may need to add the bos_msg bundle ahead of all hpc bundles when installing a new node. That fixes the issue; I tried it on my cluster too, and it worked.

    Since all the HPC integration components are related, the defect may affect several components. My suggestion is:
    1> For PE and ESSL, modify the "Setting up ESSL and PESSL in a Stateful Cluster" and "Setting up PE in a Stateful Cluster" guides to tell the user to add ppe.poe to the lpp_source.
    2> For PE, ESSL and HPC together, create a bos_msg bundle that includes the two bos lpps and add it to the osimage ahead of any hpc bundles.
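
The bos_msg proposal can be sketched in a few lines of Python: render a bundle file in the `I:<fileset>` form used by the HPC bundle files, and place the new bundle name at the front of the comma-separated `installp_bundle` value. The helper names are hypothetical, and the sketch only builds strings; defining the bundle in NIM and updating the osimage are still separate manual steps.

```python
def render_bundle(filesets, comment):
    """Render bundle text: one '#' comment line, then one 'I:<fileset>' line each.

    Duplicate fileset names are dropped while keeping first-seen order.
    """
    seen = []
    for name in filesets:
        if name not in seen:
            seen.append(name)
    lines = ["# " + comment] + ["I:" + name for name in seen]
    return "\n".join(lines) + "\n"


def prepend_bundle(installp_bundle, new_bundle):
    """Return the comma-separated bundle list with new_bundle moved to the front."""
    names = [n for n in installp_bundle.split(",") if n and n != new_bundle]
    return ",".join([new_bundle] + names)


# The two bos locale filesets discussed in this comment.
BOS_MSG_FILESETS = ["bos.loc.iso.en_US", "bos.loc.utf.EN_US"]

if __name__ == "__main__":
    print(render_bundle(BOS_MSG_FILESETS, "bos locale prereqs for HPC message lpps"))
    print(prepend_bundle("compilers,essl", "bos_msg"))
```

Keeping the locale filesets in their own bundle means further bos prereqs can be added later by editing one list.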

    Thx.

     
  • Brian  Croswell

    Brian Croswell - 2010-05-28

    Linda, Han Yan ..

    Another possibility would be to include the bos.loc.utf.EN_US and bos.loc.iso.en_US filesets in IBMhpc_base.bnd, which contains the "I" base requisites and is already a base requirement for the HPC LPPs.
    It looks like the bos.loc.iso.en_US fileset is already part of the base AIX NIM LPPSOURCE.
    We will need to instruct the admin to copy the bos.loc.utf.EN_US fileset into the AIX NIM LPPSOURCE with nim -o update.
    I also noticed that the bos.cpr fileset listed in the HPC bundle is also not part of the base AIX NIM LPPSOURCE, so the admin will need to copy it in as well.

     
  • yan feng han

    yan feng han - 2010-05-28

    ppe.poe itself has two requisites, rsct.lapi.rte and mcr, so we need to put these two lpps into the lpp_source; then "-g" can install them. The PE and ESSL guides need to be modified for them.

     
  • yan feng han

    yan feng han - 2010-05-28

    The "mcr" in the comment above should be mcr.rte.
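
Because requisites chain (pessl needs ppe.poe, which in turn needs rsct.lapi.rte and mcr.rte), it can help to compute the full set before populating the lpp_source. The Python sketch below does that with a small requisite map built only from relationships reported in this ticket. The map is incomplete by construction, the pairing of pessl.rte.smp with ppe.poe is an illustrative choice among the failing pessl filesets, and the list of available filesets is assumed to be gathered by hand.

```python
# Requisite relationships reported in this ticket only; not a complete list.
REQUISITES = {
    "pessl.rte.smp": ["ppe.poe"],
    "ppe.poe": ["rsct.lapi.rte", "mcr.rte"],
    "xlfrte.msg.EN_US": ["bos.loc.utf.EN_US"],
    "xlfrte.msg.en_US": ["bos.loc.iso.en_US"],
}


def requisite_closure(selected, requisites):
    """Return the selected filesets plus everything they transitively require."""
    needed, stack = set(), list(selected)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(requisites.get(name, []))
    return needed


def missing_from_lpp_source(selected, requisites, available):
    """Return a sorted list of needed filesets that are not in `available`."""
    return sorted(requisite_closure(selected, requisites) - set(available))


if __name__ == "__main__":
    in_lpp_source = ["pessl.rte.smp", "xlfrte.msg.EN_US", "ppe.poe"]
    print(missing_from_lpp_source(
        ["pessl.rte.smp", "xlfrte.msg.EN_US"], REQUISITES, in_lpp_source))
```

Anything the function reports still has to be copied into the lpp_source, and filesets flagged "I" must additionally be selected explicitly.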

     
  • Brian  Croswell

    Brian Croswell - 2010-05-28

    Linda, Han Yan,

    We need to include the AIX bos.loc.com.utf fileset in the LPPSOURCE and the bundle, since it is a co-requisite of bos.loc.utf.EN_US.
    I had a good implementation with the HPC compilers, pe, and essl working on my FVT configuration: xCAT MN hv32lpar05 (xcat20) and node hv32lpar11.
    Here is my update and setup as part of the HPC implementation:

    1) As part of the HPC pre-req activity, I needed to include IBMhpc_base.bnd. I copied the IBMhpc_base bundle down to /install/nim/installp_bundle, then updated this bundle file to include the bos.loc.utf.EN_US and bos.loc.com.utf images. I also included the bos.cpr image in this bundle file. I ran nim -o define to define the bundle as IBMhpc_base and then included it in the xCAT AIX osimage used with the LPAR:
    [xcat20][/]> cat /install/nim/installp_bundle/IBMhpc_base.bnd

    # AIX bundle file containing base AIX prereq for all HPC software

    I:bos.adt.debug
    I:bos.adt.libm
    I:bos.adt.syscalls
    I:bos.adt.prof
    I:bos.cpr
    I:bos.loc.iso.en_US
    I:bos.pmapi
    I:bos.loc.utf.EN_US
    I:bos.loc.com.utf
    I:perfagent.tools
    I:sysmgt.sguide.rte
    I:Java5

    [xcat20][/]> lsdef -t osimage -l
    Object name: AIX61JSP1
    bosinst_data=AIX61JSP1_bosinst_data
    imagetype=NIM
    installp_bundle=xCATaixSSL,xCATaixSSH,IBMhpc_base,compilers,pe,essl
    lpp_source=AIX61JSP1_lpp_source
    nimmethod=rte
    nimtype=standalone
    osname=AIX
    spot=AIX61JSP1
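
A quick way to confirm that an osimage lists the base bundle ahead of the HPC bundles is to parse saved `lsdef -t osimage -l` output. The Python sketch below assumes the `Object name:` / `attr=value` layout shown above; the function names are hypothetical. It checks list order only, which is the convention suggested in this thread, and does not establish how NIM processes that order.

```python
def parse_lsdef(text):
    """Parse 'lsdef -l' style output into {object_name: {attr: value}}."""
    objects, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Object name:"):
            current = objects.setdefault(line.split(":", 1)[1].strip(), {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return objects


def bundles_before(osimage, first, others):
    """True if `first` is listed before every bundle in `others` that is present."""
    order = osimage.get("installp_bundle", "").split(",")
    if first not in order:
        return False
    return all(order.index(first) < order.index(b) for b in others if b in order)


if __name__ == "__main__":
    sample = (
        "Object name: AIX61JSP1\n"
        "    installp_bundle=xCATaixSSL,xCATaixSSH,IBMhpc_base,compilers,pe,essl\n"
        "    lpp_source=AIX61JSP1_lpp_source\n"
    )
    image = parse_lsdef(sample)["AIX61JSP1"]
    print(bundles_before(image, "IBMhpc_base", ["compilers", "pe", "essl"]))
```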

    I agree with Han Yan that we need to make sure rsct.lapi and mcr are included in the LPPSOURCE as part of the HPC ppe implementation.
    We should also plan to include the pe bundle as part of the essl implementation. I am not sure whether the pe_install script is required, though.
    For my implementation I included compilers, pe and essl as one implementation set for the HPC node installation:
    [xcat20][/]> lsdef -l hv32lpar11

    Object name: hv32lpar11
    cons=hmc
    groups=lpar,all,computehpc
    hcp=w1m6hmc01
    id=12
    mac=00215EA68CDB
    mgt=hmc
    nodetype=lpar,osi
    os=AIX
    parent=c918f9fsp01-8236-E8C-SN1004E1P
    postbootscripts=otherpkgs
    postscripts=syslog,aixremoteshell,syncfiles,IBMhpc.postscript,compilers_license,pe_install,essl_install
    pprofile=hv32lpar11
    profile=AIX61JSP1
    provmethod=AIX61JSP1
    status=booted
    statustime=05-28-2010 11:43:38

     
  • Anonymous

    Anonymous - 2010-05-30

    Brian, Han Yan,
    Thank you both for all your investigation on this issue.
    I did not realize that PESSL requires PE for the MPI libraries.
    I have added PE to all the ESSL/PESSL wiki pages, and included the two bos.loc.utf entries in the AIX base bundle file.
    I see I also had forgotten to include the base bundle in all the instruction pages. This has been added.

    I think I have all the parts of this fixed now. Please reopen this defect if you still see problems.

     
  • yan feng han

    yan feng han - 2010-05-31

    Linda, is there a new build with the fix for this defect? I'd like to verify it now on my c711 cluster. This cluster may be taken away by other teams at some point, and then I may have no cluster to verify it on. If the build is not ready, can you put the efix on my c711? Thanks.

     
  • yan feng han

    yan feng han - 2010-06-09

    Verified in xcat 2.4.2 0608 build on c711, and all packages in the bundle files can be installed now.

     