From: Nivrutti K. <nkale@Brocade.com> - 2017-08-29 10:13:42
|
Hi Dheeroj,

If you want to run the application again and it is ready to run, please try the "amf-adm repaired <Su Name>" command.

Thanks,
Nivrutti

-----Original Message-----
From: Dheeroj Ram [mailto:DR00487751@TechMahindra.com]
Sent: Tuesday, August 29, 2017 2:37 PM
To: ope...@li...
Subject: [users] osafamfnd: Script did not exit within time' TERMINATION_FAILED

Hi All,

I am new to OpenSAF and need your help. My OpenSAF setup is described below.

I am using OpenSAF version 4.4.2. This is my opensafd status output:

atcafs-n10s2:~# /etc/init.d/opensafd status
safSISU=safSu=n10s2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed10,safApp=OpenSAF
    saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=n10s2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
    saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU-n10s2\,safSg=HenbGw-SG\,safApp=HenbGwApp,safSi=HenbGw,safApp=HenbGwApp
    saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=n10s1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
    saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=n10s1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
    saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU-n10s1\,safSg=HenbGw-SG\,safApp=HenbGwApp,safSi=HenbGw,safApp=HenbGwApp
    saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU-n10s5\,safSg=HenbGw-SG\,safApp=HenbGwApp_PL_n10s5,safSi=HenbGw,safApp=HenbGwApp_PL_n10s5
    saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU-n10s4\,safSg=HenbGw-SG\,safApp=HenbGwApp_PL_n10s4,safSi=HenbGw,safApp=HenbGwApp_PL_n10s4
    saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=n10s5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
    saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=n10s4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
    saAmfSISUHAState=ACTIVE(1)
atcafs-n10s2:~#

Here n10s1 and n10s2 are my controllers, and n10s4 and n10s5 are payloads.

The following applications are running on the payloads:

atcafs-n10s4:~# ps -aef | grep ins
root 3379 1 21 11:34 ? 00:21:36 /hegw/gsw/bin/hms instantiate
root 3396 1 11 11:34 ? 00:11:49 /hegw/gsw/bin/mms instantiate
root 3410 1 2 11:34 ? 00:02:05 /hegw/gsw/bin/dra instantiate
root 3424 1 2 11:34 ? 00:02:15 /hegw/gsw/bin/bcm instantiate

Problem Detail:

When I killed the application (hms) with signal 11 ("kill -11 3379"), it generated a core dump of about 7 GB. OpenSAF tried to restart the process in 60 s, but at that point the process was still busy writing the core, so its PID remained active. OpenSAF then failed with the errors below:

Aug 29 13:26:12 localhost kernel: grsec: From 172.16.10.1: signal 11 sent to /hegw/gsw/bin/hms[hms:11902] uid/euid:0/0 gid/egid:0/0, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0 by /bin/bash[bash:10442] uid/euid:0/0 gid/egid:0/0, parent /bin/login[login:10441] uid/euid:0/0 gid/egid:0/0
Aug 29 13:26:27 localhost osafamfnd[11779]: 'safComp=HMSComp_n10s4,safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4' faulted due to 'healthCheckcallbackTimeout' : Recovery is 'componentRestart'
Aug 29 13:26:27 localhost AMF_DEMO: CMD=cleanup
Aug 29 13:26:27 localhost AMF_DEMO_VAR: AMF_DEMO_VAR4=COMP1_VALUE4
Aug 29 13:26:27 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 29 13:26:27 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 29 13:26:27 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 29 13:26:37 localhost osafamfnd[11779]: Cleanup of 'safComp=HMSComp_n10s4,safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4' failed
Aug 29 13:26:37 localhost osafamfnd[11779]: Reason:'Script did not exit within time'
Aug 29 13:26:37 localhost osafamfnd[11779]: SU Failover trigerred for 'safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4': Failed component: 'safComp=HMSComp_n10s4,safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4'
Aug 29 13:26:37 localhost osafamfnd[11779]: 'safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4' Presence State INSTANTIATED => TERMINATION_FAILED
Aug 29 13:26:37 localhost osafamfnd[11779]: Assigning 'safSi=HenbGw,safApp=HenbGwApp_PL_n10s4' QUIESCED to 'safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4'
Aug 29 13:26:37 localhost osafamfnd[11779]: Assigned 'safSi=HenbGw,safApp=HenbGwApp_PL_n10s4' QUIESCED to 'safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4'
Aug 29 13:26:37 localhost osafamfnd[11779]: Removing 'safSi=HenbGw,safApp=HenbGwApp_PL_n10s4' from 'safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4'
Aug 29 13:26:37 localhost osafamfnd[11779]: Removed 'safSi=HenbGw,safApp=HenbGwApp_PL_n10s4' from 'safSu=SU-n10s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n10s4'

I tried setting "OPENSAF_TERMTIMEOUT=1000" in the nid.conf file, but it did not help. The issue still exists.

Please let me know if you need any more detail.

Thanks
Dheeraj
|
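A note on the timing in the quoted log: the cleanup script is invoked at 13:26:27 and osafamfnd reports 'Script did not exit within time' at 13:26:37, so the limit that expired appears to be about 10 seconds rather than 60. The sketch below only measures that interval from two lines copied from the log above; the helper names (`syslog_time`, `seconds_between`) and the year are illustrative assumptions, not part of OpenSAF. If the 10-second figure holds, the relevant setting is probably the component's cleanup timeout in the AMF model (for example `saAmfCompCleanupTimeout`) rather than `OPENSAF_TERMTIMEOUT` in nid.conf, but that is an inference from the log and is not confirmed in this thread.

```python
from datetime import datetime

# Two lines copied verbatim from the syslog excerpt in the quoted message.
LOG_LINES = [
    "Aug 29 13:26:27 localhost AMF_DEMO: CMD=cleanup",
    "Aug 29 13:26:37 localhost osafamfnd[11779]: Reason:'Script did not exit within time'",
]


def syslog_time(line, year=2017):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    Syslog stamps carry no year, so one is supplied (assumed 2017 here,
    taken from the date of the post).
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def seconds_between(start_marker, end_marker, lines):
    """Return seconds from the first line containing start_marker
    to the first line containing end_marker."""
    start = next(syslog_time(l) for l in lines if start_marker in l)
    end = next(syslog_time(l) for l in lines if end_marker in l)
    return (end - start).total_seconds()


elapsed = seconds_between("CMD=cleanup", "Script did not exit within time", LOG_LINES)
print(f"cleanup script was given about {elapsed:.0f} s before AMF gave up")
```

Running the same measurement on a full /var/log/messages capture would show whether a changed timeout value actually takes effect after a configuration change.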