#2041 Msg: saMsgInitialize returns continuous TRY_AGAIN after mqsv ND restarts in backward compatibility.

future
unassigned
nobody
None
defect
msg
-
5.1FC
major
2016-10-06
2016-09-16
No

Environment Details:
OS : SUSE 64-bit
Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled & 1PBE enabled ).

Backward Compatibility:
Opensaf versions on nodes:
SC-1 (5.0), SC-2 (5.1 FC), PL-3 (5.0), PL-4 (5.1 FC).

Summary: saMsgInitialize keeps returning SA_AIS_ERR_TRY_AGAIN after mqnd_imm_initialize failed with ERR_TIMEOUT.

Steps followed & Observed behaviour:

The mqsv test application was run while mqnd was being killed continuously.

Observations:

saMsgInitialize failed with continuous TRY_AGAIN. Below is the snapshot.

100|0| Version : B.3.1
100|0| RETRY : saMsgInitialize with all valid parameters
100|0| Return Value : SA_AIS_ERR_TRY_AGAIN
100|0|
100|0|
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1 Retry Count : 10
100|0|
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1 Retry Count : 20
100|0|
100|0| Version : B.3.1
100|0| Version Sun Sep 18 11:51:19 IST 2016
100|0|Sun Sep 18 11:51:19 IST 2016
100|0|Sun Sep 18 11:51:59 IST 2016
100|0|Sun Sep 18 11:51:59 IST 2016
100|0|Sun Sep 18 11:52:39 IST 2016
100|0|Sun Sep 18 11:52:39 IST 2016

100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1 Retry Count : 30
100|0|
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1
100|0| Version : B.3.1 Retry Count : 40
100|0| Try again count exceeded**** TEST CASE FAILED ***
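
The retry behaviour visible in the snapshot above can be sketched as a bounded loop. This is only an illustration, not the actual test application: `fake_msg_initialize`, `fake_mqnd` and `init_with_retry` are hypothetical names, the stub stands in for saMsgInitialize so the sketch runs without OpenSAF, and the limit of 40 is inferred from the "Retry Count : 40" line followed by "Try again count exceeded". The numeric values are the SaAisErrorT codes from the AIS headers.

```c
/* Subset of SaAisErrorT values as defined by the SA Forum AIS headers. */
typedef enum {
    SA_AIS_OK = 1,
    SA_AIS_ERR_TIMEOUT = 5,
    SA_AIS_ERR_TRY_AGAIN = 6
} SaAisErrorT;

/* Hypothetical callback type standing in for a call to saMsgInitialize(). */
typedef SaAisErrorT (*init_fn)(void *ctx);

/* Hypothetical model of the node director: it answers TRY_AGAIN for the
 * first `ready_after` calls; a negative value means it never recovers. */
typedef struct {
    int calls;
    int ready_after;
} fake_mqnd;

static SaAisErrorT fake_msg_initialize(void *ctx)
{
    fake_mqnd *m = ctx;
    m->calls++;
    if (m->ready_after >= 0 && m->calls > m->ready_after)
        return SA_AIS_OK;
    return SA_AIS_ERR_TRY_AGAIN;
}

/* Call fn once, then retry while it returns TRY_AGAIN, at most max_retries
 * times. The number of retries performed is stored in *retries_out. */
static SaAisErrorT init_with_retry(init_fn fn, void *ctx, int max_retries,
                                   int *retries_out)
{
    int retries = 0;
    SaAisErrorT rc = fn(ctx);

    while (rc == SA_AIS_ERR_TRY_AGAIN && retries < max_retries) {
        retries++;
        /* A real caller would sleep here before retrying. */
        rc = fn(ctx);
    }
    if (retries_out)
        *retries_out = retries;
    return rc;
}
```

With a stub that never recovers (`ready_after = -1`), `init_with_retry(fake_msg_initialize, &m, 40, &n)` returns SA_AIS_ERR_TRY_AGAIN with `n == 40`, which corresponds to the point where the test case reports failure.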

Below is the snippet of syslog of SC-1:

Sep 18 11:48:32 SCALE_SLOT-41 osafimmnd[19813]: NO Implementer (applier) connected: 2462 (@OpenSafImmReplicatorA) <20504, 2010f>
Sep 18 11:48:32 SCALE_SLOT-41 osafntfimcnd[19819]: NO Started
Sep 18 11:48:39 SCALE_SLOT-41 osafamfd[1816]: NO Re-initializing with IMM
Sep 18 11:48:39 SCALE_SLOT-41 osafimmnd[19813]: NO Implementer connected: 2463 (safAmfService) <20506, 2010f>
Sep 18 11:48:39 SCALE_SLOT-41 osafamfd[1816]: NO Finished re-initializing with IMM

Sep 18 11:48:39 SCALE_SLOT-41 osafmsgnd[19792]: ER mqnd_imm_initialize Failed: 5

Sep 18 11:48:39 SCALE_SLOT-41 osafamfnd[1826]: 'safComp=MQND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF'unregistered
Sep 18 11:48:39 SCALE_SLOT-41 osafmsgnd[19792]: CR Destroying the shared memory segment failed
Sep 18 11:48:39 SCALE_SLOT-41 osafmsgnd[19792]: ER saAmfComponentUnregister Failed with error 9
Sep 18 11:48:39 SCALE_SLOT-41 osafmsgnd[19792]: ER Cb is NULL
Sep 18 11:48:49 SCALE_SLOT-41 osafimmnd[19813]: NO Implementer connected: 2464 (MsgQueueService131343) <20507, 2010f>
Sep 18 11:48:49 SCALE_SLOT-41 osafimmnd[19813]: NO Implementer locally disconnected. Marking it as doomed 2464 <20507, 2010f> (MsgQueueService131343)
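
For reading the numeric codes in the syslog snippet above, a small lookup can help. This is a sketch that assumes the values logged by osafmsgnd ("Failed: 5", "error 9") are SaAisErrorT codes, which matches the ERR_TIMEOUT mentioned in the summary. The helper `sa_ais_err_name` is a hypothetical name and covers only the codes relevant to this ticket.

```c
/* Map the SaAisErrorT codes seen in this ticket to their symbolic names.
 * Values follow the SA Forum AIS enumeration; other codes are not covered. */
static const char *sa_ais_err_name(int code)
{
    switch (code) {
    case 1: return "SA_AIS_OK";
    case 5: return "SA_AIS_ERR_TIMEOUT";
    case 6: return "SA_AIS_ERR_TRY_AGAIN";
    case 9: return "SA_AIS_ERR_BAD_HANDLE";
    default: return "unknown (not in this subset)";
    }
}
```

Under that assumption, "mqnd_imm_initialize Failed: 5" reads as SA_AIS_ERR_TIMEOUT and "saAmfComponentUnregister Failed with error 9" reads as SA_AIS_ERR_BAD_HANDLE.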

Attachments:
1)Syslog of SC-1.

Discussion

  • Ramesh - 2016-09-20
    • Component: msg --> imm
     
  • Ramesh - 2016-09-20

    It seems this failure needs to be investigated from the IMM side, as "immutil_saImmOiInitialize_2()" is returning the SA_AIS_ERR_TIMEOUT error code.

     
  • Anders Widell - 2016-09-20
    • Milestone: 4.7.2 --> 5.0.2
     
  • Hung Nguyen - 2016-10-06
    • Component: imm --> msg
     
  • Anders Widell - 2017-04-03
    • Milestone: 5.0.2 --> future
     
