OS : Suse 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 5 nodes ( 2 controllers and 3 payloads with headless feature disabled & no PBE )
AMF Application : 2N model with SUs mapped on PL-3,PL-4
NTFD crashed on the active controller while logging a notification to the alarm stream.
-> Initially performed a couple of switchovers and tests on the AMF application.
-> Performed CLM lock operation of standby SC-1 and later unlocked.
-> Performed switchover such that SC-1 became active controller.
-> Stopped opensafd on PL-4. NTFD on active controller crashed.
Sep 6 10:18:25 CONTROLLER-1 osafamfd[2262]: NO Node 'PL-4' left the cluster
..
Sep 6 10:18:25 CONTROLLER-1 osafntfd[2242]: osaf_abort(31) called from 0x414d1e with errno=11
Sep 6 10:18:25 CONTROLLER-1 osafamfnd[2272]: NO 'safComp=NTF,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
-> Below is an excerpt from the ntfd trace.
Sep 6 10:18:25.436394 osafntfd [2242:NtfAdmin.cc:0252] T2 New notification received, id: 682
Sep 6 10:18:25.436398 osafntfd [2242:NtfAdmin.cc:0187] >> processNotification
Sep 6 10:18:25.436404 osafntfd [2242:NtfNotification.cc:0045] T3 constructor 0x685790, notId: 682
Sep 6 10:18:25.436409 osafntfd [2242:ntfsv_mem.c:0761] >> ntfsv_get_ntf_header
Sep 6 10:18:25.436412 osafntfd [2242:ntfsv_mem.c:0782] << ntfsv_get_ntf_header
Sep 6 10:18:25.436425 osafntfd [2242:NtfAdmin.cc:0200] T2 notification 682 with type 16384 added, notificationMap size is 1
Sep 6 10:18:25.436431 osafntfd [2242:NtfLogger.cc:0130] >> log
Sep 6 10:18:25.436435 osafntfd [2242:NtfLogger.cc:0132] T2 notification Id=682 received in logger with size 0
Sep 6 10:18:25.436439 osafntfd [2242:NtfLogger.cc:0135] T2 IS LOCAL, logging
Sep 6 10:18:25.436442 osafntfd [2242:NtfLogger.cc:0166] >> checkQueueAndLog
Sep 6 10:18:25.436447 osafntfd [2242:NtfLogger.cc:0196] >> logNotification
Sep 6 10:18:25.436452 osafntfd [2242:ntfsv_mem.c:0761] >> ntfsv_get_ntf_header
Sep 6 10:18:25.436455 osafntfd [2242:ntfsv_mem.c:0782] << ntfsv_get_ntf_header
Sep 6 10:18:25.436460 osafntfd [2242:NtfLogger.cc:0231] T2 Logging notification to alarm stream
Sep 6 10:18:25.436495 osafntfd [2242:lga_api.c:1151] >> saLogWriteLogAsync
Sep 6 10:18:25.436500 osafntfd [2242:lga_api.c:1015] >> handle_log_record
Sep 6 10:18:25.436507 osafntfd [2242:lga_api.c:1110] << handle_log_record
Sep 6 10:18:25.436518 osafntfd [2242:lga_api.c:1229] TR saLogWriteLogAsync Node not CLM member or stale client
Sep 6 10:18:25.436524 osafntfd [2242:lga_api.c:1320] << saLogWriteLogAsync
Sep 6 10:18:42.472616 osafntfd [2176:ntfs_main.c:0181] >> initialize
This may be caused by the bug reported in this ticket [#1985]
[osaf/services/saf/logsv/lgs/lgs_clm.cc:120]: (error) Uninitialized variable: rc
That ticket is currently under review.
Related Tickets: #1985
After the integration of LOG with CLM (#1638), all LOG clients should reinitialize after a CLM unlock operation. It may be that NTF, as a LOG client, is not reinitializing after the CLM unlock and therefore got the return value 31 (SA_AIS_ERR_UNAVAILABLE).
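The reinitialization described above can be sketched as follows. This is a minimal illustration, not the NTF or LOG agent code: every `sketch_` name is hypothetical, the log service is replaced by stubs, and only the two numeric values (1 for OK, 31 for UNAVAILABLE) are taken from the AIS error codes. It shows a client that, on receiving 31, reinitializes and retries once instead of aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for two AIS return codes; 31 is the value seen in osaf_abort(31). */
enum sketch_rc { SKETCH_OK = 1, SKETCH_ERR_UNAVAILABLE = 31 };

/* Hypothetical client state: 'stale' models a handle created before the CLM lock. */
struct sketch_log_client {
    bool stale;
    int reinit_count;
};

/* Stub for an asynchronous log write: a stale handle is rejected. */
static enum sketch_rc sketch_log_write(const struct sketch_log_client *c)
{
    return c->stale ? SKETCH_ERR_UNAVAILABLE : SKETCH_OK;
}

/* Stub for finalize + initialize + stream open: yields a fresh handle. */
static void sketch_log_reinit(struct sketch_log_client *c)
{
    c->stale = false;
    c->reinit_count++;
}

/* Write once; on UNAVAILABLE, reinitialize and retry a single time. */
enum sketch_rc sketch_write_with_reinit(struct sketch_log_client *c)
{
    assert(c != NULL);
    enum sketch_rc rc = sketch_log_write(c);
    if (rc == SKETCH_ERR_UNAVAILABLE) {
        sketch_log_reinit(c);
        rc = sketch_log_write(c);
    }
    return rc;
}
```

If the node really is outside CLM membership, the retry would presumably fail as well, so a real client would still need a bounded fallback rather than an unconditional abort.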
Even when an application is linked with the new A.2.2 agent code, if the client calls saLogInitialize with version A.2.1, the CLM status should be ignored.
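The rule stated above amounts to a version gate on the CLM membership check. The sketch below is an assumption about how such a gate could look, not the change made in the lga changesets: `sketch_version` only mirrors the release code / major / minor layout of an AIS version, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of an AIS version (release code, major, minor). */
struct sketch_version {
    char release_code;
    unsigned char major;
    unsigned char minor;
};

/* True when the client requested A.2.2 or later, i.e. opted in to CLM awareness. */
bool sketch_is_clm_aware(const struct sketch_version *v)
{
    assert(v != NULL);
    if (v->release_code != 'A')
        return v->release_code > 'A';
    if (v->major != 2)
        return v->major > 2;
    return v->minor >= 2;
}

/* Reject a call only if the client is CLM-aware and the node is not a member. */
bool sketch_must_reject(const struct sketch_version *v, bool node_is_clm_member)
{
    return sketch_is_clm_aware(v) && !node_is_clm_member;
}
```

With this gate, a client initialized with A.2.1 is never rejected on CLM grounds, while an A.2.2 client is rejected only when the node is not a cluster member.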
changeset: 8040:9befd19ce897
branch: opensaf-5.1.x
tag: tip
parent: 8037:435ec42e4847
user: A V Mahesh mahesh.valla@oracle.com
date: Fri Sep 09 15:09:25 2016 +0530
summary: lga: ignore CLM status if client Initialize with A.2.1 [#1999]
changeset: 8039:fe93169cdede
parent: 8036:62fe50517eaf
user: A V Mahesh mahesh.valla@oracle.com
date: Fri Sep 09 15:09:16 2016 +0530
summary: lga: ignore CLM status if client Initialize with A.2.1 [#1999]
Related Tickets: #1999