#305 amfd segfaults on standby controller while running a campaign

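Milestone: 4.3.3
Status: fixed
Owner: Praveen
Labels: None
Type: defect
Component: amf
Part: -
Version: 4.3.M0
Severity: major
Updated: 2014-06-18
Created: 2013-05-24
Private: No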

Migrated from http://devel.opensaf.org/ticket/3009

Changeset: 3969 with patches 2986, 2884, 2865, 2977
Model: 2N
Configuration: 1 SG, 5 SUs, 5 SIs; each SU has 3 comps and each SI has 3 CSIs
CSI-CSI deps configured in SI1 and SI5 as a chain: CSI1<-CSI2<-CSI3
SI-SI deps configured as SI1<-SI2<-SI3<-SI4
SaAmfCSIAttribute is set for all the CSIs.

Scenario:
The campaign is modelled to add a new component to each SU and one CSI to each SI.
Steps:
2) Add comps and CSIs and their associated classes.
3) Perform rollingUpgrade with activationUnitTemplate as SG.
GDB output:

(gdb) bt
#0 0x000000000042c9f2 in csiattr_ccb_completed_create_hdlr (opdata=0x7731e8) at avd_csiattr.c:310
#1 0x000000000042d1d9 in csiattr_ccb_completed_cb (opdata=0x7731e8) at avd_csiattr.c:500
#2 0x0000000000430f31 in ccb_completed_cb (immoi_handle=38654837263, ccb_id=125) at avd_imm.c:702
#3 0x00007f2a91b9e87f in imma_process_callback_info (cb=0x7f2a91dbd700, cl_node=0x6eb1b0,
    callback=0x7f2a8c001250, immHandle=38654837263) at imma_proc.c:1968
#4 0x00007f2a91b9df09 in imma_hdl_callbk_dispatch_all (cb=0x7f2a91dbd700, immHandle=38654837263)
    at imma_proc.c:1687
#5 0x00007f2a91b8e119 in saImmOiDispatch (immOiHandle=38654837263, dispatchFlags=SA_DISPATCH_ALL)
    at imma_oi_api.c:539
#6 0x000000000043e5c2 in avd_main_proc () at avd_proc.c:533
#7 0x000000000040a30e in main (argc=2, argv=0x7fff90459d18) at amfd_main.c:47

(gdb) bt full

#0 0x000000000042c9f2 in csiattr_ccb_completed_create_hdlr (opdata=0x7731e8) at avd_csiattr.c:310

rc = SA_AIS_ERR_BAD_OPERATION
csi_dn = {length = 43,
value = "safCsi=CSI4SI1,safSi=TWONSI1,safApp=TWONAPP", '\000' <repeats 212 times>}
csi = 0x771640
__FUNCTION__ = "csiattr_ccb_completed_create_hdlr"

#1 0x000000000042d1d9 in csiattr_ccb_completed_cb (opdata=0x7731e8) at avd_csiattr.c:500

rc = SA_AIS_ERR_BAD_OPERATION
__FUNCTION__ = "csiattr_ccb_completed_cb"

#2 0x0000000000430f31 in ccb_completed_cb (immoi_handle=38654837263, ccb_id=125) at avd_imm.c:702

rc = SA_AIS_OK
opdata = 0x7731e8
type = AVSV_SA_AMF_CSI_ATTRIBUTE
__FUNCTION__ = "ccb_completed_cb"

#3 0x00007f2a91b9e87f in imma_process_callback_info (cb=0x7f2a91dbd700, cl_node=0x6eb1b0,
    callback=0x7f2a8c001250, immHandle=38654837263) at imma_proc.c:1968

ccbid = 125
localEr = SA_AIS_OK
ccbCompletedRpl = {next = 0x7f2a91ff3300, type = 2449753938, info = {imma = {type = 543319366, info = {

initRsp = {immHandle = 3328495725741161015, error = 909128756}, errRsp = {error = 959789623,

errStrings = 0x6f20373836303434}, admInitRsp = {error = 959789623, ownerId = 774975802},

ccbInitRsp = {error = 959789623, ccbId = 774975802}, searchInitRsp = {error = 959789623,

searchId = 774975802}, searchNextRsp = 0x2e31313a39353a37, searchRemote = {
client_hdl = 3328495725741161015, requestNodeId = 909128756, remoteNodeId = 1864382264,
searchId = 1634099571, objectName = {size = 943207515,

buf = 0x70615f696f5f616d <Address 0x70615f696f5f616d out of bounds>},

attributeNames = 0x353434303a632e69}, admOpReq = {adminOwnerId = 959789623,
invocation = 774975802, operationId = 8007460852031566900,

continuationId = 2334103126856327539, timeout = 7883896640119518299, objectName = {

size = 1868521837, buf = 0x353434303a632e69 <Address 0x353434303a632e69 out of bounds>},

params = 0x207325203e3e205d}, admOpRsp = {oi_client_hdl = 3328495725741161015,
invocation = 8007460852031566900, result = 1634099571, error = 543450733,
parms = 0x6d693a393838345b}, objCreate = {ccbId = 959789623, adminOwnerId = 774975802,
className = {size = 909128756,

buf = 0x2064666d61666173 <Address 0x2064666d61666173 out of bounds>}, parentName = {
size = 943207515, buf = 0x70615f696f5f616d <Address 0x70615f696f5f616d out of bounds>},

attrValues = 0x353434303a632e69, immHandle = 2338253451948859485}, objDelete = {
ccbId = 959789623, adminOwnerId = 774975802, objectName = {size = 909128756,

buf = 0x2064666d61666173 <Address 0x2064666d61666173 out of bounds>},

immHandle = 7883896640119518299}, objModify = {ccbId = 959789623, adminOwnerId = 774975802,
objectName = {size = 909128756,

buf = 0x2064666d61666173 <Address 0x2064666d61666173 out of bounds>},

attrMods = 0x6d693a393838345b, immHandle = 8097858511433589101}, ccbCompl = {
ccbId = 959789623, implId = 774975802, invocation = 909128756,
immHandle = 2334103126856327539}, classDescr = {className = {size = 959789623,

buf = 0x6f20373836303434 <Address 0x6f20373836303434 out of bounds>},

classCategory = 1634099571, attrDefinitions = 0x6d693a393838345b}, implSetRsp = {
error = 959789623, implId = 774975802}, tmr_info = {type = 959789623,
adm_owner_hdl = 8007460852031566900, client_hdl = 2334103126856327539,
invocation = 7883896640119518299}}}, immnd = {dont_free_me = 70, unused1 = 101, unused2 = 98,

unused3 = 32, error = 824194353, type = 959789623, info = {initReq = {version = {

releaseCode = 52 '4', majorVersion = 52 '4', minorVersion = 48 '0'},

client_pid = 1864382264}, finReq = {client_hdl = 8007460852031566900}, adminitReq = {
client_hdl = 8007460852031566900, i = {adminOwnerName = {length = 24947,

value = "famfd [4889:imma_oi_api.c:0445] >> %s \000\230E\220\377\177\000\000\336[\002\222*\177\000\000Feb 15 17:59:11.440675 osafamfd [4889:avd_proc.c:0532] TR IMM event rec\000\323\004\222*\177\000\000\v\000\000\000\347\003\000\000\327\004\222\177\000\000(\000\000\000\060\000\000\000ЙE\220\377\177\000\000\027\036Q\000\000\000\000\312)\022\221\177\000\000\030\000\000\000\177\000\000\200\231E\220\377\177\000\000\060\231E\220\377\177\000\000\220\231E\220\377\177\000\000@\231E\220\377\177\000\000Y\253\000\222*\177\000\000\000\000\000\000\000\000\000\000\204"...}, releaseOwnershipOnFinalize = 32767}}, ccbinitReq = {adminOwnerId = 909128756,

ccbFlags = 2334103126856327539, client_hdl = 7883896640119518299}, implSet = {
client_hdl = 8007460852031566900, impl_name = {size = 1634099571,

buf = 0x6d693a393838345b <Address 0x6d693a393838345b out of bounds>}, impl_id = 1868521837,

scope = 1885429609}, admFinReq = {adm_owner_id = 909128756}, admReq = {
adm_owner_id = 909128756, scope = 1864382264, objectNames = 0x2064666d61666173}, admOpReq = {
adminOwnerId = 909128756, invocation = 1864382264, operationId = 2334103126856327539,
continuationId = 7883896640119518299, timeout = 8097858511433589101, objectName = {

size = 979578473, buf = 0x207325203e3e205d <Address 0x207325203e3e205d out of bounds>},

params = 0x7fff90459800}, fevsReq = {sender_count = 8007460852031566900,
reply_dest = 2334103126856327539, client_hdl = 7883896640119518299, msg = {size = 1868521837,

buf = 0x353434303a632e69 <Address 0x353434303a632e69 out of bounds>}, isObjSync = 93 ']'},

admOpRsp = {oi_client_hdl = 8007460852031566900, invocation = 2334103126856327539,

result = 943207515, error = 1835612729, parms = 0x70615f696f5f616d}, ccbUpcallRsp = {
oi_client_hdl = 8007460852031566900, ccbId = 1634099571, implId = 543450733, inv = 943207515,
result = 1835612729, name = {length = 24941,

value = "_oi_api.c:0445] >> %s \000\230E\220\377\177\000\000\336[\002\222*\177\000\000Feb 15 17:59:11.440675 osafamfd [4889:avd_proc.c:0532] TR IMM event rec\000\323\004\222\177\000\000\v\000\000\000\347\003\000\000\327\004\222\177\000\000(\000\000\000\060\000\000\000ЙE\220\377\177\000\000\027\036Q\000\000\000\000\312)\022\221\177\000\000\030\000\000\000\177\000\000\200\231E\220\377\177\000\000\060\231E\220\377\177\000\000\220\231E\220\377\177\000\000@\231E\220\377\177\000\000Y\253\000\222\177\000\000\000\000\000\000\000\000\000\000\204+\022\221*\177\000\000@\326I\000\000\000\000\000"...}, errorString = {size = 0,
buf = 0x7f2a9200c1eb "\211E\350\203", <incomplete sequence \350>}}, classDescr = {

className = {size = 909128756,

buf = 0x2064666d61666173 <Address 0x2064666d61666173 out of bounds>},

classCategory = 943207515, attrDefinitions = 0x70615f696f5f616d}, objCreate = {
ccbId = 909128756, adminOwnerId = 1864382264, className = {size = 1634099571,

buf = 0x6d693a393838345b <Address 0x6d693a393838345b out of bounds>}, parentName = {
size = 1868521837, buf = 0x353434303a632e69 <Address 0x353434303a632e69 out of bounds>},

attrValues = 0x207325203e3e205d, immHandle = 140735613868032}, objModify = {
ccbId = 909128756, adminOwnerId = 1864382264, objectName = {size = 1634099571,

buf = 0x6d693a393838345b <Address 0x6d693a393838345b out of bounds>},

attrMods = 0x70615f696f5f616d, immHandle = 3833746564541787753}, objDelete = {
ccbId = 909128756, adminOwnerId = 1864382264, objectName = {size = 1634099571,

buf = 0x6d693a393838345b <Address 0x6d693a393838345b out of bounds>},

immHandle = 8097858511433589101}, obj_sync = {className = {size = 909128756,

buf = 0x2064666d61666173 <Address 0x2064666d61666173 out of bounds>}, objectName = {
size = 943207515, buf = 0x70615f696f5f616d <Address 0x70615f696f5f616d out of bounds>},

attrValues = 0x353434303a632e69, next = 0x207325203e3e205d}, finSync = {
lastContinuationId = 909128756, adminOwners = 0x2064666d61666173,
implementers = 0x6d693a393838345b, classes = 0x70615f696f5f616d,
ccbResults = 0x353434303a632e69}, ccbId = 909128756, searchOp = {
client_hdl = 8007460852031566900, searchId = 1634099571}, searchInit = {
client_hdl = 8007460852031566900, rootName = {size = 1634099571,

buf = 0x6d693a393838345b <Address 0x6d693a393838345b out of bounds>}, scope = 1868521837,

searchOptions = 3833746564541787753, searchParam = {present = 1044258909, choice = {

oneAttrParam = {attrName = {size = 2420480000,

buf = 0x7f2a92025bde "\211E\344\353\065H\213}\350\350\061"},

attrValueType = 543319366, attrValue = {val = {saint32 = 959789623,

sauint32 = 959789623, saint64 = 3328495725741161015,
sauint64 = 3328495725741161015, satime = 3328495725741161015,
safloat = 0.000172831918, sadouble = 3.4569658948043125e-86, x = {size = 959789623,

buf = 0x6f20353736303434 <Address 0x6f20353736303434 out of bounds>}}}}}},

attributeNames = 0x2064666d61666173}, rtAttUpdRpl = {sr = {client_hdl = 8007460852031566900,

requestNodeId = 1634099571, remoteNodeId = 543450733, searchId = 943207515, objectName = {

size = 1868521837, buf = 0x353434303a632e69 <Address 0x353434303a632e69 out of bounds>},

attributeNames = 0x207325203e3e205d}, result = 2420480000}, searchRemote = {

client_hdl = 8007460852031566900, requestNodeId = 1634099571, remoteNodeId = 543450733,
searchId = 943207515, objectName = {size = 1868521837,

buf = 0x353434303a632e69 <Address 0x353434303a632e69 out of bounds>},

attributeNames = 0x207325203e3e205d}, rspSrchRmte = {result = 909128756,
requestNodeId = 1864382264, remoteNodeId = 1634099571, searchId = 543450733, runtimeAttrs = {

objectName = {size = 943207515,

buf = 0x70615f696f5f616d <Address 0x70615f696f5f616d out of bounds>},

attrValuesList = 0x353434303a632e69}}, ctrl = {nodeId = 909128756,

rulingEpoch = 1864382264, fevsMsgStart = 2334103126856327539, ndExecPid = 943207515,
canBeCoord = 57 '9', isCoord = 58 ':', syncStarted = 105 'i', nodeEpoch = 1868521837,
pbeEnabled = 105 'i'}, adminitGlobal = {globalOwnerId = 909128756, i = {adminOwnerName = {

length = 14136,
value = " osafamfd [4889:imma_oi_api.c:0445] >> %s \000\230E\220\377\177\000\000\336[\002\222*\177\000\000Feb 15 17:59:11.440675 osafamfd [4889:avd_proc.c:0532] TR IMM event rec\000\323\004\222\177\000\000\v\000\000\000\347\003\000\000\327\004\222\177\000\000(\000\000\000\060\000\000\000ЙE\220\377\177\000\000\027\036Q\000\000\000\000\312)\022\221\177\000\000\030\000\000\000\177\000\000\200\231E\220\377\177\000\000\060\231E\220\377\177\000\000\220\231E\220\377\177\000\000@\231E\220\377\177\000\000Y\253\000\222"...},

releaseOwnershipOnFinalize = 2420480464}}, ccbinitGlobal = {globalCcbId = 909128756, i = {
adminOwnerId = 1634099571, ccbFlags = 7883896640119518299,
client_hdl = 8097858511433589101}}, mds_info = {change = 909128756,

dest = 2334103126856327539, svc_id = 943207515, node_id = 1835612729, role = 1868521837},

syncFevsBase = 8007460852031566900}}, immd = {type = 543319366, info = {ctrl_msg = {

ndExecPid = 959789623, epoch = 774975802, refresh = 52 '4', pbeEnabled = 52 '4'},

admown_init = {client_hdl = 3328495725741161015, i = {adminOwnerName = {length = 13364,

value = "0687 osafamfd [4889:imma_oi_api.c:0445] >> %s \000\230E\220\377\177\000\000\336[\002\222*\177\000\000Feb 15 17:59:11.440675 osafamfd [4889:avd_proc.c:0532] TR IMM event rec\000\323\004\222\177\000\000\v\000\000\000\347\003\000\000\327\004\222\177\000\000(\000\000\000\060\000\000\000ЙE\220\377\177\000\000\027\036Q\000\000\000\000\312)\022\221\177\000\000\030\000\000\000*\177\000\000\200\231E\220\377\177\000\000\060\231E\220\377\177\000\000\220\231E\220\377\177\000\000@\231E\220\377\177\000\000"...},

releaseOwnershipOnFinalize = SA_FALSE}}, ccb_init = {adminOwnerId = 959789623,

ccbFlags = 8007460852031566900, client_hdl = 2334103126856327539}, impl_set = {r = {

client_hdl = 3328495725741161015, impl_name = {size = 909128756,

buf = 0x2064666d61666173 <Address 0x2064666d61666173 out of bounds>},

impl_id = 943207515, scope = 1835612729}, reply_dest = 8097858511433589101}, objModify = {

ccbId = 959789623, adminOwnerId = 774975802, objectName = {size = 909128756,

buf = 0x2064666d61666173 <Address 0x2064666d61666173 out of bounds>},

attrMods = 0x6d693a393838345b, immHandle = 8097858511433589101}, ccbId = 959789623,

admoId = 959789623, fevsReq = {sender_count = 3328495725741161015,

reply_dest = 8007460852031566900, client_hdl = 2334103126856327539, msg = {size = 943207515,

buf = 0x70615f696f5f616d <Address 0x70615f696f5f616d out of bounds>}, isObjSync = 105 'i'},

tmr_info = {type = 959789623, info = {immnd_dest = 8007460852031566900}}, mds_info = {

change = 959789623, dest = 8007460852031566900, svc_id = 1634099571, node_id = 543450733,
role = 943207515}, rda_info = {io_role = 959789623}, syncFevsBase = {
fevsBase = 3328495725741161015, client_hdl = 8007460852031566900}}}}, sinfo = {

to_svc = 7096272, dest = 139820814871385, stype = MDS_SENDTYPE_SNDRSP, ctxt = {length = 1 '\001',

data = "\000\000\000\004\000\000\000\001\000\000\000\002"}, mSynReqCount = 0 '\000'}}

locked = false
errorStr = 0x0
isPbeOp = false
__FUNCTION__ = "imma_process_callback_info"

#4 0x00007f2a91b9df09 in imma_hdl_callbk_dispatch_all (cb=0x7f2a91dbd700, immHandle=38654837263)
    at imma_proc.c:1687

callback = 0x7f2a8c001250
cl_node = 0x6eb1b0

#5 0x00007f2a91b8e119 in saImmOiDispatch (immOiHandle=38654837263, dispatchFlags=SA_DISPATCH_ALL)
    at imma_oi_api.c:539

rc = SA_AIS_OK
cb = 0x7f2a91dbd700
cl_node = 0x0
locked = false
pend_fin = 0
pend_dis = 0
__FUNCTION__ = "saImmOiDispatch"

#6 0x000000000043e5c2 in avd_main_proc () at avd_proc.c:533

pollretval = 1
cb = 0x6c0c00
evt = 0x7713d0
mbx_fd = {raise_obj = 11, rmv_obj = 12}
error = SA_AIS_OK
polltmo = -1
__FUNCTION__ = "avd_main_proc"

#7 0x000000000040a30e in main (argc=2, argv=0x7fff90459d18) at amfd_main.c:47

error = 0
ee_id = 0x0
node_id = 0

(gdb) q

Changed 3 months ago by surenderk
■attachment ctrl1.tgz added
ctrl-1 logs

Changed 3 months ago by surenderk
■attachment ctrl2.tgz added
ctrl2-logs with campaign, imm.xml file,coreFile and exe of amfd

Changed 3 months ago by surenderk
■description modified (diff)
Changed 3 months ago by surenderk
■description modified (diff)
Changed 3 months ago by hafe
Had a quick look, and it seems the problem occurs because a CSI is created as a side effect of checkpointing an SI assignment:

Feb 15 17:59:11.435247 osafamfd [4889:avd_ckpt_updt.c:0406] >> avd_ckpt_siass: 'safSi=TWONSI1,safApp=TWONAPP' 'safSu=SU1,safSg=SGONE,safApp=TWONAPP'
Feb 15 17:59:11.435267 osafamfd [4889:avd_ckpt_updt.c:0442] TR compcsi create for 'safComp=COMP1SU1TWONAPP,safSu=SU1,safSg=SGONE,safApp=TWONAPP' 'safCsi=CSI4SI1,safSi=TWONSI1,safApp=TWONAPP'
Feb 15 17:59:11.435292 osafamfd [4889:avd_csi.c:0250] >> csi_create: 'safCsi=CSI4SI1,safSi=TWONSI1,safApp=TWONAPP'
Feb 15 17:59:11.435303 osafamfd [4889:avd_csi.c:0989] >> avd_compcsi_create: Comp'safComp=COMP1SU1TWONAPP,safSu=SU1,safSg=SGONE,safApp=TWONAPP' and Csi'safCsi=CSI4SI1,safSi=TWONSI1,safApp=TWONAPP'
Feb 15 17:59:11.435309 osafamfd [4889:avd_csi.c:1025] << avd_compcsi_create
Feb 15 17:59:11.435315 osafamfd [4889:avd_ckpt_updt.c:0517] << avd_ckpt_siass: status '1'

This makes the CSI available in the CSI DB.

Later, in csiattr_ccb_completed_cb, the CSI is found in the DB and its SI link is trusted. That link has not been set up yet, so bang!
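
To make the failure mode concrete, here is a minimal sketch of a lookup that does not trust the SI link. It assumes the missing link shows up as a NULL pointer; `struct si`, `struct csi` and `csi_parent_si_dn` are hypothetical stand-ins, not the actual amfd types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the amfd CSI/SI structures. */
struct si {
    const char *dn;
};

struct csi {
    const char *dn;
    struct si *si; /* parent SI; assumed NULL until the link is set up */
};

/*
 * Return the DN of the CSI's parent SI, or NULL when the CSI exists in
 * the DB but its SI link has not been established yet.
 */
static const char *csi_parent_si_dn(const struct csi *csi)
{
    assert(csi != NULL); /* caller must pass a CSI that was found in the DB */
    if (csi->si == NULL)
        return NULL;     /* half-initialised CSI: do not dereference */
    return csi->si->dn;
}
```

A caller that gets NULL back can reject or skip the operation instead of dereferencing the unset link.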

I think the problem is that the standby amfd (which is an applier) is executing the completed callbacks at all. It should just return immediately without any processing.
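
The idea of returning early on the standby can be sketched roughly as follows. This is only an illustration under stated assumptions: `enum ha_state`, `struct ccb_opdata`, `validate_ccb` and `ccb_completed` are made-up names, and the real amfd callback signature and role check differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real amfd types and names differ. */
enum ha_state { HA_ACTIVE, HA_STANDBY };
enum rc { RC_OK, RC_BAD_OPERATION };

struct ccb_opdata {
    int ccb_id;
    bool valid; /* stands in for the outcome of the real validation */
};

static int validations_run; /* counts how often validation was performed */

/* Stand-in for the per-class validation done in the completed callback. */
static enum rc validate_ccb(const struct ccb_opdata *opdata)
{
    validations_run++;
    return opdata->valid ? RC_OK : RC_BAD_OPERATION;
}

/* Completed callback: only the active instance validates the CCB. */
static enum rc ccb_completed(enum ha_state state, const struct ccb_opdata *opdata)
{
    assert(opdata != NULL);
    if (state != HA_ACTIVE)
        return RC_OK; /* standby is an applier: accept without processing */
    return validate_ccb(opdata);
}
```

With this shape the standby never touches its partially checkpointed model during validation, which is the behaviour the changeset summaries in the discussion describe ("skip processing of ccb completed cbk at standby amfd").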

1 Attachments

Related

Tickets: #305
Tickets: #947
Wiki: ChangeLog-4.3.3
Wiki: ChangeLog-4.4.1

Discussion

  • Nagendra Kumar

    Nagendra Kumar - 2013-05-24

    ctrl2

     
  • Praveen

    Praveen - 2014-05-30
    • status: unassigned --> assigned
    • assigned_to: Praveen
    • Milestone: future --> 4.5.FC
     
  • Praveen

    Praveen - 2014-06-02
    • status: assigned --> review
     
  • Praveen

    Praveen - 2014-06-18

    changeset: 5413:02e77b43ee5b
    branch: opensaf-4.3.x
    parent: 5359:4afcabee1598
    user: praveen.malviya@oracle.com
    date: Wed Jun 18 10:11:26 2014 +0530
    summary: amfd : skip processing of ccb completed cbk at standby amfd [#305]

    changeset: 5414:dba5f3bbbf6f
    branch: opensaf-4.4.x
    parent: 5384:5108b385c780
    user: praveen.malviya@oracle.com
    date: Wed Jun 18 10:12:04 2014 +0530
    summary: amfd : skip processing of ccb completed cbk at standby amfd [#305]

    changeset: 5415:03845a93b7b3
    tag: tip
    parent: 5412:cf91285fbaf7
    user: praveen.malviya@oracle.com
    date: Wed Jun 18 10:12:20 2014 +0530
    summary: amfd : skip processing of ccb completed cbk at standby amfd [#305]

     

    Related

    Tickets: #305

  • Praveen

    Praveen - 2014-06-18
    • status: review --> fixed
    • Milestone: 4.5.FC --> 4.3.3
     
