#771 logd crashed when logsv application is running

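Milestone: 4.5.FC
Status: fixed
Owner: elunlen
Labels: None
Type: defect
Component: log
Part: -
Version: 4.4.M0
Severity: major
Updated: 2014-08-26
Created: 2014-02-07
Private: No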

The issue is seen on cs 4871 with patches #688, #711 and #721 applied. After observing it, we ran the same application on cs 4733 and saw the same crash.

The backtrace of the crash is as follows:

Program terminated with signal 11, Segmentation fault.
#0 stream_ccb_apply_modify (opdata=0x65cff0) at lgs_imm.c:1387
1387 lgs_imm.c: No such file or directory.
in lgs_imm.c
(gdb) bt
#0 stream_ccb_apply_modify (opdata=0x65cff0) at lgs_imm.c:1387
#1 0x0000000000411d48 in stream_ccb_apply (opdata=0x65cff0) at lgs_imm.c:1476
#2 0x0000000000411eea in ccbApplyCallback (immOiHandle=<optimized out>, ccbId=<optimized out>) at lgs_imm.c:1520
#3 0x00007effed26532a in imma_process_callback_info (cb=0x7effed4812e0, cl_node=0x653260, callback=0x65c8c0, immHandle=17180000527) at imma_proc.c:2071
#4 0x00007effed267475 in imma_hdl_callbk_dispatch_all (cb=0x7effed4812e0, immHandle=17180000527) at imma_proc.c:1688
#5 0x00007effed25870d in saImmOiDispatch (immOiHandle=17180000527, dispatchFlags=SA_DISPATCH_ALL) at imma_oi_api.c:543
#6 0x0000000000412336 in main (argc=<optimized out>, argv=<optimized out>) at lgs_main.c:497
(gdb) thread apply all bt

Thread 5 (Thread 0x7effedad8b00 (LWP 3727)):
#0 0x00007effec5464f6 in poll () from /lib64/libc.so.6
#1 0x00007effed6b36fd in osaf_ppoll (io_fds=0x7effedad81d0, i_nfds=1, i_timeout_ts=0x7effedad81a0, i_sigmask=<optimized out>) at osaf_poll.c:104
#2 0x00007effed6b38a7 in osaf_poll (io_fds=0x7effedad81d0, i_nfds=1, i_timeout=<optimized out>) at osaf_poll.c:43
#3 0x00007effed6b38f4 in osaf_poll_one_fd (i_fd=11, i_timeout=30000) at osaf_poll.c:136
#4 0x00007effece16ff4 in rda_read_msg (sockfd=-307396144, msg=0x7effedad8250 "", size=30000) at rda_papi.c:662
#5 0x00007effece17970 in rda_callback_task (rda_callback_cb=0x630be0) at rda_papi.c:128
#6 0x00007effecc007b6 in start_thread () from /lib64/libpthread.so.0
#7 0x00007effec54f9cd in clone () from /lib64/libc.so.6
#8 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7effebc3f700 (LWP 3724)):
#0 0x00007effecc0461c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000041d3b8 in file_hndl_thread (noparam=<optimized out>) at lgs_file.c:132
#2 0x00007effecc007b6 in start_thread () from /lib64/libpthread.so.0
#3 0x00007effec54f9cd in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7effedaf8b00 (LWP 3726)):
#0 0x00007effec5464f6 in poll () from /lib64/libc.so.6
#1 0x00007effed6eec6e in mdtm_process_recv_events () at mds_dt_tipc.c:580
#2 0x00007effecc007b6 in start_thread () from /lib64/libpthread.so.0
#3 0x00007effec54f9cd in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7effedb28b00 (LWP 3725)):
#0 0x00007effec5464f6 in poll () from /lib64/libc.so.6
#1 0x00007effed6b362a in osaf_poll_no_timeout (io_fds=0x7effedb28290, i_nfds=1) at osaf_poll.c:31
#2 0x00007effed6b3825 in osaf_ppoll (io_fds=0x7effedb28290, i_nfds=1, i_timeout_ts=0xffffffffffffffff, i_sigmask=0xffffffffffffffff) at osaf_poll.c:78
#3 0x00007effed6b9edf in ncs_tmr_wait () at sysf_tmr.c:411
#4 0x00007effecc007b6 in start_thread () from /lib64/libpthread.so.0
#5 0x00007effec54f9cd in clone () from /lib64/libc.so.6
#6 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7effedafb700 (LWP 3720)):
#0 stream_ccb_apply_modify (opdata=0x65cff0) at lgs_imm.c:1387
#1 0x0000000000411d48 in stream_ccb_apply (opdata=0x65cff0) at lgs_imm.c:1476
#2 0x0000000000411eea in ccbApplyCallback (immOiHandle=<optimized out>, ccbId=<optimized out>) at lgs_imm.c:1520
#3 0x00007effed26532a in imma_process_callback_info (cb=0x7effed4812e0, cl_node=0x653260, callback=0x65c8c0, immHandle=17180000527) at imma_proc.c:2071
#4 0x00007effed267475 in imma_hdl_callbk_dispatch_all (cb=0x7effed4812e0, immHandle=17180000527) at imma_proc.c:1688
#5 0x00007effed25870d in saImmOiDispatch (immOiHandle=17180000527, dispatchFlags=SA_DISPATCH_ALL) at imma_oi_api.c:543
#6 0x0000000000412336 in main (argc=<optimized out>, argv=<optimized out>) at lgs_main.c:497
(gdb) fr 0
#0 stream_ccb_apply_modify (opdata=0x65cff0) at lgs_imm.c:1387
1387 in lgs_imm.c
(gdb) p *opdata
$1 = {next = 0x0, userData = 0x0, userStatus = 0, operationType = CCBUTIL_MODIFY, objectName = {length = 43,
value = "safLgStrCfg=appstream1,safApp=safLogService", '\000' <repeats 212 times>}, ccbId = 175, param = {create = {className = 0x65d130 "+", parentName = 0x65d234,
attrValues = 0x0}, deleteOp = {objectName = 0x65d130}, modify = {objectName = 0x65d130, attrMods = 0x65d234}}}

(gdb) DIR /home/sirisha/6.4/incrementaldropstaging/osaf/services/saf/logsv/lgs/
Source directories searched: /home/sirisha/6.4/incrementaldropstaging/osaf/services/saf/logsv/lgs:$cdir:$cwd
(gdb) list
1382 void *value;
1383 const SaImmAttrValuesT_2 *attribute = &attrMod->modAttr;
1384
1385 TRACE("attribute %s", attribute->attrName);
1386
1387 value = attribute->attrValues[0];
1388
1389 if (!strcmp(attribute->attrName, "saLogStreamFileName")) {
1390 char *fileName = *((char **)value);
1391 n = snprintf(stream->fileName, NAME_MAX, "%s", fileName);
(gdb) p opdata->param->modify->attrMods
$2 = (const SaImmAttrModificationT_2 **) 0x65d234
(gdb) p opdata->param->modify->attrMods[0]
$3 = (const SaImmAttrModificationT_2 *) 0x65d244
(gdb) p *opdata->param->modify->attrMods[0]
$4 = {modType = SA_IMM_ATTR_VALUES_REPLACE, modAttr = {attrName = 0x65d264 "saLogStreamLogFullHaltThreshold", attrValueType = SA_IMM_ATTR_SAUINT32T, attrValuesNumber = 0,
attrValues = 0x0}}
(gdb) p *opdata->param->modify->attrMods[1]
Cannot access memory at address 0x0
(gdb)

/var/log/messages of SC-1:

Feb 7 12:06:54 SLES-64BIT-SLOT1 osafimmnd[3708]: NO Ccb 172 COMMITTED (immcfg_SLES-64BIT-SLOT1_21847)
Feb 7 12:06:54 SLES-64BIT-SLOT1 osafimmnd[3708]: NO Ccb 173 COMMITTED (immcfg_SLES-64BIT-SLOT1_21855)
Feb 7 12:06:54 SLES-64BIT-SLOT1 osafimmnd[3708]: NO Ccb 174 COMMITTED (immcfg_SLES-64BIT-SLOT1_21860)
Feb 7 12:06:54 SLES-64BIT-SLOT1 osafimmnd[3708]: NO Ccb 175 COMMITTED (immcfg_SLES-64BIT-SLOT1_21864)
Feb 7 12:06:54 SLES-64BIT-SLOT1 kernel: [ 3032.247789] osaflogd[3720]: segfault at 0 ip 00000000004117fc sp 00007fff167f1fb0 error 4 in osaflogd[400000+28000]
Feb 7 12:06:54 SLES-64BIT-SLOT1 osafamfnd[3776]: NO 'safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
Feb 7 12:06:54 SLES-64BIT-SLOT1 osafamfnd[3776]: ER safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
Feb 7 12:06:54 SLES-64BIT-SLOT1 osafamfnd[3776]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, SupervisionTime = 60
Feb 7 12:06:54 SLES-64BIT-SLOT1 osafimmnd[3708]: NO Implementer locally disconnected. Marking it as doomed 1 <4, 2010f> (safLogService)
Feb 7 12:06:54 SLES-64BIT-SLOT1 osafimmnd[3708]: NO Implementer disconnected 1 <4, 2010f> (safLogService)
Feb 7 12:06:54 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node; timeout=60
Feb 7 12:06:57 SLES-64BIT-SLOT1 kernel: [ 3034.430264] md: stopping all md devices.
Feb 7 12:06:57 SLES-64BIT-SLOT1 kernel: [ 3035.430796] sd 0:0:0:0: [sda] Synchronizing SCSI cache

Logd trace is attached.

1 Attachments

Related

Tickets: #448
Tickets: #771
Tickets: #814
Wiki: ChangeLog-4.3.3
Wiki: ChangeLog-4.4.1

Discussion

  • Mathi Naickan

    Mathi Naickan - 2014-02-07
    • Milestone: future --> 4.4.RC1
     
  • Mathi Naickan

    Mathi Naickan - 2014-02-07

    Can you provide the output of the command - thread apply all bt full?
    (Not just for this one, for any crash, that helps!)

     
  • Sirisha Alla

    Sirisha Alla - 2014-02-07

    (gdb) thread apply all bt full

    Thread 5 (Thread 0x7effedad8b00 (LWP 3727)):
    #0 0x00007effec5464f6 in poll () from /lib64/libc.so.6
    No symbol table info available.
    #1 0x00007effed6b36fd in osaf_ppoll (io_fds=0x7effedad81d0, i_nfds=1, i_timeout_ts=0x7effedad81a0, i_sigmask=<optimized out>) at osaf_poll.c:104
    current_time = {tv_sec = 0, tv_nsec = 0}
    time_left = 30000
    start_time = {tv_sec = 3021, tv_nsec = 898173680}
    result = 30000
    #2 0x00007effed6b38a7 in osaf_poll (io_fds=0x7effedad81d0, i_nfds=1, i_timeout=<optimized out>) at osaf_poll.c:43
    timeout_ts = {tv_sec = 30, tv_nsec = 0}
    #3 0x00007effed6b38f4 in osaf_poll_one_fd (i_fd=11, i_timeout=30000) at osaf_poll.c:136
    set = {fd = 11, events = 1, revents = 0}
    result = <optimized out>
    #4 0x00007effece16ff4 in rda_read_msg (sockfd=-307396144, msg=0x7effedad8250 "", size=30000) at rda_papi.c:662
    rc = <optimized out>
    #5 0x00007effece17970 in rda_callback_task (rda_callback_cb=0x630be0) at rda_papi.c:128
    msg = '\000' <repeats 63 times>
    rc = 1
    value = -1
    retry_count = 0
    conn_lost = false
    cmd_type = RDE_RDA_UNKNOWN
    cb_info = {cb_type = PCS_RDA_ROLE_CHG_IND, info = {io_role = PCS_RDA_UNDEFINED}}
    #6 0x00007effecc007b6 in start_thread () from /lib64/libpthread.so.0
    No symbol table info available.
    #7 0x00007effec54f9cd in clone () from /lib64/libc.so.6
    No symbol table info available.
    #8 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 4 (Thread 0x7effebc3f700 (LWP 3724)):
    #0 0x00007effecc0461c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    No symbol table info available.
    #1 0x000000000041d3b8 in file_hndl_thread (noparam=<optimized out>) at lgs_file.c:132
    rc = -512
    hndl_rc = 26
    dummy = 0
    FUNCTION = "file_hndl_thread"
    #2 0x00007effecc007b6 in start_thread () from /lib64/libpthread.so.0
    No symbol table info available.
    #3 0x00007effec54f9cd in clone () from /lib64/libc.so.6
    No symbol table info available.
    #4 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 3 (Thread 0x7effedaf8b00 (LWP 3726)):
    #0 0x00007effec5464f6 in poll () from /lib64/libc.so.6
    No symbol table info available.
    #1 0x00007effed6eec6e in mdtm_process_recv_events () at mds_dt_tipc.c:580
    pfd = {{fd = 7, events = 1, revents = 0}, {fd = 8, events = 1, revents = 0}, {fd = 10, events = 1, revents = 0}}
    event = {event = 0, found_lower = 0, found_upper = 0, port = {ref = 0, node = 0}, s = {seq = {type = 0, lower = 0, upper = 0}, timeout = 0, filter = 0,
    usr_handle = "\000\000\000\000\000\000\000"}}
    PRETTY_FUNCTION = "mdtm_process_recv_events"
    #2 0x00007effecc007b6 in start_thread () from /lib64/libpthread.so.0
    No symbol table info available.
    #3 0x00007effec54f9cd in clone () from /lib64/libc.so.6
    No symbol table info available.
    #4 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 2 (Thread 0x7effedb28b00 (LWP 3725)):
    #0 0x00007effec5464f6 in poll () from /lib64/libc.so.6
    No symbol table info available.
    #1 0x00007effed6b362a in osaf_poll_no_timeout (io_fds=0x7effedb28290, i_nfds=1) at osaf_poll.c:31
    result = 16777215
    #2 0x00007effed6b3825 in osaf_ppoll (io_fds=0x7effedb28290, i_nfds=1, i_timeout_ts=0xffffffffffffffff, i_sigmask=0xffffffffffffffff) at osaf_poll.c:78
    start_time = {tv_sec = 139637669659264, tv_nsec = 139637669659280}
    result = <optimized out>
    #3 0x00007effed6b9edf in ncs_tmr_wait () at sysf_tmr.c:411
    rc = <optimized out>
    inds_rmvd = <optimized out>
    next_delay = 0
    ts_current = {tv_sec = 2992, tv_nsec = 725917120}
    ts = {tv_sec = 16777215, tv_nsec = 0}
    set = {fd = 6, events = 1, revents = 0}
    #4 0x00007effecc007b6 in start_thread () from /lib64/libpthread.so.0
    No symbol table info available.
    #5 0x00007effec54f9cd in clone () from /lib64/libc.so.6
    No symbol table info available.
    #6 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 1 (Thread 0x7effedafb700 (LWP 3720)):
    #0 stream_ccb_apply_modify (opdata=0x65cff0) at lgs_imm.c:1387
    value = <optimized out>
    attrMod = 0x65d244
    stream = <optimized out>
    current_logfile_name = "applicationStream1_20140207_120654\000\373\377\177\000\000\365\224/\345\360\222\234\027P\033c\000\000\000\000\000\360\317e\000\000\000\000\000\300\317e\000\000\000\000\000\021\000\000\000\000\000\000\000\\365A\000\000\000\000\000\v\000\000\000\000\000\000\000\262\fB\000\000\000\000\000n4k\355\377~\000\000\060\000\000\000\060\000\000\000\060!\177\026\377\177\000\000p \177\026\377\177\000\000\000\000\000\000\000\000\000\000\060\060\066\065\000\000\000\000\320 \177\026\377\177\000\000\220'\177\026\377\177\000\000o)\177\026\377\177\000\000\200\037B\000\000\000\000\000\257\000\000\000\000\000\000\000 \000\000\000\060\000\000\000\240*\177\026\377\177\000\000\340)\177\026\377\177\000\000O\371n\355\377~\000\000\220'\177\026\377\177\000\000\t6"...
    new_cfg_file_needed = false
    n = <optimized out>
    cur_time = <optimized out>
    FUNCTION = "stream_ccb_apply_modify"
    #1 0x0000000000411d48 in stream_ccb_apply (opdata=0x65cff0) at lgs_imm.c:1476
    FUNCTION = "stream_ccb_apply"
    PRETTY_FUNCTION = "stream_ccb_apply"
    #2 0x0000000000411eea in ccbApplyCallback (immOiHandle=<optimized out>, ccbId=<optimized out>) at lgs_imm.c:1520
    ccbUtilCcbData = 0x65cfc0
    opdata = 0x65cff0
    FUNCTION = "ccbApplyCallback"
    PRETTY_FUNCTION = "ccbApplyCallback"
    #3 0x00007effed26532a in imma_process_callback_info (cb=0x7effed4812e0, cl_node=0x653260, callback=0x65c8c0, immHandle=17180000527) at imma_proc.c:2071
    ccbid = 175
    privateAugOmHandle = 0
    isPbeOp = <optimized out>
    FUNCTION = "imma_process_callback_info"
    #4 0x00007effed267475 in imma_hdl_callbk_dispatch_all (cb=0x7effed4812e0, immHandle=17180000527) at imma_proc.c:1688
    callback = 0x65c8c0
    cl_node = 0x653260
    #5 0x00007effed25870d in saImmOiDispatch (immOiHandle=17180000527, dispatchFlags=SA_DISPATCH_ALL) at imma_oi_api.c:543
    rc = SA_AIS_OK
    cl_node = 0x0
    locked = false
    pend_fin = <optimized out>
    pend_dis = <optimized out>
    FUNCTION = "saImmOiDispatch"
    #6 0x0000000000412336 in main (argc=<optimized out>, argv=<optimized out>) at lgs_main.c:497
    ret = 0
    mbx_fd = <optimized out>
    error = <optimized out>
    rc = 0
    term_fd = 24
    FUNCTION = "main"
    (gdb)

     
  • Mathi Naickan

    Mathi Naickan - 2014-02-07
    • Milestone: 4.4.RC1 --> 4.4.RC2
     
  • Mathi Naickan

    Mathi Naickan - 2014-02-07

    Moved the milestone in preparation for RC1.

     
  • elunlen

    elunlen - 2014-02-10
    • status: unassigned --> accepted
    • assigned_to: elunlen
     
  • elunlen

    elunlen - 2014-02-10

    If the value of saLogStreamFileName is NULL instead of pointing to a string, it will still pass the attribute validity check used in the completed modify callback. This causes a segmentation fault when the apply callback tries to set the nonexistent new name on the stream.
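
    As an illustration only, a check along these lines could reject such a value before the apply step dereferences it. This is a minimal sketch, not the committed fix: attr_values_t is a hypothetical stand-in that mirrors the attrName, attrValuesNumber and attrValues fields visible in the gdb output above (the real type is SaImmAttrValuesT_2 from the IMM headers), and string_attr_is_valid is a made-up helper name.

    ```c
    #include <stddef.h>

    /* Hypothetical stand-in for the fields of SaImmAttrValuesT_2 shown in
     * the gdb output; the real definition lives in the IMM headers. */
    typedef struct {
        const char *attrName;
        unsigned int attrValuesNumber;
        void **attrValues;
    } attr_values_t;

    /* Hypothetical helper: returns 1 if the attribute carries a non-NULL
     * string value that is safe to dereference, 0 otherwise. */
    int string_attr_is_valid(const attr_values_t *attribute)
    {
        if (attribute == NULL)
            return 0;
        /* The crash case: attrValuesNumber = 0 and attrValues = 0x0. */
        if (attribute->attrValuesNumber == 0 || attribute->attrValues == NULL)
            return 0;
        if (attribute->attrValues[0] == NULL)
            return 0;
        /* For string attributes the value slot points at a char pointer,
         * matching the *((char **)value) cast at lgs_imm.c:1390. */
        const char *s = *(char **)attribute->attrValues[0];
        return s != NULL;
    }
    ```

    With a check like this in the validity step, a modification with no values would be refused there, so the apply step would never reach attribute->attrValues[0] with a NULL attrValues.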

     

    Last edit: elunlen 2014-02-10
  • elunlen

    elunlen - 2014-02-19
    • Milestone: 4.4.RC2 --> 4.5.FC
     
  • elunlen

    elunlen - 2014-03-17
    • status: accepted --> review
     
  • elunlen

    elunlen - 2014-03-18
    • status: review --> fixed
     
  • elunlen

    elunlen - 2014-03-18

    changeset: 5073:589b72d69877
    tag: tip
    parent: 5070:fc02663112d8
    user: Lennart Lund lennart.lund@ericsson.com
    date: Tue Mar 18 17:07:48 2014 +0100
    summary: logsv: Do not allow NULL pointers for string variables in OI validity check [#771]

    ver: 589b72d6987798cb52a615b98c8044a7157d3ce8

    changeset: 5072:ce355da0165e
    branch: opensaf-4.4.x
    parent: 5069:c03bf3b7eb6b
    user: Lennart Lund lennart.lund@ericsson.com
    date: Tue Mar 18 17:07:48 2014 +0100
    summary: logsv: Do not allow NULL pointers for string variables in OI validity check [#771]

    ver: ce355da0165e5cb93d8c58821aed3b2182580a7f

    changeset: 5071:7b8ee8852e23
    branch: opensaf-4.3.x
    parent: 5068:789d348c0819
    user: Lennart Lund lennart.lund@ericsson.com
    date: Tue Mar 18 17:03:21 2014 +0100
    summary: logsv: Do not allow NULL pointers for string variables in OI validity check [#771]

    ver: 7b8ee8852e23c3aa33b1a910f3c6735f4cd8ee82

     

    Related

    Tickets: #771

  • elunlen

    elunlen - 2014-04-22
     
  • elunlen

    elunlen - 2014-04-22

    changeset: 5073:589b72d69877
    parent: 5070:fc02663112d8
    user: Lennart Lund lennart.lund@ericsson.com
    date: Tue Mar 18 17:07:48 2014 +0100
    summary: logsv: Do not allow NULL pointers for string variables in OI validity check [#771]

    rev: 589b72d6987798cb52a615b98c8044a7157d3ce8

    changeset: 5072:ce355da0165e
    branch: opensaf-4.4.x
    parent: 5069:c03bf3b7eb6b
    user: Lennart Lund lennart.lund@ericsson.com
    date: Tue Mar 18 17:07:48 2014 +0100
    summary: logsv: Do not allow NULL pointers for string variables in OI validity check [#771]

    rev: ce355da0165e5cb93d8c58821aed3b2182580a7f

    changeset: 5071:7b8ee8852e23
    branch: opensaf-4.3.x
    parent: 5068:789d348c0819
    user: Lennart Lund lennart.lund@ericsson.com
    date: Tue Mar 18 17:03:21 2014 +0100
    summary: logsv: Do not allow NULL pointers for string variables in OI validity check [#771]

    rev: 7b8ee8852e23c3aa33b1a910f3c6735f4cd8ee82

     

    Related

    Tickets: #771

  • elunlen

    elunlen - 2014-04-22

    Reopened.

    The attribute validity check used when creating and modifying log stream IMM objects still has problems. If attributes are set incorrectly when creating or modifying IMM objects for streams, log may in some cases crash instead of returning an appropriate error message. The current fix for #771 handles one of these problems, but there is more to fix.

     
  • elunlen

    elunlen - 2014-04-22
    • status: fixed --> accepted
     
  • elunlen

    elunlen - 2014-05-02

    This ticket is related to [#448] that describes several problems in the parameter validity check.

     

    Related

    Tickets: #448

  • elunlen

    elunlen - 2014-05-08

    There is no new fix for this one. The reason for reopening was not valid. A new ticket, [#891], has been written instead.

     

    Related

    Tickets: #891

  • elunlen

    elunlen - 2014-05-08
    • status: accepted --> fixed
     
