
#2207 cpnd: osafckptnd core dump while handling error section_hdr_update_fails

5.2.FC
fixed
None
defect
ckpt
nd
major
2017-02-07
2016-11-24
No

The steps to add a section are add_db_tree -> update_sec_hdr -> update_ckpt_hdr, so if an error occurs, cpsv should undo the completed steps in reverse order.
Currently, when section_hdr_update fails, cpsv also reverts ckpt_hdr, which possibly causes the problem.
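
The reverse-order handling described above can be sketched in C as follows. This is only an illustration: `struct ckpt_state`, the `undo_*` helpers, and the `fail_*` knobs are hypothetical stand-ins, not the real cpnd data structures or functions. The point is that a failure in update_sec_hdr unwinds add_db_tree only and never touches the checkpoint header, because that step has not run yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the checkpoint node state (not CPND_CKPT_NODE). */
struct ckpt_state {
    int  sections_in_db;   /* incremented by the add_db_tree step */
    bool sec_hdr_written;  /* set by the update_sec_hdr step */
    bool ckpt_hdr_written; /* set by the update_ckpt_hdr step */
    bool fail_sec_hdr;     /* test knob: force update_sec_hdr to fail */
    bool fail_ckpt_hdr;    /* test knob: force update_ckpt_hdr to fail */
};

static int add_db_tree(struct ckpt_state *s)
{
    s->sections_in_db++;
    return 0;
}

static void undo_db_tree(struct ckpt_state *s)
{
    s->sections_in_db--;
}

static int update_sec_hdr(struct ckpt_state *s)
{
    if (s->fail_sec_hdr)
        return -1;
    s->sec_hdr_written = true;
    return 0;
}

static void undo_sec_hdr(struct ckpt_state *s)
{
    s->sec_hdr_written = false;
}

static int update_ckpt_hdr(struct ckpt_state *s)
{
    if (s->fail_ckpt_hdr)
        return -1;
    s->ckpt_hdr_written = true;
    return 0;
}

/* Run the three steps; on failure undo only the steps that completed,
 * in reverse order. Returns 0 on success, -1 on failure. */
int section_add(struct ckpt_state *s)
{
    assert(s != NULL);

    if (add_db_tree(s) != 0)
        return -1;              /* nothing to undo yet */
    if (update_sec_hdr(s) != 0)
        goto undo_db;           /* ckpt header was never updated: leave it alone */
    if (update_ckpt_hdr(s) != 0)
        goto undo_sec;

    return 0;

undo_sec:
    undo_sec_hdr(s);
undo_db:
    undo_db_tree(s);
    return -1;
}
```

With `fail_sec_hdr` set, `section_add()` returns -1 and leaves the state exactly as it was before the call.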

Coredump:

Core was generated by `/usr/lib64/opensaf/osafckptnd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f1846da1109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install opensaf-ckpt-nodedirector-debuginfo-5.1.0-99.0.686bc00.sle12.x86_64
(gdb) bt
#0  0x00007f1846da1109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
#1  0x00007f1847f26f25 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>)
    at /usr/include/x86_64-linux-gnu/bits/string3.h:53
#2  ncs_os_posix_shm (req=req@entry=0x7ffd24f978f0) at os_defs.c:859
#3  0x000000000041561d in cpnd_ckpt_hdr_update (cp_node=cp_node@entry=0xa708e0) at cpnd_proc.c:1833
#4  0x000000000041b016 in cpnd_ckpt_sec_del (cp_node=cp_node@entry=0xa708e0, id=id@entry=0x7f1840007350)
    at cpnd_sec.cc:220
#5  0x0000000000406190 in cpnd_ckpt_sec_add (cp_node=cp_node@entry=0xa708e0, id=0x7f1840007350,
    exp_time=1479906629297747000, gen_flag=gen_flag@entry=0) at cpnd_db.c:473
#6  0x000000000040cfe2 in cpnd_evt_proc_ckpt_sect_create (cb=cb@entry=0x8c77f0,
    evt=evt@entry=0x7f184000a560, sinfo=sinfo@entry=0x7f184000abb8) at cpnd_evt.c:2244
#7  0x000000000040e7cc in cpnd_process_evt (evt=0x7f184000a550) at cpnd_evt.c:227
#8  0x00000000004103bd in cpnd_main_process (cb=cb@entry=0x8c77f0) at cpnd_init.c:579
#9  0x0000000000405383 in main (argc=<optimized out>, argv=<optimized out>) at cpnd_main.c:79

consequence of #2202

1 Attachments

Related

Tickets: #2207
Wiki: ChangeLog-5.1.1

Discussion

  • Vo Minh Hoang

    Vo Minh Hoang - 2016-11-24
    • status: accepted --> review
     
  • A V Mahesh (AVM)

    We were able to simulate one case where a similar core dump occurs.

    There may be other cases in which this core dump can occur; as soon as we find the root cause,

    we will provide a patch for those as well.

    In this test case, if OSAF_CKPT_SHM_ALLOC_GUARANTEE is NOT set and SHM is 100% used in the system,
    cpnd hits a segmentation fault (core dumped) at the LEAP memcpy().
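
As a general illustration of why a full /dev/shm can surface as a fault inside memcpy() rather than as an error code: on tmpfs, sizing a file does not reserve pages, so the failure only shows up when the mapping is written. The sketch below reserves the backing store with posix_fallocate() first, so a lack of space comes back as a return value before any copy is attempted. The function name `shm_write_guarded` is hypothetical, and it is an assumption that this resembles what OSAF_CKPT_SHM_ALLOC_GUARANTEE enables inside cpnd; it is not taken from the OpenSAF source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Reserve `size` bytes of backing store for fd, then copy `len` bytes from
 * src to offset 0 through a shared mapping.
 * Returns 0 on success, or an errno value (for example ENOSPC when the
 * file system is full) without ever touching unbacked pages. */
int shm_write_guarded(int fd, const void *src, size_t len, off_t size)
{
    assert(src != NULL);

    if (size <= 0 || len > (size_t)size)
        return EINVAL;

    /* posix_fallocate() returns the error number directly; it does not set errno. */
    int rc = posix_fallocate(fd, 0, size);
    if (rc != 0)
        return rc;

    void *dst = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dst == MAP_FAILED)
        return errno;

    memcpy(dst, src, len);      /* safe: the pages were reserved above */

    if (munmap(dst, (size_t)size) != 0)
        return errno;
    return 0;
}
```

The descriptor can come from shm_open() or from open() on any regular file; a caller would map a non-zero return onto whatever error it reports upward.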

    The following are the detailed steps we used to reproduce it;
    this test generates the same core dump as the one reported in this ticket.

    Test application : cpsv_shm_2202.c

    ==================================================================

    1) /etc/init.d/opensafd stop

    2) Change the default /dev/shm size to 3 MB

      # vi /etc/fstab
    
     and add the following line:
    
     ` tmpfs               /dev/shm             tmpfs defaults,size=3m   0 0`
    

    3) Remount /dev/shm

    #mount -o remount /dev/shm
    

    4) Check /dev/shm reflected with new value

    # df -k /dev/shm/
    
    Filesystem     1K-blocks  Used Available Use% Mounted on
    tmpfs               3072     0      3072   0% /dev/shm
    

    5) Set the core file size limit to unlimited

    #ulimit -c  unlimited
    

    6) #/etc/init.d/opensafd start

    7) Compile & run attached test application ( cpsv_shm_2202.c )

    #gcc cpsv_shm_2202.c -o ckpt_shm -lSaCkpt
    
    # ./ckpt_shm
    

    8) Once /dev/shm/ reaches 100% use, you will see the same core dump as yours

    # df -k /dev/shm/
    

    9) Then we applied the patch and ran the test again, with no core dump:

    saCkptSectionCreate 1 returned 18. ( no core dump )
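
The `df -k /dev/shm/` checks in the steps above can also be done from inside a test program. The sketch below uses statvfs() and derives a use percentage roughly the way df does (used / (used + available), rounded up). The helper name `fs_use_percent` is hypothetical and is not part of cpsv_shm_2202.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/statvfs.h>

/* Return the used percentage (0..100) of the file system containing `path`,
 * or -1 if statvfs() fails. */
int fs_use_percent(const char *path)
{
    struct statvfs vfs;

    assert(path != NULL);

    if (statvfs(path, &vfs) != 0)
        return -1;

    unsigned long long used  = (unsigned long long)vfs.f_blocks - vfs.f_bfree;
    unsigned long long total = used + vfs.f_bavail;  /* space visible to non-root users */

    if (total == 0)
        return 0;               /* empty or pseudo file system */

    return (int)((used * 100 + total - 1) / total);  /* round up */
}
```

A test loop could call `fs_use_percent("/dev/shm")` after each section create and stop once it returns 100.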

     
  • A V Mahesh (AVM)

    • Attachments has changed:

    Diff:

    --- old
    +++ new
    @@ -0,0 +1 @@
    +cpsv_shm_2202.c (12.7 kB; application/octet-stream)
    
     
  • A V Mahesh (AVM)

    changeset: 8396:7c92427bfd93
    branch: opensaf-5.1.x
    user: Hoang Vo hoang.m.vo@dektech.com.au
    date: Thu Dec 01 15:14:41 2016 +0530
    summary: cpnd: fix error handling while section_hdr_update_fail [#2207]

    changeset: 8397:21094b948d29
    tag: tip
    parent: 8392:119ad64e95b0
    user: Hoang Vo hoang.m.vo@dektech.com.au
    date: Thu Dec 01 15:18:01 2016 +0530
    summary: cpnd: fix error handling while section_hdr_update_fail [#2207]

     


  • A V Mahesh (AVM)

    • status: review --> fixed
     
