
#2395 CKPT: Performance degradation ~100% (Time taken is almost double than previous)

5.2.0
fixed
None
defect
ckpt
doc
major
2017-04-08
2017-03-23
No

Environment details

OS : Suse 11, 64bit Physical machine
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes

CKPT performance has degraded considerably in 5.2 compared to 5.1. Each time is measured by taking a timestamp immediately before and immediately after the API call and computing the difference.

-> For write operations, the checkpoint write API takes 2x the time it took in release 5.1. The issue is observed in both synchronous and asynchronous mode, for both local and remote replicas.
( synchronous -- checkpoint create flag used: SA_CKPT_WR_ALL_REPLICAS
asynchronous -- checkpoint create flags used: SA_CKPT_WR_ACTIVE_REPLICA | SA_CKPT_CHECKPOINT_COLLOCATED )

-> For section create operations in asynchronous mode on a local replica, the checkpoint section create API takes over 70% longer than in 5.1.

-> For read operations in asynchronous mode on a local replica, the checkpoint read API takes twice as long as in 5.1.

Please check which of the tickets pushed between 4.7 and 5.0 affected API performance.
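The measurement method described above (timestamp before the call, timestamp after, take the difference) can be sketched roughly as follows. This is only an illustration, not the test code used for this ticket: `ckpt_op_fn`, `time_op_ns`, `degradation_pct`, and `sample_op` are hypothetical names, and the real checkpoint API call would take the place of the placeholder operation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback type standing in for the checkpoint API call under test. */
typedef void (*ckpt_op_fn)(void *ctx);

/* Returns the monotonic clock reading in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Takes a timestamp immediately before and after one call and returns the difference. */
static int64_t time_op_ns(ckpt_op_fn op, void *ctx)
{
    int64_t start = now_ns();
    op(ctx);
    return now_ns() - start;
}

/* Percent increase of `current` over `baseline`; twice the time gives 100.0. */
static double degradation_pct(double baseline, double current)
{
    assert(baseline > 0.0);
    return (current - baseline) / baseline * 100.0;
}

/* Placeholder operation used only to exercise the harness; it counts calls. */
static void sample_op(void *ctx)
{
    int *calls = ctx;
    (*calls)++;
}
```

With this definition, an API call that takes twice as long as in 5.1 corresponds to a degradation of 100%, and one that takes 1.7 times as long corresponds to 70%. A monotonic clock is used here so that wall-clock adjustments do not distort the interval; averaging over many calls would reduce noise further.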

Related

Tickets: #2395

Discussion

  • Anders Widell

    Anders Widell - 2017-03-27

    Diff of changes in CKPT between OpenSAF 5.1.0 and the latest changeset on the default branch.

     
  • A V Mahesh (AVM)

    A performance degradation in cpsv is expected because of #2202. We agreed to accept this degradation in the default configuration in order to address #2202. If the user wants the original performance, setting OSAF_CKPT_SHM_ALLOC_GUARANTEE to true restores it.

    ==============================================================================

    [devel] [PATCH 1 of 3] leap : now leap library ensure shm availability before writing [#2202]

    On 11/29/2016 4:07 PM, mahesh.valla@oracle.com wrote:

    Issue :

    If OSAF_CKPT_SHM_ALLOC_GUARANTEE is NOT set and SHM is 100% used in system ,
    pnd Segmentation fault (core dumped) at LEAP memcpy().

    Fix :

    Now LEAP library ensures shm free space before writing
    This may degrade some performance of cpsv , if OSAF_CKPT_SHM_ALLOC_GUARANTEE is set,
    cpsv give natural performance.

    ==============================================================================
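To illustrate why a free-space check before each write can cost time, here is a minimal sketch of such a check using the POSIX `statvfs()` call. It is an assumption about the general technique only and does not show how LEAP actually implements the check; `fits_in_free_space` and `shm_has_room` are hypothetical names, and "/dev/shm" is just a typical location for shared memory on Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Pure helper: true when `needed` bytes fit into `avail_blocks` blocks of
 * `block_size` bytes each. Kept separate so it can be checked without a
 * real filesystem. */
static bool fits_in_free_space(uint64_t avail_blocks, uint64_t block_size,
                               uint64_t needed)
{
    if (block_size == 0)
        return needed == 0;
    /* Round up to whole blocks; dividing avoids overflow from multiplying. */
    uint64_t needed_blocks = needed / block_size + (needed % block_size != 0);
    return needed_blocks <= avail_blocks;
}

/* Returns true if the filesystem at `path` (for example "/dev/shm") reports
 * at least `needed` free bytes. Returns false if it does not, or if
 * statvfs() fails. */
static bool shm_has_room(const char *path, uint64_t needed)
{
    struct statvfs vfs;
    assert(path != NULL);
    if (statvfs(path, &vfs) != 0)
        return false;
    /* f_bavail is counted in units of f_frsize. */
    return fits_in_free_space(vfs.f_bavail, vfs.f_frsize, needed);
}
```

A check of this kind adds at least one system call per write, which is one plausible source of the overhead mentioned in the patch description. Note also that a check followed by a write is inherently racy: another process may consume the space between the two steps.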

     

    Related

    Tickets: #2202

  • A V Mahesh (AVM)

    • status: unassigned --> assigned
    • assigned_to: A V Mahesh (AVM)
     
  • Anders Widell

    Anders Widell - 2017-04-03
    • Milestone: 5.2.RC2 --> 5.2.0
     
  • Anders Widell

    Anders Widell - 2017-04-03
    • Milestone: 5.2.0 --> future
     
  • Anders Widell

    Anders Widell - 2017-04-03
    • Milestone: future --> 5.2.0
     
  • Chani Srivastava

    After exporting OSAF_CKPT_SHM_ALLOC_GUARANTEE=1, performance of the checkpoint read operation improves and is comparable with 5.1.

    But the write and section create operations still show degraded performance compared to the 5.1 results.

     

    Last edit: Chani Srivastava 2017-04-05
  • Chani Srivastava

    • summary: CKPT: Performance degradation upto 200% --> CKPT: Performance degradation ~100% (Time taken is almost double than previous)
     
  • Anders Widell

    Anders Widell - 2017-04-05

    Could you measure the performance of CKPT after applying the attached patch? Also, make sure to set OSAF_CKPT_SHM_ALLOC_GUARANTEE=2.

     
    • Chani Srivastava

      With the patch shared by Anders, the performance figures show a great improvement over the earlier ones, and the results are comparable to the 5.1 results.

       

      Last edit: Chani Srivastava 2017-04-06
  • A V Mahesh (AVM)

    I verified that the performance degradation is not caused by feature [#2202] https://sourceforge.net/p/opensaf/tickets/2202/.
    Comparing the statistics with the [#2202] feature enabled and disabled, the difference in degradation is negligible.

    Since [#2202] is NOT the major root cause of the performance degradation, we don't require any immediate changes on top of [#2202] for now.

    Irrespective of the [#2202] feature, we still observe a 70% to 100% performance degradation.
    This could be caused by other changes, such as "cpnd: use shared memory based on ckpt name length [#2108]",
    where the SHM changes relate to long DN support. I am currently isolating the change that causes
    the performance degradation and will update as soon as possible.

    -AVM

     

    Related

    Tickets: #2202

  • A V Mahesh (AVM)

    The reported statistics for OSAF_CKPT_SHM_ALLOC_GUARANTEE=1 were not valid:
    they were accidentally taken with IMM traces enabled. We are seeing NORMAL performance without any change to the 5.2.RC2 code.

    If OSAF_CKPT_SHM_ALLOC_GUARANTEE is set to true (export OSAF_CKPT_SHM_ALLOC_GUARANTEE=1, pre-allocated),
    we see NORMAL performance without any change to the 5.2.RC2 code.

    If OSAF_CKPT_SHM_ALLOC_GUARANTEE is set to false (export OSAF_CKPT_SHM_ALLOC_GUARANTEE=0, the default),
    we see ~70% performance degradation, as expected, without any change to the 5.2.RC2 code.

    Do we still need OSAF_CKPT_SHM_ALLOC_GUARANTEE=2 (neither pre-allocate nor check whether memory is available) as the default?

    I am not in favor of it, because of the osafckptnd core dump under high memory load reported in [#2202].

    In any case, I am creating and pushing a README.SHM that explains the above configuration options as part of this ticket.
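The three values of OSAF_CKPT_SHM_ALLOC_GUARANTEE discussed above can be summarized with a small sketch of how a process might map the variable to a mode. This is not the OpenSAF implementation: the enum and function names are hypothetical, and treating unset or unrecognized values as the default is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; the values 0, 1 and 2 are the ones discussed in this ticket. */
enum shm_alloc_mode {
    SHM_CHECK_BEFORE_WRITE = 0, /* default: check that shm is available before writing */
    SHM_PREALLOCATE = 1,        /* pre-allocate the shared memory */
    SHM_NO_GUARANTEE = 2        /* neither pre-allocate nor check (old 5.1 behavior) */
};

/* Maps the text value of the variable to a mode. Unset or unrecognized
 * text falls back to the default (an assumption of this sketch). */
static enum shm_alloc_mode parse_shm_alloc_mode(const char *value)
{
    if (value == NULL)
        return SHM_CHECK_BEFORE_WRITE;
    if (strcmp(value, "1") == 0)
        return SHM_PREALLOCATE;
    if (strcmp(value, "2") == 0)
        return SHM_NO_GUARANTEE;
    return SHM_CHECK_BEFORE_WRITE;
}

/* Reads the mode from the process environment. */
static enum shm_alloc_mode shm_alloc_mode_from_env(void)
{
    return parse_shm_alloc_mode(getenv("OSAF_CKPT_SHM_ALLOC_GUARANTEE"));
}

/* Short description of each mode, for example for logging. */
static const char *shm_alloc_mode_name(enum shm_alloc_mode mode)
{
    switch (mode) {
    case SHM_CHECK_BEFORE_WRITE: return "check before write";
    case SHM_PREALLOCATE:        return "pre-allocate";
    case SHM_NO_GUARANTEE:       return "no guarantee";
    }
    assert(!"unknown shm_alloc_mode");
    return "unknown";
}
```

In terms of the trade-off described in this thread: mode 0 is safe under memory pressure but slower, mode 1 gives normal performance with pre-allocated memory, and mode 2 matches the older behavior at the risk of the crash reported in [#2202].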

     

    Related

    Tickets: #2202

  • A V Mahesh (AVM)

    • Part: - --> doc
     
  • Anders Widell

    Anders Widell - 2017-04-06
    • status: assigned --> review
     
  • A V Mahesh (AVM)

    On 4/6/2017 5:29 PM, Chani Srivastava wrote:

    With this patch the performance figures shows great improvement then before and the results are comparable to 5.1 results

    Thanks for the testing.

    This patch provides a rollback option: CKPT can be configured to behave as it did in 5.1,
    so the statistics will match 5.1.

     
  • Anders Widell

    Anders Widell - 2017-04-08
    • status: review --> fixed
     
  • Anders Widell

    Anders Widell - 2017-04-08

    changeset: 8753:44e1b6913d35
    user: Anders Widell an..@..com
    date: Sat Apr 08 11:39:46 2017 +0200
    summary: ckpt: Add option OSAF_CKPT_SHM_ALLOC_GUARANTEE=2 for backwards compatibility [#2395]

    [staging:44e1b6]

     

    Related

    Commit: [44e1b6]
    Tickets: #2395

