#1172 system:performance statistic between OpenSaf 4.5 VS OpenSaf 4.4 on Physical network

5.0.FC
unassigned
nobody
None
discussion
unknown
-
major
2015-11-01
2014-10-16
No

While collecting performance statistics for OpenSAF 4.5 versus OpenSAF 4.4,
we observed the following:

Issue: It was reported that the Standby Controller/Payload joining time is 7 seconds in OpenSAF 4.5 GA, where it used to take only 2 seconds in OpenSAF 4.4, with no load (default 4-node imm.xml with only 2 controllers and 2 payloads) on physical networking.

OpenSAF(4.4) No Load (Default 4 node imm.xml)

Oct 10 11:48:35 SLES-SLOT2 opensafd: Starting OpenSAF Services
Oct 10 11:48:37 SLES-SLOT2 opensafd: OpenSAF(4.4.0 - ) services successfully started <==== 2 sec
====================================================================================

OpenSAF(4.5) No Load (Default 4 node imm.xml)

Oct 10 11:45:18 SLES-SLOT2 opensafd: Starting OpenSAF Services
Oct 10 11:45:28 SLES-SLOT2 opensafd: OpenSAF(4.5) services successfully started <==== 10 sec
====================================================================================
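
The durations annotated above can be re-derived from the syslog timestamps. Below is a minimal sketch of that calculation; the helper name startup_seconds is made up for illustration, and since syslog lines carry no year, a placeholder year is assumed (both lines of a pair must fall on the same day of the same year).

```python
from datetime import datetime

def startup_seconds(start_line: str, done_line: str) -> int:
    """Return elapsed whole seconds between two syslog lines."""
    def stamp(line: str) -> datetime:
        # The syslog prefix "Mon DD HH:MM:SS" is 15 characters and has no
        # year, so a placeholder year is prepended before parsing.
        return datetime.strptime("2000 " + line[:15], "%Y %b %d %H:%M:%S")
    return int((stamp(done_line) - stamp(start_line)).total_seconds())

# Lines copied from the two excerpts above (trailing annotations dropped).
v44 = startup_seconds(
    "Oct 10 11:48:35 SLES-SLOT2 opensafd: Starting OpenSAF Services",
    "Oct 10 11:48:37 SLES-SLOT2 opensafd: OpenSAF(4.4.0 - ) services successfully started",
)
v45 = startup_seconds(
    "Oct 10 11:45:18 SLES-SLOT2 opensafd: Starting OpenSAF Services",
    "Oct 10 11:45:28 SLES-SLOT2 opensafd: OpenSAF(4.5) services successfully started",
)
print("4.4:", v44, "sec; 4.5:", v45, "sec")
```

Syslog timestamps have one-second resolution, so each figure is only accurate to about a second.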

When we debugged, we found that on physical networking (not VMs) the number of pending incoming
fevs messages at IMMND exceeds 16 multiple times in quick succession as soon as sync starts on the coordinator IMMND, so IMMND sends ERR_TRY_AGAIN to the sync process, whose retry time is 1.7 sec, which is equivalent to the TIPC tolerance time.
==============================================================================================
Oct 10 11:45:19 SLES-SLOT1 osafimmnd[5333]: ER ERR_TRY_AGAIN: Too many pending incoming fevs messages (> 16) rejecting sync iteration next request
Oct 10 11:45:21 SLES-SLOT1 osafimmnd[5333]: ER ERR_TRY_AGAIN: Too many pending incoming fevs messages (> 16) rejecting sync iteration next request
Oct 10 11:45:23 SLES-SLOT1 osafimmnd[5333]: ER ERR_TRY_AGAIN: Too many pending incoming fevs messages (> 16) rejecting sync iteration next request
Oct 10 11:45:25 SLES-SLOT1 osafimmnd[5333]: ER ERR_TRY_AGAIN: Too many pending incoming fevs messages (> 16) rejecting sync iteration next request
==============================================================================================
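
To check how often the sync was rejected and how far apart the rejections are, the lines above can be counted with a small script. This is only a sketch: the function name rejection_gaps is made up for illustration, and a placeholder year is assumed because syslog omits it.

```python
from datetime import datetime

REJECT_MARKER = "ERR_TRY_AGAIN: Too many pending incoming fevs messages"

def rejection_gaps(lines):
    """Return (number of rejections, gaps in seconds between them)."""
    stamps = [
        datetime.strptime("2000 " + line[:15], "%Y %b %d %H:%M:%S")
        for line in lines
        if REJECT_MARKER in line
    ]
    gaps = [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
    return len(stamps), gaps

# The four osafimmnd lines from the excerpt above; only the time differs.
suffix = (" SLES-SLOT1 osafimmnd[5333]: ER ERR_TRY_AGAIN: Too many pending "
          "incoming fevs messages (> 16) rejecting sync iteration next request")
log = ["Oct 10 11:45:%02d%s" % (sec, suffix) for sec in (19, 21, 23, 25)]

count, gaps = rejection_gaps(log)
print(count, "rejections, gaps (sec):", gaps)
```

The gaps come out as whole seconds because of the syslog resolution, so a retry interval of about 1.7 sec would be expected to show up as roughly 2 sec here.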

So each time IMMND sends ERR_TRY_AGAIN, the sync process waits for its retry time of 1.7 sec (equivalent to the TIPC tolerance time) before trying again.

That is why the OpenSAF(4.5) Standby joining time, even with no load, is 10 sec, which is 5 times higher than
OpenSAF(4.4).
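
The mechanism described above can be modelled roughly as follows. This is NOT the IMMND implementation: the class FevsGate, its fields, and the burst size of 20 are hypothetical and chosen only to illustrate how a pending-message limit turns a burst into retries, and how each retry adds a fixed wait.

```python
from dataclasses import dataclass

@dataclass
class FevsGate:
    """Hypothetical model of a receiver that throttles sync on a backlog."""
    max_pending: int   # plays the role of IMMSV_DEFAULT_FEVS_MAX_PENDING
    pending: int = 0   # incoming fevs messages not yet processed

    def request_sync_iteration(self) -> str:
        # Mirrors the logged condition "(> 16)": reject only when the
        # backlog is strictly greater than the limit.
        if self.pending > self.max_pending:
            return "ERR_TRY_AGAIN"
        return "OK"

def added_delay(rejections: int, retry_interval_s: float) -> float:
    """Waiting time if every rejection costs one full retry interval."""
    return rejections * retry_interval_s

burst = 20  # assumed backlog size, for illustration only
with_16 = FevsGate(max_pending=16, pending=burst).request_sync_iteration()
with_64 = FevsGate(max_pending=64, pending=burst).request_sync_iteration()
wait = added_delay(rejections=4, retry_interval_s=1.7)
print(with_16, with_64, round(wait, 1))
```

With the 4 rejections seen in the log and a 1.7 sec retry time, this simple model gives about 6.8 sec of waiting, which is of the same order as the 8 sec difference between the 4.4 and 4.5 startup times.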

If I tune IMMSV_DEFAULT_FEVS_MAX_PENDING to 64, the OpenSAF(4.5) Standby joins within 1 second, which is 50% less than OpenSAF(4.4). With the increased IMMSV_DEFAULT_FEVS_MAX_PENDING I also observed up to 917% performance improvement compared to OpenSAF 4.4 with object load (please see the attached "OpenSAFStartup 4.4 vs 4.5 Time Second.pdf" for the complete OpenSAF 4.5 GA vs 4.4 statistics).

OpenSAF(4.5.0 - ) No Load (Default 4 node imm.xml) with #define IMMSV_DEFAULT_FEVS_MAX_PENDING 64

Oct 10 11:37:26 SLES-SLOT2 opensafd: Starting OpenSAF Services
Oct 10 11:37:27 SLES-SLOT2 opensafd: OpenSAF(4.5.0 - ) services successfully started <============= 1 sec
====================================================================================
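
For reference, the no-load comparisons quoted in this ticket follow from the three measured startup times. Here is a small sketch of the arithmetic; the function name relative_change is made up for illustration, and the 917% figure is not reproduced because its input data is only in the attached PDF.

```python
def relative_change(baseline_s: float, measured_s: float) -> float:
    """Percent change of measured time against baseline (negative = faster)."""
    return (measured_s - baseline_s) / baseline_s * 100.0

# Startup times in seconds, taken from the syslog excerpts in this ticket.
t44, t45_default, t45_tuned = 2, 10, 1

slowdown = t45_default / t44                     # default 4.5 vs 4.4
tuned_change = relative_change(t44, t45_tuned)   # tuned 4.5 vs 4.4
print(slowdown, tuned_change)
```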

Anyhow, IMMSV_DEFAULT_FEVS_MAX_PENDING will be tuned as part of Ticket #952 (immsv: sync optimization). Meanwhile, how can we address this?

1 Attachments
