Migrated from http://devel.opensaf.org/ticket/2601
We have seen a system crash caused by a deadlock in which amfnd is trying to read from IMM while immnd is trying to register with AMF.
http://devel.opensaf.org/ticket/1713 exists to improve things on the IMM side. This ticket should address and reduce the risk on the amfnd side.
With the current design there will always be a risk that an immnd crash leads to a system crash. But in the normal case the deadlock should be avoided by design.
The core dump from the crash below shows that amfnd is trying to read component related info in the context of an API response. This information (component capability) can instead be read when the component is initialized.
I also realized there is a slight, unintentional change in the protocol between amfd and amfnd that probably reduces the risk: amfd immediately sends the instantiate request without waiting for the REG_SU response.
Changed 14 months ago by hafe
■owner changed from ravisekhar to hafe
■status changed from new to accepted
Changed 14 months ago by hafe
The protocol change mentioned is between 3.0 and 4.0. In 4.0 the REG_COMP message is not used at all and is dead code. It should be removed in both amfd and amfnd.
When REG_COMP is not used, code is triggered that instantiates SUs before they are even registered properly! See the bottom of avd_node_up_evh(): since comp_sent is always false, avd_nd_reg_comp_evt_hdl() is called at that point. Instead, the response to REG_SU should be awaited, and only then should the SUs be instantiated. The error handling when REG_SU fails is also interesting. Suppose immnd crashes so that amfnd cannot read from IMM during the REG_SU handling: should amfnd crash, or respond to amfd with an error code? And what should amfd do with the failed REG_SU response?
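To illustrate the ordering proposed above, here is a minimal sketch in C. It is not the actual amfd code: the types and functions (su_sketch, su_reg_sent, su_reg_response) are hypothetical names made up for this example. It only shows the idea that an SU is marked for instantiation when a successful REG_SU response arrives, not when the request is sent. What to do with a failed response is left open, as in the question above; the sketch just records the failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-SU registration state; not the real amfd structures. */
enum su_reg_state { SU_REG_NONE, SU_REG_SENT, SU_REG_ACKED, SU_REG_FAILED };

struct su_sketch {
    const char *name;
    enum su_reg_state reg_state;
    bool instantiate_requested;
};

/* REG_SU has been sent to amfnd: only record it, do not instantiate yet. */
void su_reg_sent(struct su_sketch *su)
{
    assert(su != NULL);
    su->reg_state = SU_REG_SENT;
}

/*
 * REG_SU response received from amfnd.
 * Returns true if instantiation of the SU is requested as a result.
 */
bool su_reg_response(struct su_sketch *su, bool ok)
{
    assert(su != NULL);
    if (su->reg_state != SU_REG_SENT)
        return false;               /* unexpected or duplicate response */
    if (!ok) {
        su->reg_state = SU_REG_FAILED;  /* policy for this case is undecided */
        return false;
    }
    su->reg_state = SU_REG_ACKED;
    su->instantiate_requested = true;   /* safe: the SU is registered now */
    return true;
}
```

With this ordering, a node-up handler would call su_reg_sent() for each SU and return; the instantiate request would be issued from the response handler only.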
Reading from IMM in amfnd needs to be minimized and, if possible, confined to the handling of REG_SU. Today amfnd reads from IMM during the handling of an SI assignment. One problem is that the cstype for a CSI is unknown. Another problem is the component capability, which was moved (in B.04) into association objects that are children of comptype objects. Those objects can be read at REG_SU handling time and put into the comp object. But in order to know the capability for a specific cstype, the cstype needs to be known when the assignment comes.
Can the SI assignment message be extended with cstype information?
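A rough sketch of the caching idea, assuming the SI assignment message does carry the cstype. All names here (comp_sketch, cap_sketch, comp_cache_cap, comp_lookup_cap, MAX_CSTYPES) are hypothetical and do not exist in amfnd; the capability values are placeholders, not the ones defined by the AMF specification. The capabilities are stored in the comp object at REG_SU time, so the assignment handler only needs a local lookup and no IMM read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CSTYPES 8   /* arbitrary limit for this sketch */

/* Placeholder capability values, not the real AMF enum. */
enum cap_sketch {
    CAP_UNKNOWN = 0,
    CAP_X_ACTIVE_AND_Y_STANDBY,
    CAP_1_ACTIVE_OR_1_STANDBY
};

struct cap_entry {
    const char *cstype;     /* caller keeps the string alive */
    enum cap_sketch cap;
};

struct comp_sketch {
    struct cap_entry caps[MAX_CSTYPES];
    size_t ncaps;
};

/* REG_SU time: store a capability that was read from IMM.
 * Returns 0 on success, -1 if the table is full. */
int comp_cache_cap(struct comp_sketch *comp, const char *cstype,
                   enum cap_sketch cap)
{
    assert(comp != NULL && cstype != NULL);
    if (comp->ncaps >= MAX_CSTYPES)
        return -1;
    comp->caps[comp->ncaps].cstype = cstype;
    comp->caps[comp->ncaps].cap = cap;
    comp->ncaps++;
    return 0;
}

/* Assignment time: look up by the cstype carried in the message.
 * No IMM access is needed here. */
enum cap_sketch comp_lookup_cap(const struct comp_sketch *comp,
                                const char *cstype)
{
    assert(comp != NULL && cstype != NULL);
    for (size_t i = 0; i < comp->ncaps; i++) {
        if (strcmp(comp->caps[i].cstype, cstype) == 0)
            return comp->caps[i].cap;
    }
    return CAP_UNKNOWN;
}
```

A lookup that returns CAP_UNKNOWN would mean the cstype was not seen at REG_SU time; how amfnd should react to that is not decided here.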
Changed 14 months ago by hafe
■patch_waiting changed from no to yes
Changed 13 months ago by hafe
changeset: 3523:b750a1a063cc
branch: opensaf-4.2.x
parent: 3521:ed09cbfa05dd
user: Hans Feldt hans.feldt@…
date: Fri Apr 27 15:41:43 2012 +0200
summary: avsv/avd: instantiate SUs after registration (#2601)
changeset: 3525:aa57d1e2ad6f
tag: tip
user: Hans Feldt hans.feldt@…
date: Fri Apr 27 15:41:43 2012 +0200
summary: avsv/avd: instantiate SUs after registration (#2601)
remote: rev b750a1a063cc1720b9fc8e433d5e9b5f0d1fe5da sent
remote: rev aa57d1e2ad6f6edbab077126e2e0ddad53b56fdc sent
Patch 2 does not build; I am looking into that and will push it separately.
Changed 13 months ago by hafe
■milestone changed from 4.2.1 to future_releases
The urgent problem has been solved by http://devel.opensaf.org/ticket/1713 and this ticket.
Future (after 4.2.1 release) intended work is to:
* update and push "[PATCH 2 of 2] avsv: remove reg_comp code (#2601)"
* read from IMM only in the REG_SU context.
https://sourceforge.net/p/opensaf/tickets/517