OpenSAF / Tickets / #2219 ntfd: circular dependency with osafntfimcnd

Gary Lee - 2016-12-08

Traces from Minh

ntfimcnd_core.tgz

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Description has changed:

Diff:

--- old
+++ new
@@ -6,7 +6,7 @@

 So we have (1) depending on (2) depending on (3) depending on (1)

-This results in a temporary deadlock that dramatically slows down NTFD's ability to process its main dispatch loop. The deadlock only lasts for approx. 1 second, when mbcsv_mds_send_msg() times out. But since there could be lots of MBC messages to send, sometimes osafntfimcnd is killed with SIGABRT generating a coredump. The si-swap operation can also timeout.
+This results in a temporary deadlock that dramatically slows down NTFD's ability to process its main dispatch loop. The deadlock only lasts for approx. 1 second, when mbcsv_mds_send_msg() times out. But since there could be lots of MBC messages to send, sometimes osafntfimcnd is killed with SIGABRT generating a coredump. The si-swap operation will also timeout.

 ~~~
 SC-2 (active)

status: unassigned --> assigned
assigned_to: Praveen

Praveen - 2016-12-08

Hi Gary,

Is it the case when some configuration change was done and within not time switchover was triggered? In this case active IMCN was still processing IMM callbacks and sending notification for them. Since switchover got triggered when IMCN was processing CCB calbacks, it will get stuck in saNotificationSend() API as active NTFS was not available. As soon as active NTFS gets available, IMCN will clear its IMM callback list and will be able to act on term signal. But new active NTFS will have some notifications related to switchover for checkpointing to standby, so active NTFS will also get stuck.
Attach is the patch (2219_v0.patch) in which NTFS does not terminate IMCN in csi_set callback. It posts a local event with low priority to NTFS mailbox for terminating IMCN. Since standby will not stuck in csi set callback for IMCN termination, it will allow completion of si-swap operation and other high priority messages.
I think as a long term solution, termination of IMCN must be avoided.
Thanks,
Praveen

2219_v0.patch

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Description has changed:

Diff:

--- old
+++ new
@@ -7,6 +7,14 @@
 So we have (1) depending on (2) depending on (3) depending on (1)

 This results in a temporary deadlock that dramatically slows down NTFD's ability to process its main dispatch loop. The deadlock only lasts for approx. 1 second, when mbcsv_mds_send_msg() times out. But since there could be lots of MBC messages to send, sometimes osafntfimcnd is killed with SIGABRT generating a coredump. The si-swap operation will also timeout.
+
+steps to reproduce
+- Run loop of ntftest 32
+root@SC-1:~# for i in {1..10}; do ntftest 32; done
+- On another terminal, keep swapping 2N Opensaf SI, got coredump after couples of swaps
+root@SC-1:~# amf-adm si-swap safSi=SC-2N,safApp=OpenSAF
+...
+root@SC-1:~# amf-adm si-swap safSi=SC-2N,safApp=OpenSAF

 ~~~
 SC-2 (active)

Gary Lee - 2016-12-09

Hi Praveen

The original problem was seen during an upgrade of OpenSAF, after one controller had been upgraded and SMF was about to upgrade the other. Minh's instruction will also replicate the problem.

I think moving the instantiation/termination out of CSI set callback might be problematic. We might temporarily have no or multiple active / standby ntfimmcn instances.

Also, if an OpenSAF component calls ntf_send() from CSI set callback, during si-swap, the si-swap may still timeout.

Last edit: Gary Lee 2016-12-09

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Gary Lee - 2016-12-09

Hi Praveen

Vu is writing a test patch that starts a new thread to kill ntfimmcn. Do you think that would work?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Gary Lee - 2016-12-09

Minh and I talked about it, I think your patch can work if timedwait_imcn_exit() is changed a bit. So instead of waiting for X seconds, it could post another event to try again to keep the main loop flowing.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Praveen - 2016-12-09

Hi Gary, MInh,
What I see here is when NTFS sends term singal to IMCN component, the singal handler sigterm_handler() sets the FD. But poll does not gets unblocked for this because for IMM Dispatch is being done with SA_DISPATCH_ALL, so until all callbacks are processesed IMCN will not come out of dispatch. If it comes out then only TERM_EVENT will be processed.
So even if we terminate IMCN in a sepatate thread in NTFS the situation may remain same.
Regarding changing timedwait_imcn_exit() to post another event after first try may work. But in this case also configuration changes are continoulsy being done (testcase suggested by MInh) then Dispatch will remain busy in that single call and will never come out.
I still did not get one thing why IMMDispatch() is not coming out. It is working like a blocking call.

There could be other solutions like:
1)to do IMMDispatch in IMCN in a different thread, so main thread will be ready for any other event. But it will require good amount of code change.
2)Do not terminated IMCN now. Instead it can subscribe for PG tracking of MW SI and based on role it will work. It currently sends notification when it is active. (I never got why this terminated always. IMM and notificatiohn handles are valid across switchvoer and failovers. )
3)Third way could be to post events for sending notifications in mailbox, just like we do in AMF job queues

Thanks,
Praveen

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Minh Hon Chau - 2016-12-09

Hi Praveen,

Attached file bt printed from the coredump.
ntfimcnd could not get out of IMMDispatch because it was waiting for response of NtfSend() sync call.

Thanks
Minh

bt_ntfimcnd

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Minh Hon Chau - 2016-12-09

Just want to share our findings so far, not sure it's correct though.

Coredump tells that osafntfimcnd[2697] received changes of ccb 237, and was sending attributeChange notification which has numAdditionalInfo=8
CcbId = 237
...
lengthAdditionalText = 0, numAdditionalInfo = 8, notificationId = 0x6603f0,

CCB 237 committed at 11:01:35
2016-12-07 11:01:35 SC-1 osafimmnd[431]: NO Ccb 237 COMMITTED (multiCcbOwner)
2016-12-07 11:01:35 SC-2 osafimmnd[433]: NO Ccb 237 COMMITTED (multiCcbOwner)

ntfd-SC1 received QUIESCED csi callback at
Dec 7 11:01:34.502019 osafntfd [464:ntfs_amf.c:0175] >> amf_csi_set_callback
Dec 7 11:01:34.502022 osafntfd [464:ntfs_main.c:0271] >> initialize_for_assignment: ha_state = 3

ntfd-SC2 received ACTIVE csi callback at
Dec 7 11:01:35.414545 osafntfd [452:ntfs_amf.c:0175] >> amf_csi_set_callback
Dec 7 11:01:35.414553 osafntfd [452:ntfs_main.c:0271] >> initialize_for_assignment: ha_state = 1

ntf agent - amfd received MDS NEW ACTIVE
Dec 7 11:01:35.415337 osafamfd [486:ntfa_mds.c:0416] T2 MSG from NTFS NCSMDS_NEW_ACTIVE/UP
Dec 7 11:01:35.415356 osafamfd [486:ntfa_util.c:1550] TR Standby NTF server becomes new Active

So osafntfimcnd[2697] should have sent attributeChange around/after 11:01:35.415337
According to both ntfd traces, all notifications were preserved - no loss during switch over. So question is which notification was sent from osafntfimcnd[2697]
The last notification before switchover received at ntfd-SC1 has notifId = 537, and ntfd-SC2 (new active) continued receiving notifId starting from 538
The one sent from osafntfimcnd[2697] should have notifId=541 (because it was the only attributeChange notif having numAdditionalInfo=8 after 11:01:35:415337 and before osafntfimcnd[2697]'s death)
notif=541 was received at mds level
Dec 7 11:01:40.546538 osafntfd [452:ntfsv_enc_dec.c:0856] >> ntfsv_dec_not_msg

And processed in main thread at (delay time ~ 4 seconds)
Dec 7 11:01:44.572702 osafntfd [452:NtfAdmin.cc:0252] T2 New notification received, id: 541

osafntfimcnd[2697] died at
Dec 7 11:01:45.572387 osafntfd [452:ntfs_mds.c:0920] T8 MDS DOWN dest: 2020f000002b6, node ID: 2020f, svc_id: 29

So ntfd-SC2 could not send response to osafntfimcnd[2697]
Dec 7 11:01:45.575322 osafntfd [452:ntfs_com.c:0166] >> notfication_result_lib
Dec 7 11:01:45.575336 osafntfd [452:ntfs_com.c:0172] T4 not_id: 541, rv: 1
...
Dec 7 11:01:47.083431 osafntfd [452:ntfs_com.c:0176] TR ntfs_mds_msg_send FAILED

Looks like there's also a delay of notif 541 from sender-ntfimcnd[2697] to its receiver-[ntfd-SC2] at mds level, here it seems to be ~ 5 secs
MDS docs has mentioned this

Message loss minimization: When an MDS receiver performs a controlled switchover, it helps to minimize the loss of messages. The MDS provides a mechanism for receivers to declare switchovers. Then, the MDS agent of the sender starts buffering messages locally until a new “active” receiver appears. The buffered messages are then delivered to the new active address.

Another delay in ntfd-SC2's main thread that could not process notif=541 in time is the circular dependency as Gary described.
We can see many mbscv checkpoints but could not be processed in ntfd-SC1, because it was stuck in termination of ntfimcnd, that results in 1 second delay top up ntfd-SC2 for every mbscv timeout

Last edit: Minh Hon Chau 2016-12-09
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Praveen - 2016-12-09

Hi Gary, Minh,

Attached 2219_v1.patch is the patch which performs IMM dispatch in a separate thread. I kept 2219_v0.patch also and used class_def.xml for class defintion. It is based on ntfsv_test_classes.xml with only one class and one attribute. object_create.xml creates one object of this class.
With 10 switchovers and continuously running "for i in {0..20000} ; do immcfg -a testUint32Cfg=21 stringRdnCfg=object1; done" no crash observed.

Thanks,
Praveen

2219_v1.patch

class_def.xml

object_create.xml

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Hi Praveen,

I repeat the same test, got ntfd coredump, but it looks like another bug? Thanks, Minh

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/lib/opensaf/osafntfd --tracemask=0xffffffff'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  NtfNotification::getNextSubscription (this=0x978930, subId=...) at NtfNotification.cc:165
165 NtfNotification.cc: No such file or directory.
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19-gdb.py", line 63, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
ImportError: No module named 'libstdcxx'
(gdb) bt full
#0  NtfNotification::getNextSubscription (this=0x978930, subId=...) at NtfNotification.cc:165
No locals.
#1  0x000000000041658d in NtfAdmin::checkNotificationList (this=0x974240) at NtfAdmin.cc:661
        uSubId = {clientId = 1277, subscriptionId = 1}
        notification = {<std::tr1::__shared_ptr<NtfNotification, (__gnu_cxx::_Lock_policy)2>> = {
            _M_ptr = 0x978930, _M_refcount = {_M_pi = 0x978a40}}, <No data fields>}
        __FUNCTION__ = "checkNotificationList"
        posNot = {_M_node = 0x978770}
#2  0x000000000040407e in amf_csi_set_callback (invocation=4290772996, compName=<optimized out>, 
    new_haState=<optimized out>, csiDescriptor=...) at ntfs_amf.c:255
        error = <optimized out>
        prev_haState = SA_AMF_HA_STANDBY
        role_change = <optimized out>
        rc = <optimized out>
        __FUNCTION__ = "amf_csi_set_callback"
#3  0x00007ff32687e48e in ava_hdl_cbk_rec_prc (info=0x7ff32000b880, 
    reg_cbk=reg_cbk@entry=0x7ffc7437fd00) at ava_hdl.cc:655
        csi_set = 0x7ff32000b898
        actv_or_stdby = <optimized out>
        rc = 1
        i = <optimized out>
        __FUNCTION__ = "ava_hdl_cbk_rec_prc"
#4  0x00007ff32687f749 in ava_hdl_cbk_dispatch_all (hdl_rec=0x7ffc7437fda8, cb=0x7ffc7437fda0)
    at ava_hdl.cc:446
        list_resp = 0x962698
        reg_cbk = {saAmfHealthcheckCallback = 0x403ee0 <amf_health_chk_callback>, 
          saAmfComponentTerminateCallback = 0x403f00 <amf_comp_terminate_callback>, 
          saAmfCSISetCallback = 0x403fd0 <amf_csi_set_callback>, 
          saAmfCSIRemoveCallback = 0x403e70 <amf_csi_rmv_callback>, 
---Type <return> to continue, or q <return> to quit---
          saAmfProtectionGroupTrackCallback = 0x0, saAmfProtectionGroupTrackCallback_4 = 0x0, 
          saAmfProxiedComponentInstantiateCallback = 0x0, 
          saAmfProxiedComponentCleanupCallback = 0x0, 
          saAmfContainedComponentInstantiateCallback = 0x0, 
          saAmfContainedComponentCleanupCallback = 0x0, osafCsiAttributeChangeCallback = 0x0}
        rec = 0x7ff32000c2a0
        hdl = 4288675841
        rc = 1
#5  ava_hdl_cbk_dispatch (cb=0x7ffc7437fda0, hdl_rec=0x7ffc7437fda8, flags=<optimized out>)
    at ava_hdl.cc:320
        rc = 1
        __FUNCTION__ = "ava_hdl_cbk_dispatch"
#6  0x00007ff3268783ea in AmfAgent::Dispatch (hdl=4288675841, flags=flags@entry=SA_DISPATCH_ALL)
    at amf_agent.cc:283
        hdl_rec = 0x962610
        pend_dis = 0
        __FUNCTION__ = "Dispatch"
        cb = 0x961b20
        rc = SA_AIS_OK
        pend_fin = 0
#7  0x00007ff326878465 in saAmfDispatch (hdl=<optimized out>, flags=flags@entry=SA_DISPATCH_ALL)
    at amf_agent.cc:244
No locals.
#8  0x00000000004037e4 in main (argc=<optimized out>, argv=<optimized out>) at ntfs_main.c:357
        mbx_fd = <optimized out>
        error = <optimized out>
        rc = <optimized out>
        fds = {{fd = 20, events = 1, revents = 0}, {fd = 16, events = 1, revents = 1}, {fd = 22, 
            events = 1, revents = 1}, {fd = 14, events = 1, revents = 0}, {fd = 24, events = 1, 
---Type <return> to continue, or q <return> to quit---
            revents = 0}, {fd = 18, events = 1, revents = 0}}
        term_fd = 20
        __FUNCTION__ = "main"

Praveen - 2016-12-09

Hi Minh,
Based on NTFS trace if this crash looks a different issue, please raise a new ticket for this.

Thanks,
Praveen

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Minh Hon Chau - 2016-12-09

Sorry, I mean I repeat the same test with the patch V1, and got this coredump, will check the trace then.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Minh Hon Chau - 2016-12-12

V1 worked with the same test - no ntfimcnd coredump. Created #2223 for another ntfd coredump - different issue.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Gary Lee - 2016-12-12

Hi Praveen

V1 looks OK - as an enhancement perhaps the OI handle could be created in the thread itself to avoid passing the CB into the thread, and issues with concurrent access to the OI handle.

Last edit: Gary Lee 2016-12-12

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Vu Minh Nguyen - 2016-12-12

Hi Praveen,

I think, with your V1's patch, there is possiblity of ntf message lost as the IMCN main poll is non-blocking, then SIGTERM could be proceed while ntf msg not yet come to active NTFD.

Other thing, regarding your concern on handling IMCN terminating on a seperated thread,
you stated: "So even if we terminate IMCN in a sepatate thread in NTFS the situation may remain same."
We intend to not handle SIGABORT/SIGKILL in that thread, that thread will block until IMCN process proceed SIGTERM, and come up with new ha_state.

/Vu

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Vu Minh Nguyen - 2016-12-12

I attached the patch here for your ref.

ntfsv_imcn_coredump_v3.patch

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Praveen - 2016-12-12

Hi Vu,
I raised the concern for termination in a separate thread because IMCN is blocked in one of the CCB callbacks while sending notification NotificationSend() API. This is happening because of circular depedency as standby NTFS is blocked. We need to remember that when IMCN is blocked, the test case is still continuing and making configuration changes and because of this IMCN is accumulating those callbacks (IMM is not blocked and sending those callbacks to IMCN). These callbacks plus already in the list will be processed once IMCN gets unlocked. Now moving termination in a separate thread will surely allow standby NTFS to proceed and will unblock IMCN, but IMCN component after getting unblocked will still be executing those remaining callbacks. Earliar NTFS used to abort IMCN when terminates gets timed out. So we need gaurantee that thread will surely terminated IMCN.
One more thing, Will the temination by NTFS in a separate thread not lead to accumulation of terminations itself? Suppose NTFS started the thread for termination and termination thread has still not exited and another switchver happens. In this way one more thread will get spawned for termination.
If both these problems are resolved and gauranteed, I do not see any problem with V3 because my gues is the problem of notifcation being lost has been always existed and it will remain so irrespective of v1 or v3 (refined version) approach. The reasons are:
1) NTFS responds to AMF in csi set callback before termination of IMCN component. Because of this during switchover, a time comes when active NTFS has just responded for active assignments and is terminating standby IMCN process. While this is happening quiesced NTFS has got the standby CSI callback and it also has started termination of active IMCN process. Since there is no syncronization between these two terminations though on different nodes, configuration changes may get missed and hence the notification.
2) IMCN is not role aware. So NTFS currently terminates IMCN when it gets standby role andMCN exits without any confirmation that it has cleared its notification list and CCB callback list. Neither patch v1 not v3 handles this.
Note: These is a ticket fot these two (#157 NTF: Improve HA handling in IMCN).

Thanks,
Praveen

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Vu Minh Nguyen - 2016-12-12

Hi Praveen,

Below is my understanding regarding your concerns:

We need gaurantee that thread will surely terminated IMCN

IMCN component after getting unblocked, i think, it will go to handle_sigterm_event() for SIGTERM signal handling and exit gracefully.

Other thing is, when IMCN is blocking at saImmOiDispatch(), if any IMM configuration changes, all CCB callbacks to the blocking applier may get TIME_OUT, I guess. I see, IMCN sets IMMA_SYNCR_TIMEOUT IMM sync timeout to 01 second.

One more thread will get spawned for termination if doing another switchver

This concern is valid. We should have one flag to prevent this happens.

/Vu

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Praveen - 2016-12-12

Hi Vu,
That was expected behaviour that IMCN after getting unblocked will process sig_term event. For this to happen, it has to come out of IMMDispatch(DISPATCH_ALL). I think IMMND just sends CCB callbacks for atleast create/delete/modify and complete on one go. For CCB Apply IMMND will have to wait for completed response which creates a dependency on IMCN. But I this not synschronous lilke callbacks in other services. So in our test application which is only one client making conf changes, no problem will occur. However in case of multiple clients are making CCB changes then IMMND will be sending these to IMCNand these will be piling up.
Anyways, I do not see much risk but still I would like Minh to clarify the problem enviornment whether configuration changes are done much faster and by multiple clients. I have seen in some IMM tickets where reporter was creating thousands of objects
Yes, by maintaining the flag spawning of mulitple threads can be avoided. But we need to remember here that maintaining this flag does not block a user to trigger swtichvers. So per switchover, restart of IMCN may get missed.

Thanks,
Praveen

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

elunlen - 2016-12-12

Hi
The original specificatinon for IMCN said that it should be able to handle HA state changes. A first increment of IMCN without any HA handling except the possibility to inform the IMCN process if it shall run as active or standby when started. To handle a HA state change the IMCN process must be terminated and restarted given the correct state.
In the "IMCN NTF notification protocol" a special notification was defined (See Error Report Notification in the OpenSAF Extensions PR document) to be sent if there is a risk that any IMM changes may have been missed. This notification is sent every time IMCN is started as active. If a client receives such a notification it has to read all relevant IMM data in order to synchronize itself. After that it can rely on notifications from IMCN until the next Error report notification is received.
The intention was to make IMCN able to change role without beeing restarted and without loosing track of IMM changes (ticket #157). However the first increment was considered good enough so #157 was never implemented.
This means that IMCN can be terminated at any time also if executing an IMM callback function. It does not matter if any IMM change is lost and never reported. Since IMCN is a process and it is the process that is terminated there is no problem with memory leeks or resource leeks. IMM will detect that the client is gone and will finalize all resources.
In NTF the IMCN surveillance thread will also detect that the IMCN process is gone and start a new one and give it the correct state (which may no longer be active)
This means that any solution for immediate termination is Ok e.g. to handle the termination in a separate thread to avoid the described deadlock.
NOTE:
SInce there is no HA handling implemented it is actually not needed to start any IMCN process on any other node than the active. That this is done is also remainings from the original intention.

MUST BE FIXED!
I saw that the init_ntfimcn() has been move from the main initialize() to initialize_for_assignment(). This is NOT OK since initialize_for_assignment() is called more than once meaning that also init_ntfimcn() is called more than once. The init_ntfimcn() function starts the IMCN surveillance thread and that cannot be done more than once!
Other init functions that are called from initialize_for_assignment() has an "is_initialized flag" in order to prevent initialization more than once. This could be a solution to implement for init_ntfimcn() as well.

Thanks
Lennart

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Vu Minh Nguyen - 2016-12-12

Hi Lennart,

Seems the global flag fully_initialized in control block is used to ensure that initialize_for_assignment is called once.

Another thing I would like to bring up here, that is TIMEOUT for all NTF API is set to 10 seconds (#define NTFS_WAIT_TIME 1000). Then, if calling to one of them e.g saNtfNotificationSend(), in worse case, the calling thread will be blocked up to 10 seconds and could lead to unexpected result as below syslog when perfoming si-swap:

osafsmfd[9785]: NO Fail to invoke admin operation, rc=SA_AIS_ERR_TIMEOUT (5). dn=[safSi=SC-2N,safApp=OpenSAF], opId=[7]
osafsmfd[9785]: NO Admin op SA_AMF_ADMIN_SI_SWAP fail [rc = 5]

I am not sure if that setting time is ok or not.

/Vu

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Minh Hon Chau - 2016-12-13
  
  Hi Vu,
  
  For the NtfSend API(), 10 secs can count on saflogging at server side, writing to disk, ... so I guess we should not change for NtfSend(). For the other APIs() we could see how many critical jobs the server get involved, then we can decide to reduce this timeout, perhaps for thos API() doing finalize jobs
  
  thanks,
  Minh
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Minh Hon Chau - 2016-12-13

Hi,

I think both patches can solve the problem. V1 (Praveen's) removes the dependency in ntfimcnd, V3 (Vu's) removes the dependency in ntfd. Below is a summary I have collected from previous discussions:
- V3 (Vu's)
. There could be chances that we have no active ntfimcnd: the period of no active imcn happens when both termination threads are terminating imcnd at the same time (there should be one doing earlier), and the active imcnd is stuck in ntfInitialize()
. The period of 2 active ntfimcnd could be longer: In state transition from quiesced->standby, termination thread is ongoing, the old active imcnd could be a bit longer as active one
I think the difficulty in V3 is ntfd can not know things happening in both local and remote ntfimcnd

V1 (Praveen's)
. When ntfimcnd gets sigterm, the job from dispatching IMM callback to sending notification could be interrupted. If we make sigterm_handler sync with IMM's thread, that could cause signal abort from ntfd due to latency
. But what if loss of notification is always assumpted, and we have special notification to inform clients at ntfimcnd's start up
. IMM thread needs to do synchronization with sigterm_handler on common stuffs: ie immOI handle, ...

Thanks
/Minh
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ntfd: circular dependency with osafntfimcnd

Milestone

Searches

Help

#2219 ntfd: circular dependency with osafntfimcnd

Related

Discussion