Changeset: 4.6.M0 - 6009:b2ddaa23aae4
When starting ~50 Linux containers, IMMD coredumps, resulting in a cluster reset.
Communication is TCP.
The dtmd.conf configuration is:
DTM_SOCK_SND_RCV_BUF_SIZE=65536
DTM_CLUSTER_ID=1
DTM_NODE_IP=172.17.1.42
DTM_MCAST_ADDR=224.0.0.6
The IMM sync batch size has been reduced to 4096:
opensafImm=opensafImm,safApp=safImmService
Name                     Type         Value(s)
========================================================================
opensafImmSyncBatchSize  SA_UINT32_T  4096 (0x1000)
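For context, DTM_SOCK_SND_RCV_BUF_SIZE=65536 presumably ends up as the per-socket send/receive buffer size handed to the kernel, along these lines (a minimal sketch, not the actual osafdtmd code; the helper name is made up):

    /* Sketch only, not the real osafdtmd code: apply the configured
     * DTM_SOCK_SND_RCV_BUF_SIZE to one TCP socket. The kernel may round
     * or cap the requested value. */
    #include <sys/socket.h>

    static int apply_dtm_buf_size(int sock_fd, int buf_size /* e.g. 65536 */)
    {
        if (setsockopt(sock_fd, SOL_SOCKET, SO_SNDBUF,
                       &buf_size, sizeof(buf_size)) != 0)
            return -1;
        return setsockopt(sock_fd, SOL_SOCKET, SO_RCVBUF,
                          &buf_size, sizeof(buf_size));
    }

A 64 KiB send buffer fills quickly during a sync burst towards ~50 nodes, which is where the non-blocking send behaviour discussed further down comes in.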
When node PL-51 joins the cluster, the following messages are seen in the syslog:
Oct 6 00:35:57 SC-1 osafdtmd[1028]: NO Established contact with 'PL-51'
Oct 6 00:35:57 SC-1 osafimmd[1063]: NO Extended intro from node 2330f
Oct 6 00:35:57 SC-1 osafimmd[1063]: NO Node 2330f request sync sync-pid:79 epoch:0
Oct 6 00:35:58 SC-1 osafimmnd[1072]: NO Announce sync, epoch:292
Oct 6 00:35:58 SC-1 osafimmnd[1072]: NO SERVER STATE: IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
Oct 6 00:35:58 SC-1 osafimmnd[1072]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
Oct 6 00:35:58 SC-1 osafimmd[1063]: NO Successfully announced sync. New ruling epoch:292
Oct 6 00:35:58 SC-1 osafimmloadd: NO Sync starting
Oct 6 00:36:00 SC-1 osafimmd[1063]: MDTM unsent message is more!=200
Oct 6 00:36:00 SC-1 osafimmnd[1072]: WA Director Service in NOACTIVE state - fevs replies pending:9 fevs highest processed:20037
Oct 6 00:36:00 SC-1 osafamfnd[1143]: NO 'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
Oct 6 00:36:00 SC-1 osafamfnd[1143]: ER safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
Oct 6 00:36:00 SC-1 osafamfnd[1143]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, SupervisionTime = 60
Oct 6 00:36:00 SC-1 opensaf_reboot: Rebooting local node; timeout=60
Oct 6 00:36:00 SC-1 osafimmnd[1072]: NO No IMMD service => cluster restart, exiting
A coredump is generated:
core_1412555760.osafimmd.1063
The IMMD crashes inside MDS BCAST send.
Information is needed about exactly which branch & changeset this was
executed with. There have been some recent fixes in MDS on the 4.5 and
default branches. A TIPC fix/patch may also be relevant.
The TIPC fix is not related to this TCP MDS BCAST send. The issue is seen when OpenSAF is running in a Docker container setup, while ~50 payloads are joining the cluster.
The changeset is provided above.
This doesn't look like an OpenSAF/MDS/IMM issue.
In case an intra-node send() fails, MDS allows recovery from a temporary network problem by queuing up to 200 messages. In this case the network had not recovered by the time 200 messages had accumulated in the unsent queue, so MDS did an intentional assert, assuming the network issue may not be recoverable.
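To make that concrete, here is a minimal sketch of the queue-and-assert behaviour described above (not the actual MDS code; the 200 limit matches the "MDTM unsent message is more!=200" log, all names are made up):

    /* Sketch of the described MDS/TCP behaviour, not the real implementation:
     * a failed non-blocking send parks the message in an unsent queue, and once
     * MDTM_MAX_UNSENT (assumed 200, matching the log above) messages pile up,
     * the process aborts on purpose. */
    #include <errno.h>
    #include <stddef.h>
    #include <stdlib.h>
    #include <sys/socket.h>

    #define MDTM_MAX_UNSENT 200

    struct unsent_queue {
        size_t count;                /* messages currently parked */
        /* ... message storage elided ... */
    };

    static int tcp_try_send(int fd, const void *buf, size_t len,
                            struct unsent_queue *q)
    {
        ssize_t n = send(fd, buf, len, MSG_NOSIGNAL);
        if (n == (ssize_t)len)
            return 0;                /* fully handed to the kernel */

        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;               /* hard error, caller decides what to do */

        /* Kernel send buffer full (or partial send): park for a later retry. */
        q->count++;                  /* enqueue of the payload itself elided */

        if (q->count >= MDTM_MAX_UNSENT)
            abort();                 /* the "temporary" problem never cleared:
                                      * intentional crash, as in the coredump above */
        return 1;                    /* queued, will be retried later */
    }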
Aha, I have seen this one before. This is a behavior difference between MDS/TCP and MDS/TIPC. With TIPC we get flow control by having a blocking send; in the TCP case we obviously do not. Any kind of bursty send would trigger this, for example a LOG burst of async messages.
Why can't send be blocking in the MDS/TCP case and this queue removed?
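To illustrate the suggestion (a sketch only, not a proposed patch; the helpers are made up and it assumes the MDS/TCP socket is currently non-blocking): leaving the socket blocking would make the kernel send buffer the flow-control point, much like TIPC's blocking send.

    /* Sketch: clearing O_NONBLOCK makes send() itself apply back-pressure,
     * so no user-space unsent queue and no 200-message limit would be needed. */
    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/socket.h>

    static int make_send_blocking(int fd)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        if (flags < 0)
            return -1;
        return fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
    }

    /* With a blocking socket a bursty sender is simply paced by TCP: send()
     * returns only once the kernel has accepted the data. */
    static ssize_t blocking_send_all(int fd, const char *buf, size_t len)
    {
        size_t off = 0;
        while (off < len) {
            ssize_t n = send(fd, buf + off, len - off, MSG_NOSIGNAL);
            if (n < 0)
                return -1;           /* real error (peer gone, etc.) */
            off += (size_t)n;
        }
        return (ssize_t)off;
    }

The obvious trade-off is that the whole MDS send path could then stall behind one slow or dead receiver, which is presumably why the non-blocking send plus unsent queue was chosen in the first place.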
I wonder if I wrote a ticket on this...
https://sourceforge.net/p/opensaf/tickets/607/
Duplicate of https://sourceforge.net/p/opensaf/tickets/607/
With the patch provided in #607 I can pass the 45-container limitation and get up to 67 containers before the next problem appears, which I am investigating.
At around 60 nodes, while nodes are still joining the cluster, other nodes seem to leave the cluster. Some immnd coredumps are seen on payloads, and there is one segfault of immnd on the controller. But this is a subject for a different ticket where I have done more investigation.