Amir - 2017-08-02

Hi,

Our application uses JGroups for cluster management and Infinispan as a replicated cache via JGroupsTransport.

Our application runs on WildFly 10.1.0 with JGroups 3.6.10 and Infinispan 8.2.4. We use the default configuration recommended by JGroups.

Clustering and caching work fine in our application. We are running a four-node cluster. When we put the application under a heavier workload, the following errors appear in the JGroups logs.

The following are extracts from the logs:

Server4-log:

2017-07-21 10:27:55,614 SEVERE Timer-3,drgWork_loader,server4/0 TCP JGRP000029: server4/0: failed sending message to server3:45606 (89 bytes): java.lang.NullPointerException, headers: MERGE3: INFO: view_id=[server3/0 9], logical_name=server4/0, physical_addr=server4:45606, TP: [cluster_name=drgWork_loader]

2017-07-21 10:27:55,622 SEVERE Timer-2,drgWork_loader-infinispan,server4-16598 TCP JGRP000029: server4-16598: failed sending message to server3:45607 (137 bytes): java.lang.NullPointerException, headers: TCPPING: [type=GET_MBRS_REQ, cluster=drgWork_loader-infinispan], TP: [cluster_name=drgWork_loader-infinispan]

Server3-log

2017-07-21 10:27:55,677 WARNING OOB-515,drgWork_loader-infinispan,server3-1927 NAKACK2 JGRP000011: vpsl32101-1927: dropped message 2040 from non-member server4-16598 (view=[server3-1927 10] (3) [server3-1927, server1-34448, server2-4574])

Server2.log

2017-07-21 10:27:55,612 WARNING TcpServer.Acceptor [45607],null,null TCP JGRP000006: failed accepting connection from peer: java.io.EOFException

2017-07-21 10:27:55,672 WARNING OOB-529,drgWork_loader-infinispan,server2-4574 NAKACK2 JGRP000011: server2-4574: dropped message 2040 from non-member server2-16598 (view=[server3-1927

Server1.log

2017-07-21 10:27:55,671 WARNING OOB-721,drgWork_loader-infinispan,server1-34448 NAKACK2 JGRP000011: server1-34448: dropped message 2040 from non-member server4-16598 (view=[server3-1927

2017-07-21 10:27:55,738 WARNING Incoming-2,drgWork_loader,server1/0 NAKACK2 JGRP000011: server1/0: dropped message 3 from non-member server4/0 (view=[server3/0


After these errors occur, our application and Infinispan become unstable.

Can you please suggest what is causing this issue and what can be done to resolve it?

Regards,
Amir