javagroups-development Mailing List for JGroups
Brought to you by: belaban
Archive (number of messages per month):

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| 2000 |     |     |     |     |     | 5   |     |     |     |     |     | 8   |
| 2001 | 6   | 18  | 25  | 26  | 17  | 43  | 4   | 5   | 5   | 16  | 19  | 82  |
| 2002 | 72  | 144 | 145 | 70  | 185 | 62  | 31  | 105 | 52  | 46  | 19  | 30  |
| 2003 | 82  | 31  | 38  | 17  | 25  | 52  | 37  | 28  | 54  | 37  | 40  | 29  |
| 2004 | 16  | 17  | 11  | 14  | 5   | 13  | 46  | 12  | 6   | 4   | 2   | 23  |
| 2005 | 12  | 12  | 26  | 6   | 24  | 50  | 13  | 18  | 26  | 6   | 46  | 28  |
| 2006 | 50  | 22  | 35  | 16  | 20  | 17  | 18  | 19  | 12  | 4   | 9   | 24  |
| 2007 | 18  | 11  | 13  | 11  | 24  | 4   | 18  | 17  | 33  | 7   | 12  |     |
| 2008 | 30  | 34  | 8   | 84  | 32  | 57  | 23  | 7   | 26  | 26  | 41  | 31  |
| 2009 | 13  | 17  | 14  | 45  | 10  | 25  | 16  | 23  | 43  | 54  | 24  | 23  |
| 2010 | 18  | 8   | 23  | 40  | 15  | 13  | 4   | 47  | 46  | 48  | 45  | 16  |
| 2011 | 19  | 14  | 4   | 5   | 5   | 2   | 12  | 1   | 8   | 7   | 4   | 5   |
| 2012 | 7   | 3   | 3   | 5   | 4   | 5   | 4   | 5   | 6   | 3   | 10  | 4   |
| 2013 |     | 7   | 4   | 6   | 18  | 5   | 8   | 11  | 7   | 4   |     |     |
| 2014 | 1   | 1   | 3   | 9   | 2   | 1   | 1   |     | 3   | 2   | 1   |     |
| 2015 | 3   | 3   | 5   | 3   |     |     | 2   | 4   |     |     | 1   | 11  |
| 2016 | 2   |     | 1   |     | 1   | 2   | 6   |     |     |     |     |     |
| 2017 |     |     |     |     |     |     | 8   | 5   | 1   |     |     |     |
| 2018 |     |     |     |     | 3   |     |     |     |     |     |     |     |
| 2019 | 2   |     |     | 1   |     |     | 1   |     | 1   | 1   |     |     |
| 2020 | 1   | 1   |     |     |     |     |     |     |     |     | 1   |     |
From: Development i. <jav...@li...> - 2020-11-24 14:31:30
|
http://belaban.blogspot.com/2020/11/i-hate-distributed-locks.html -- Bela Ban, JGroups lead (http://www.jgroups.org) |
From: Development i. <jav...@li...> - 2020-02-04 07:48:54
|
Hi, there is a cluster of 5 nodes; we tried to restart node5, but it fails to rejoin the cluster. I am new to JGroups. We are using jgroups-3.3.5.Final.jar, and I can see the following log on node5:

30.01.2020 07.06.38:134 1912 DEBUG {main}JGROUPS_GMS election results: {98765487-bd11-d6dd-eb21-b9e7d345c9fb=8}
30.01.2020 07.06.38:134 1913 DEBUG {main}JGROUPS_GMS sending JOIN(10.5.181.132-smsrouter) to 98765487-bd11-d6dd-eb21-b9e7d345c9fb
30.01.2020 07.06.38:983 1914 INFO {circle-id-conn-rate-stats-keeper-1}WHA Permits replenished. Last 1 second interval traffic stats: Stats [minResponseWaitTime=0, maxResponseWaitTime=0, avgResponseWaitTime=0.0, minConnectionAcquireWaitTime=0, maxConnectionAcquireWaitTime=0, avgConnectionAcquireWaitTime=0.0, requestsCount=0]
30.01.2020 07.06.39:983 1915 INFO {circle-id-conn-rate-stats-keeper-1}WHA Permits replenished. Last 1 second interval traffic stats: Stats [minResponseWaitTime=0, maxResponseWaitTime=0, avgResponseWaitTime=0.0, minConnectionAcquireWaitTime=0, maxConnectionAcquireWaitTime=0, avgConnectionAcquireWaitTime=0.0, requestsCount=0]
30.01.2020 07.06.40:983 1916 INFO {circle-id-conn-rate-stats-keeper-1}WHA Permits replenished. Last 1 second interval traffic stats: Stats [minResponseWaitTime=0, maxResponseWaitTime=0, avgResponseWaitTime=0.0, minConnectionAcquireWaitTime=0, maxConnectionAcquireWaitTime=0, avgConnectionAcquireWaitTime=0.0, requestsCount=0]
30.01.2020 07.06.41:134 1917 WARNING {main}JGROUPS_GMS JOIN(10.5.181.132-smsrouter) sent to 98765487-bd11-d6dd-eb21-b9e7d345c9fb timed out (after 3000 ms), on try 150

Please help me out. Thanks in advance.

-- Sent from: http://jgroups.1086181.n5.nabble.com/JGroups-Dev-f6604.html |
From: Development i. <jav...@li...> - 2020-01-09 15:59:29
|
Hi folks, as a heads up: I just created branch 4.x and merged the (unofficial) 5.x branch JGRP-2218 [1] onto master. 4.x is used for 4.1.x/4.2.x and will only be used to backport bug fixes from master, but no new functionality will be added. The master branch is now used for development of 5.x.

Once I've finished the documentation and updated the web site, I'll release an alpha of 5.0 and blog about it. This means that if you want to get the latest stable code, go to 4.x. If you want to try out 5.x, use master. I hope to release the first alpha by the end of next week.

Cheers,

[1] https://issues.redhat.com/browse/JGRP-2218

-- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2019-10-02 14:15:15
|
Sorry, but I don't support 2.6.15, which is 9+ years old! This even uses JBossCache, which has been EOL'ed a long time ago. I suggest upgrading those components to a more recent version. Cheers,

On 26.09.19 09:16, Development issues wrote:
> Hi Team,
>
> We are facing some “jgroups” cluster related exceptions which is directly creating application related issues. Once we restart the system gets recovered.
>
> Attaching the logs from 4 different servers. We are getting the below error lots of times:
>
> [org.quartz.impl.jdbcjobstore.JobStoreCMT] ClusterManager: Error managing cluster: Failure obtaining db row lock: ORA-00060:
>
> Please let us know if you have any information related to these issues.
>
> Let us know if you need any information to debug the issue.
>
> Version :: “jgroups.jar version 2.6.15 GA”
>
> Regards,
> Manisha
>
> server_N1.log <http://jgroups.1086181.n5.nabble.com/file/t1838/server_N1.log>
> server_N2.log <http://jgroups.1086181.n5.nabble.com/file/t1838/server_N2.log>
> server_S1.log <http://jgroups.1086181.n5.nabble.com/file/t1838/server_S1.log>
> server_s2.log <http://jgroups.1086181.n5.nabble.com/file/t1838/server_s2.log>
>
> -- Sent from: http://jgroups.1086181.n5.nabble.com/JGroups-Dev-f6604.html

-- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2019-09-26 07:08:55
|
Hi Team,

We are facing some “jgroups” cluster related exceptions which are directly creating application related issues. Once we restart, the system recovers.

Attaching the logs from 4 different servers. We are getting the below error lots of times:

[org.quartz.impl.jdbcjobstore.JobStoreCMT] ClusterManager: Error managing cluster: Failure obtaining db row lock: ORA-00060:

Please let us know if you have any information related to these issues. Let us know if you need any information to debug the issue.

Version :: “jgroups.jar version 2.6.15 GA”

Regards,
Manisha

server_N1.log <http://jgroups.1086181.n5.nabble.com/file/t1838/server_N1.log>
server_N2.log <http://jgroups.1086181.n5.nabble.com/file/t1838/server_N2.log>
server_S1.log <http://jgroups.1086181.n5.nabble.com/file/t1838/server_S1.log>
server_s2.log <http://jgroups.1086181.n5.nabble.com/file/t1838/server_s2.log>

-- Sent from: http://jgroups.1086181.n5.nabble.com/JGroups-Dev-f6604.html |
From: Development i. <jav...@li...> - 2019-07-03 13:18:00
|
FYI: http://belaban.blogspot.com/2019/07/compiling-jgroups-to-native-code-with.html -- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2019-04-25 13:38:06
|
Hello,

I am using ReplicatedHashMap with ForkChannel. PFA configuration file (*config.xml*).

ForkChannel forkChannel = new ForkChannel(channel, "myForkStack", "myForkChannel");
forkChannel.connect(CLUSTER_NAME);
logger.info("ViewAsString :: " + forkChannel.getViewAsString());
wordCounter = new ReplicatedHashMap<>(forkChannel);
wordCounter.start(10_000);

Here, I am using the fork channel for private communication on top of the main channel. The JGroups cluster initializes properly on both the main channel and the fork channel, and the values in the ReplicatedHashMap are visible to all running nodes. However, as soon as I start a new node in the cluster, the ReplicatedHashMap is initialized again and the existing values are not visible to the newly started node. PFA sample test program.

Note: Before, I was using two channels with a shared transport (<UDP singleton_name="tp_one"), but upgrading JGroups to 3.6.16 gives a warning to use a fork channel instead of the shared transport.

Regards,
Tarana Desai |
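For context (not from the original thread): a minimal sketch of the usual ReplicatedHashMap setup over a plain JChannel. The config file name and cluster name are placeholders, and it assumes the stack contains a state transfer protocol (e.g. pbcast.STATE_TRANSFER), which is what start() relies on to pull the existing entries onto a joining node.

```java
import org.jgroups.JChannel;
import org.jgroups.blocks.ReplicatedHashMap;

public class ReplicatedMapSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder config and cluster name; any stack with a state
        // transfer protocol (e.g. pbcast.STATE_TRANSFER) is assumed.
        JChannel ch = new JChannel("udp.xml");
        ch.connect("word-count-cluster");

        // start() fetches the current map contents from the coordinator,
        // so a newly started node should see the existing entries.
        ReplicatedHashMap<String, Integer> wordCounter = new ReplicatedHashMap<>(ch);
        wordCounter.start(10_000); // state transfer timeout in ms

        wordCounter.put("hello", 1); // replicated to all members

        // ... use the map ...

        wordCounter.stop();
        ch.close();
    }
}
```

When a fork channel is used instead of the main channel, the same start()-time state transfer has to be able to run over the fork stack; if a joiner comes up with an empty map, that is the first thing worth checking.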
From: Development i. <jav...@li...> - 2019-01-21 08:32:41
|
FYI: http://belaban.blogspot.com/2019/01/jgroups-4016.html -- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2019-01-04 14:05:23
|
Hi, we are facing an issue with JGroups on start of services. We have some caches configured with JGroups which are populated on server start. However, we are seeing some dropped messages and it takes a long time. There are also messages when it tries to connect to other nodes, but the connection fails. Can we configure JGroups not to replicate to any nodes during start? Thanks

-- Sent from: http://jgroups.1086181.n5.nabble.com/JGroups-Dev-f6604.html |
From: Development i. <jav...@li...> - 2018-05-28 11:19:00
|
I'm afraid I don't support such an old version (5 years old), see [1] for details. Running out of credits may have a number of reasons, e.g. application threads blocking on the receivers, excessive GC, exhausted thread pools (this can be checked with probe.sh) etc.

I highly recommend upgrading to the latest stable 3.6.x or 4.0.x version [2]. Then copy the tcp.xml shipped with that version and modify it to fit your env, e.g. replace TCPPING with JDBC_PING etc. I see that you for example still use UNICAST and NAKACK instead of UNICAST3 and NAKACK2 in your config...

[1] https://developer.jboss.org/wiki/Support
[2] https://sourceforge.net/projects/javagroups/files/JGroups/

On 24/05/18 11:46, Development issues wrote:
> Hi
>
> Usually in discount/sale periods we are facing the threads which take care of sending JGroups messages being in TIMED_WAITING state, more precisely on:
>
> java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x00000005d2968450> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
>     at org.jgroups.util.CreditMap.decrement(CreditMap.java:157)
>     at org.jgroups.protocols.MFC.handleDownMessage(MFC.java:102)
>
> It's like the case when there are so many messages that the nodes cannot handle them, and there are not enough credits to be sent between the nodes in order for new messages to be processed.
>
> During peak periods we are having more or less 17 - 20 AWS EC2 instances in our cluster. One of the EC2 instances is dedicated to batch processing and on a few occasions we receive huge files which initiate a big load of messages. We have around 10 nodes serving the user traffic and some more nodes for administration purposes. All of these nodes communicate with each other via JGroups in the cluster using TCP (at the time we migrated to AWS there was a constraint on only using TCP and we are exploring the ways to move to UDP now) and we are using JGroups version 3.4.1.
>
> However, having said that, with the current infrastructure, what should be the proposed JGroups TCP configuration? We feel that it is good practice to optimise our configuration.
>
> Our configuration is as follows:
>
> <config xmlns="urn:org:jgroups"
>         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>         xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.0.xsd">
>     <TCP loopback="true" recv_buf_size="${tcp.recv_buf_size:20M}" send_buf_size="${tcp.send_buf_size:640K}"
>          discard_incompatible_packets="true" max_bundle_size="64K" max_bundle_timeout="30"
>          enable_bundling="true" use_send_queues="true" sock_conn_timeout="300"
>          timer_type="new" timer.min_threads="4" timer.max_threads="10"
>          timer.keep_alive_time="3000" timer.queue_max_size="500"
>          thread_pool.enabled="true" thread_pool.min_threads="10" thread_pool.max_threads="40"
>          thread_pool.keep_alive_time="5000" thread_pool.queue_enabled="false"
>          thread_pool.queue_max_size="10000" thread_pool.rejection_policy="discard"
>          oob_thread_pool.enabled="true" oob_thread_pool.min_threads="5" oob_thread_pool.max_threads="20"
>          oob_thread_pool.keep_alive_time="5000" oob_thread_pool.queue_enabled="false"
>          oob_thread_pool.queue_max_size="10000" oob_thread_pool.rejection_policy="discard"
>          bind_addr="${hybris.jgroups.bind_addr}" bind_port="${hybris.jgroups.bind_port}" />
>     <JDBC_PING connection_driver="${hybris.database.driver}" connection_password="${hybris.database.password}"
>                connection_username="${hybris.database.user}" connection_url="${hybris.database.url}"
>                initialize_sql="${hybris.jgroups.schema}" datasource_jndi_name="${hybris.datasource.jndi.name}"/>
>     <MERGE2 min_interval="10000" max_interval="30000" />
>     <FD_SOCK />
>     <FD timeout="3000" max_tries="3" />
>     <VERIFY_SUSPECT timeout="1500" />
>     <BARRIER />
>     <pbcast.NAKACK use_mcast_xmit="false" exponential_backoff="500" discard_delivered_msgs="true" />
>     <UNICAST />
>     <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="4M" />
>     <pbcast.GMS print_local_addr="true" join_timeout="3000" view_bundling="true" />
>     <UFC max_credits="20M" min_threshold="0.6" />
>     <MFC max_credits="20M" min_threshold="0.6" />
>     <FRAG2 frag_size="60K" />
>     <pbcast.STATE_TRANSFER />
> </config>
>
> Based on this configuration, do you have any recommendation for us to modify anything here to get better throughput?
>
> Thanks in advance
> Simeon
>
> -- Sent from: http://jgroups.1086181.n5.nabble.com/JGroups-Dev-f6604.html

-- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2018-05-24 10:02:26
|
Hi

Usually in discount/sale periods we are facing the threads which take care of sending JGroups messages being in TIMED_WAITING state, more precisely on:

java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000005d2968450> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
    at org.jgroups.util.CreditMap.decrement(CreditMap.java:157)
    at org.jgroups.protocols.MFC.handleDownMessage(MFC.java:102)

It's like the case when there are so many messages that the nodes cannot handle them, and there are not enough credits to be sent between the nodes in order for new messages to be processed.

During peak periods we are having more or less 17 - 20 AWS EC2 instances in our cluster. One of the EC2 instances is dedicated to batch processing and on a few occasions we receive huge files which initiate a big load of messages. We have around 10 nodes serving the user traffic and some more nodes for administration purposes. All of these nodes communicate with each other via JGroups in the cluster using TCP (at the time we migrated to AWS there was a constraint on only using TCP and we are exploring the ways to move to UDP now) and we are using JGroups version 3.4.1.

However, having said that, with the current infrastructure, what should be the proposed JGroups TCP configuration? We feel that it is good practice to optimise our configuration.

Our configuration is as follows:

<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.0.xsd">
    <TCP loopback="true" recv_buf_size="${tcp.recv_buf_size:20M}" send_buf_size="${tcp.send_buf_size:640K}"
         discard_incompatible_packets="true" max_bundle_size="64K" max_bundle_timeout="30"
         enable_bundling="true" use_send_queues="true" sock_conn_timeout="300"
         timer_type="new" timer.min_threads="4" timer.max_threads="10"
         timer.keep_alive_time="3000" timer.queue_max_size="500"
         thread_pool.enabled="true" thread_pool.min_threads="10" thread_pool.max_threads="40"
         thread_pool.keep_alive_time="5000" thread_pool.queue_enabled="false"
         thread_pool.queue_max_size="10000" thread_pool.rejection_policy="discard"
         oob_thread_pool.enabled="true" oob_thread_pool.min_threads="5" oob_thread_pool.max_threads="20"
         oob_thread_pool.keep_alive_time="5000" oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="10000" oob_thread_pool.rejection_policy="discard"
         bind_addr="${hybris.jgroups.bind_addr}" bind_port="${hybris.jgroups.bind_port}" />
    <JDBC_PING connection_driver="${hybris.database.driver}" connection_password="${hybris.database.password}"
               connection_username="${hybris.database.user}" connection_url="${hybris.database.url}"
               initialize_sql="${hybris.jgroups.schema}" datasource_jndi_name="${hybris.datasource.jndi.name}"/>
    <MERGE2 min_interval="10000" max_interval="30000" />
    <FD_SOCK />
    <FD timeout="3000" max_tries="3" />
    <VERIFY_SUSPECT timeout="1500" />
    <BARRIER />
    <pbcast.NAKACK use_mcast_xmit="false" exponential_backoff="500" discard_delivered_msgs="true" />
    <UNICAST />
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="4M" />
    <pbcast.GMS print_local_addr="true" join_timeout="3000" view_bundling="true" />
    <UFC max_credits="20M" min_threshold="0.6" />
    <MFC max_credits="20M" min_threshold="0.6" />
    <FRAG2 frag_size="60K" />
    <pbcast.STATE_TRANSFER />
</config>

Based on this configuration, do you have any recommendation for us to modify anything here to get better throughput?

Thanks in advance
Simeon

-- Sent from: http://jgroups.1086181.n5.nabble.com/JGroups-Dev-f6604.html |
From: Development i. <jav...@li...> - 2017-09-01 07:07:57
|
Hi all, due to a change by SourceForge (dropping everyone and making everyone re-opt-in, see below), the ML membership has decreased from a few hundred to ~50. Because I was never notified of this and because I don't like it at all, I'm creating a Google group [1] for all JGroups related questions. Please post questions/feedback/suggestions etc. to [1] from now on.

Cheers,

[1] https://groups.google.com/forum/#!forum/jgroups-dev

> Due to new European anti-spam and personal data privacy regulations, we can no longer expose subscriber e-mail addresses to 3rd parties (project administrators, list admins, etc). We didn't want to make this change but were forced to due to changes in law. There is nothing we can do about this.
>
> As for the users decreasing, due to the anti-spam regulation portion, back in June and July we sent several notifications out over a period of several weeks that legacy subscribers wishing to continue to be subscribed to mailing lists will have to re-opt in once. Those who did not respond through invalid e-mail addresses or lack of interest, or those that didn't want to continue to be subscribed to the list, were unsubscribed on July 31st 2017.

-- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2017-08-31 08:00:33
|
On 30/08/17 13:25, Development issues wrote:
> Thanks for your reply.
> I do not see anything abnormal in "hotspots by object size" or "hotspots by object count", except for some object clones of org.jgroups.protocols.MERGE3$MergeHeader$Type.

OK, so then what makes you think JGroups is leaking memory?

> I was wondering whether all TcpConnections may point to an infinite call loop (of send-receive) that causes multiple classes to be created and eventually cause a leak?

No, there should be 1 TcpConnection object *per destination*, e.g. if you have a cluster of 10, and everyone sends messages to everybody else (or multicasts messages), then every member should have 9 TcpConnection objects (each having a reader thread).

> On Tue, Aug 29, 2017 at 7:01 PM, Development issues <jav...@li...> wrote:
>
>> I don't see anything wrong with this: what you're seeing in the screen shot are TcpConnections, and 1 thread for each waiting on I/O...
>>
>> These connections don't use up a lot of memory.
>>
>> It would be more interesting to see "hotspots by object size" and/or "hotspots by object count"...
>>
>> On 29/08/17 10:20, Development issues wrote:
>>> Hi,
>>> For the last 2 years, we have been using JGroups (currently 3.4.8) for messaging between java processes, some are java main and some running under Tomcat. We are now facing memory issues and used a profiler to analyze them, and found a huge increase in the # of JGroups classes. See attached stack traces and image.
>>>
>>> We are using the attached configuration.
>>>
>>> I'll appreciate if someone can tell me what's wrong...
>>>
>>> Thanks in advance.
>>> jgroups_classes.png <http://jgroups.1086181.n5.nabble.com/file/n11406/jgroups_classes.png>
>>> stacktraces.txt <http://jgroups.1086181.n5.nabble.com/file/n11406/stacktraces.txt>
>>> jgroupsConf.xml <http://jgroups.1086181.n5.nabble.com/file/n11406/jgroupsConf.xml>
>>>
>>> -- View this message in context: http://jgroups.1086181.n5.nabble.com/JGroups-memory-leak-tp11406.html
>>> Sent from the JGroups - Dev mailing list archive at Nabble.com.
>>
>> -- Bela Ban | http://www.jgroups.org

-- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2017-08-30 11:25:28
|
Thanks for your reply. I do not see anything abnormal in "hotspots by object size" or "hostspots by object count", except for some object clones of org.jgroups.protocols.MERGE3$MergeHeader$Type. I was wondering whether all TcpConnections may point on an infinite call loop (of send-receive) that causes multiple classes to be created and eventually cause a leak? On Tue, Aug 29, 2017 at 7:01 PM, Development issues < jav...@li...> wrote: > I don't see anything wrong with this: what you're seeing in the screen > shot are TcpConnections, and 1 thread for each waiting on I/O... > > These connections don't use up a lot of memory. > > It would be more interesting to see "hotspots by object size" and/or > "hostspots by object count"... > > > > On 29/08/17 10:20, Development issues wrote: > >> Hi, >> For the last 2 years, we are using JGroups (currently 3.4.8) for messaging >> between java processes, some are java main and some running under Tomcat. >> We are now facing memory issues and used a profiler to analyze them, and >> found out a huge increase in the # of JGroups classes. See attached stack >> straces and image >> >> We are using the attached configuration. >> >> i'll appreciate if someone can tell me what's wrong... >> >> Thanks in advance. >> jgroups_classes.png >> <http://jgroups.1086181.n5.nabble.com/file/n11406/jgroups_classes.png> >> stacktraces.txt >> <http://jgroups.1086181.n5.nabble.com/file/n11406/stacktraces.txt> >> jgroupsConf.xml >> <http://jgroups.1086181.n5.nabble.com/file/n11406/jgroupsConf.xml> >> >> >> >> >> -- >> View this message in context: http://jgroups.1086181.n5.nabb >> le.com/JGroups-memory-leak-tp11406.html >> Sent from the JGroups - Dev mailing list archive at Nabble.com. >> >> ------------------------------------------------------------ >> ------------------ >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> _______________________________________________ >> Javagroups-development mailing list >> >> > -- > Bela Ban | http://www.jgroups.org > > > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Javagroups-development mailing list > > |
From: Development i. <jav...@li...> - 2017-08-29 16:02:07
|
I don't see anything wrong with this: what you're seeing in the screen shot are TcpConnections, and 1 thread for each waiting on I/O...

These connections don't use up a lot of memory.

It would be more interesting to see "hotspots by object size" and/or "hotspots by object count"...

On 29/08/17 10:20, Development issues wrote:
> Hi,
> For the last 2 years, we have been using JGroups (currently 3.4.8) for messaging between java processes, some are java main and some running under Tomcat. We are now facing memory issues and used a profiler to analyze them, and found a huge increase in the # of JGroups classes. See attached stack traces and image.
>
> We are using the attached configuration.
>
> I'll appreciate if someone can tell me what's wrong...
>
> Thanks in advance.
> jgroups_classes.png <http://jgroups.1086181.n5.nabble.com/file/n11406/jgroups_classes.png>
> stacktraces.txt <http://jgroups.1086181.n5.nabble.com/file/n11406/stacktraces.txt>
> jgroupsConf.xml <http://jgroups.1086181.n5.nabble.com/file/n11406/jgroupsConf.xml>
>
> -- View this message in context: http://jgroups.1086181.n5.nabble.com/JGroups-memory-leak-tp11406.html
> Sent from the JGroups - Dev mailing list archive at Nabble.com.

-- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2017-08-29 08:20:09
|
Hi,

For the last 2 years, we have been using JGroups (currently 3.4.8) for messaging between java processes, some are java main and some running under Tomcat. We are now facing memory issues and used a profiler to analyze them, and found a huge increase in the # of JGroups classes. See attached stack traces and image.

We are using the attached configuration.

I'll appreciate if someone can tell me what's wrong...

Thanks in advance.

jgroups_classes.png <http://jgroups.1086181.n5.nabble.com/file/n11406/jgroups_classes.png>
stacktraces.txt <http://jgroups.1086181.n5.nabble.com/file/n11406/stacktraces.txt>
jgroupsConf.xml <http://jgroups.1086181.n5.nabble.com/file/n11406/jgroupsConf.xml>

-- View this message in context: http://jgroups.1086181.n5.nabble.com/JGroups-memory-leak-tp11406.html
Sent from the JGroups - Dev mailing list archive at Nabble.com. |
From: Development i. <jav...@li...> - 2017-08-28 10:37:59
|
Hi all, fyi, I've revamped the workshop and will be teaching 2 workshops in November: [1]. Would be cool to see some of you there! Please forward to people you think may be interested. The cap is 15. Cheers [1] http://belaban.blogspot.ch/2017/08/jgroups-workshops-in-rome-and-berlin-in.html -- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2017-07-25 07:07:24
|
This looks ok to me, haven't tested it though. Report back once you have a reproducible case, with instructions on how to reproduce it, a sample program and config, and I'll take a look. FD_ALL has worked forever, so I assume you may run into network issues every now and then... On 21/07/17 12:49, Development issues wrote: > Custom address looks like: > > import java.util.function.Supplier; > > import org.jgroups.Address; > import org.jgroups.conf.ClassConfigurator; > import org.jgroups.stack.AddressGenerator; > import org.jgroups.util.NameCache; > import org.jgroups.util.UUID; > > public class CustomAddress extends UUID implements AddressGenerator { > > /** > * > */ > > static { > ClassConfigurator.add((short) 12545, CustomAddress.class); > } > > public CustomAddress() { > super(); > } > > public CustomAddress(long mostSigBits, long leastSigBits) { > super(mostSigBits, leastSigBits); > } > > protected CustomAddress(byte[] data) { > super(data); > } > > public static CustomAddress randomUUID(String name) { > CustomAddress retval=new CustomAddress(generateRandomBytes()); > if(name != null) > NameCache.add(retval, name); > return retval; > } > > @Override > public Supplier<? extends UUID> create() { > return CustomAddress::new; > } > > @Override > public Address generateAddress() { > return CustomAddress.randomUUID("master"); > } > > } > > This issue is not reproducible always. My config and custom membership > policy works fine and no problem with the view generation. > > Thanks in advance :-) > > > > -- > View this message in context: http://jgroups.1086181.n5.nabble.com/Continuous-heartbeat-messages-without-View-change-tp11352p11367.html > Sent from the JGroups - Dev mailing list archive at Nabble.com. > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Javagroups-development mailing list > -- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2017-07-21 10:50:06
|
Custom address looks like:

import java.util.function.Supplier;

import org.jgroups.Address;
import org.jgroups.conf.ClassConfigurator;
import org.jgroups.stack.AddressGenerator;
import org.jgroups.util.NameCache;
import org.jgroups.util.UUID;

public class CustomAddress extends UUID implements AddressGenerator {

    static {
        ClassConfigurator.add((short) 12545, CustomAddress.class);
    }

    public CustomAddress() {
        super();
    }

    public CustomAddress(long mostSigBits, long leastSigBits) {
        super(mostSigBits, leastSigBits);
    }

    protected CustomAddress(byte[] data) {
        super(data);
    }

    public static CustomAddress randomUUID(String name) {
        CustomAddress retval = new CustomAddress(generateRandomBytes());
        if (name != null)
            NameCache.add(retval, name);
        return retval;
    }

    @Override
    public Supplier<? extends UUID> create() {
        return CustomAddress::new;
    }

    @Override
    public Address generateAddress() {
        return CustomAddress.randomUUID("master");
    }
}

This issue is not always reproducible. My config and custom membership policy work fine and there is no problem with the view generation.

Thanks in advance :-)

-- View this message in context: http://jgroups.1086181.n5.nabble.com/Continuous-heartbeat-messages-without-View-change-tp11352p11367.html
Sent from the JGroups - Dev mailing list archive at Nabble.com. |
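For context (not from the original thread): a rough sketch of how such an AddressGenerator is typically registered on a channel, assuming the addAddressGenerator() method available in recent 3.x/4.x versions; the config file and cluster name are placeholders.

```java
import org.jgroups.JChannel;

public class CustomAddressDemo {
    public static void main(String[] args) throws Exception {
        // "udp.xml" and "my-cluster" are placeholder names.
        JChannel ch = new JChannel("udp.xml");

        // Register the generator before connecting, so the local address
        // created on connect() is a CustomAddress instance.
        ch.addAddressGenerator(new CustomAddress());

        ch.connect("my-cluster");
        System.out.println("local address: " + ch.getAddress());
        ch.close();
    }
}
```

The membership policy in the next message selects members with instanceof CustomAddress checks, so the generator would have to be registered on every node that should be ranked first in the view.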
From: Development i. <jav...@li...> - 2017-07-21 07:45:52
|
This looks ok to me. I tested this with your custom membership policy and everything worked fine. Pulling the cable is something that has always worked and is supported by FD_ALL. The only code I haven't yet seen is CustomAddress, any special wizardry in there? If you try this with Draw, your config and custom membership policy, and things still don't work, then it must be the env, e.g. firewalls/SELinux/NIC issue etc. Have you tried this with Draw? -- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2017-07-20 13:31:18
|
The custom membership policy is correctly forming the new view and it looks like:

import java.util.Collection;
import java.util.List;

import org.jgroups.Address;
import org.jgroups.Membership;
import org.jgroups.stack.MembershipChangePolicy;

public class CustomMembershipPolicy implements MembershipChangePolicy {

    @Override
    public List<Address> getNewMembership(final Collection<Address> currentMembers, final Collection<Address> joiners,
                                          final Collection<Address> leavers, final Collection<Address> suspects) {
        Membership retval = new Membership();
        // add the beefy nodes from the current membership first
        for (Address addr : currentMembers) {
            if (addr instanceof CustomAddress) {
                retval.add(addr);
            }
        }
        // then from joiners
        for (Address addr : joiners) {
            if (addr instanceof CustomAddress) {
                retval.add(addr);
            }
        }
        // then add all non-beefy current nodes
        retval.add(currentMembers);
        // finally the non-beefy joiners
        retval.add(joiners);
        retval.remove(leavers);
        retval.remove(suspects);
        return retval.getMembers();
    }

    @Override
    public List<Address> getNewMembership(final Collection<Collection<Address>> subviews) {
        Membership mbrs = new Membership();
        Membership retval = new Membership();
        for (Collection<Address> subview : subviews) {
            mbrs.add(subview);
        }
        for (Address addr : mbrs.getMembers()) {
            if (addr instanceof CustomAddress) {
                retval.add(addr);
            }
        }
        retval.add(mbrs.getMembers());
        return retval.getMembers();
    }
}

Yes, I have disconnected the cable of node1. [Note: Unplugging the cable didn't result in the above issue every time.]

Thanks in advance :-)

-- View this message in context: http://jgroups.1086181.n5.nabble.com/Continuous-heartbeat-messages-without-View-change-tp11352p11364.html
Sent from the JGroups - Dev mailing list archive at Nabble.com. |
From: Development i. <jav...@li...> - 2017-07-20 12:24:48
|
The config looks fine to me, but "haven't received a heartbeat from node3 for 4130760 ms, adding it to suspect list" indicates that node1 never installs the new view.

I noticed that you have a custom membership policy ("com.membership.CustomMembershipPolicy"). Is it correctly forming the new view? Can you post that code?

How do you disconnect node1? Do you pull the cable (I assume)?

On 20/07/17 10:43, Development issues wrote:
> Protocol Stack:
>
> <config xmlns="urn:org:jgroups"
>         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>         xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
>     <UDP mcast_port="${jgroups.udp.mcast_port:45588}" ip_ttl="4" tos="8"
>          ucast_recv_buf_size="5M" ucast_send_buf_size="5M" mcast_recv_buf_size="5M" mcast_send_buf_size="5M"
>          max_bundle_size="64K" enable_diagnostics="true" thread_naming_pattern="cl"
>          thread_pool.min_threads="2" thread_pool.max_threads="8" thread_pool.keep_alive_time="30000"/>
>     <PING />
>     <MERGE3 max_interval="30000" min_interval="10000"/>
>     <FD_SOCK/>
>     <FD_ALL/>
>     <VERIFY_SUSPECT timeout="1500" />
>     <BARRIER />
>     <pbcast.NAKACK2 xmit_interval="500" xmit_table_num_rows="100" xmit_table_msgs_per_row="2000"
>                     xmit_table_max_compaction_time="30000" use_mcast_xmit="false" discard_delivered_msgs="true"/>
>     <UNICAST3 xmit_interval="500" xmit_table_num_rows="100" xmit_table_msgs_per_row="2000"
>               xmit_table_max_compaction_time="60000" conn_expiry_timeout="0"/>
>     <pbcast.STABLE desired_avg_gossip="50000" max_bytes="4M"/>
>     <pbcast.GMS print_local_addr="true" join_timeout="2000" view_bundling="false"
>                 membership_change_policy="com.membership.CustomMembershipPolicy" max_bundling_time="50"/>
>     <SEQUENCER />
>     <UFC max_credits="2M" min_threshold="0.4"/>
>     <MFC max_credits="2M" min_threshold="0.4"/>
>     <FRAG2 frag_size="60K" />
>     <FORK />
>     <RSVP resend_interval="2000" timeout="10000"/>
>     <pbcast.STATE_TRANSFER />
> </config>
>
> Thanks in advance :-)
>
> -- View this message in context: http://jgroups.1086181.n5.nabble.com/Continuous-heartbeat-messages-without-View-change-tp11352p11356.html
> Sent from the JGroups - Dev mailing list archive at Nabble.com.

-- Bela Ban | http://www.jgroups.org |
From: Development i. <jav...@li...> - 2017-07-20 08:43:38
|
Protocol Stack:

<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <UDP mcast_port="${jgroups.udp.mcast_port:45588}"
         ip_ttl="4"
         tos="8"
         ucast_recv_buf_size="5M"
         ucast_send_buf_size="5M"
         mcast_recv_buf_size="5M"
         mcast_send_buf_size="5M"
         max_bundle_size="64K"
         enable_diagnostics="true"
         thread_naming_pattern="cl"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="30000"/>
    <PING />
    <MERGE3 max_interval="30000" min_interval="10000"/>
    <FD_SOCK/>
    <FD_ALL/>
    <VERIFY_SUSPECT timeout="1500" />
    <BARRIER />
    <pbcast.NAKACK2 xmit_interval="500" xmit_table_num_rows="100" xmit_table_msgs_per_row="2000"
                    xmit_table_max_compaction_time="30000" use_mcast_xmit="false" discard_delivered_msgs="true"/>
    <UNICAST3 xmit_interval="500" xmit_table_num_rows="100" xmit_table_msgs_per_row="2000"
              xmit_table_max_compaction_time="60000" conn_expiry_timeout="0"/>
    <pbcast.STABLE desired_avg_gossip="50000" max_bytes="4M"/>
    <pbcast.GMS print_local_addr="true" join_timeout="2000" view_bundling="false"
                membership_change_policy="com.membership.CustomMembershipPolicy" max_bundling_time="50"/>
    <SEQUENCER />
    <UFC max_credits="2M" min_threshold="0.4"/>
    <MFC max_credits="2M" min_threshold="0.4"/>
    <FRAG2 frag_size="60K" />
    <FORK />
    <RSVP resend_interval="2000" timeout="10000"/>
    <pbcast.STATE_TRANSFER />
</config>

Thanks in advance :-)

-- View this message in context: http://jgroups.1086181.n5.nabble.com/Continuous-heartbeat-messages-without-View-change-tp11352p11356.html
Sent from the JGroups - Dev mailing list archive at Nabble.com. |
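For context (not from the original thread): a minimal sketch of how a channel built from a config like the one above is typically created, with a receiver that logs view changes; the file name, cluster name and class name are placeholders.

```java
import org.jgroups.JChannel;
import org.jgroups.ReceiverAdapter;
import org.jgroups.View;

public class ViewLogger extends ReceiverAdapter {
    @Override
    public void viewAccepted(View newView) {
        // Called whenever GMS installs a new view, e.g. after FD_ALL and
        // VERIFY_SUSPECT agree that a member has failed.
        System.out.println("new view: " + newView);
    }

    public static void main(String[] args) throws Exception {
        // "my-stack.xml" stands for the configuration shown above.
        JChannel ch = new JChannel("my-stack.xml");
        ch.setReceiver(new ViewLogger());
        ch.connect("my-cluster");
        // ... run the application; a graceful ch.close() makes the member
        // leave the view instead of having to be suspected.
    }
}
```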
From: Development i. <jav...@li...> - 2017-07-20 08:19:35
|
Can you post your configuration?

On 20/07/17 10:08, Development issues wrote:
> I am using JGroups version 4.0.2. I am testing with 4 nodes where each node broadcasts requests that it receives. I am using the POSTMAN tool to send REST requests. I am sending a REST request for a CRUD operation to node 1. While the CRUD request is being processed, I disconnected node 1 from the network. In that case, a view change was not triggered in node 1 and this in turn affects the application's behavior.
>
> The following messages grow in node 1:
>
> DEBUG:org.jgroups.protocols.FD_ALL: haven't received a heartbeat from node3 for 4130760 ms, adding it to suspect list
> WARN:org.jgroups.protocols.FD_ALL: suspecting [node2, node3, node4]
>
> The following message grows in the nodes other than node 1:
>
> WARN:org.jgroups.protocols.UDP: JGRP000032: 126.71: no physical address for node1, dropping message
>
> It was noted that the view change is received after connecting node 1 to the network again.
>
> What is the maximum time interval / number of heartbeat messages? How to resolve / justify this?
>
> Thanks in Advance :-)
>
> -- View this message in context: http://jgroups.1086181.n5.nabble.com/Continuous-heartbeat-messages-without-View-change-tp11351.html
> Sent from the JGroups - Dev mailing list archive at Nabble.com.

-- Bela Ban | http://www.jgroups.org |