Activity for Bela Ban

  • Bela Ban posted a comment on discussion Open Discussion

    Hi Dave, note that the mailing list is at [1]. Can you post your configuration (jgroups.xml)? Does IP:port correlate with the bind_addr you use in the transport? Can you post the startup log with TRACE enabled for TCP, GMS and the discovery protocol (e.g. TCPPING)? The fake version number might be caused by receiving garbage on one of the ports, i.e. a conflict with a different process. I also suggest turning SELinux and the firewall off for this test, to see if this has an impact. [1] https://groups.google.com...

  • Bela Ban posted a comment on discussion Open Discussion

    Not sure I understand your topology. My comments below are therefore of a general nature.
    - Use a network CNI which provides IP multicast support, e.g. one mentioned in [1]
    - Use a different discovery protocol, e.g. KUBE_PING [2] or DNS_PING [3]. Note that you can list multiple discovery protocols in the same config (depends on the JGroups version you use).
    - Use an external discovery service such as GossipRouter / TCPGOSSIP [4]
    - Switch from bridged to host networking
    I hope these suggestions help.
    [1] https://tanmaybatham.medium.com/multicast-in-kubernetes-challenges-solutions-and-implementation-f30c29438f2a...

  • Bela Ban posted a comment on discussion Help

    First off: this config isn't correct: message_processing_policy="0" will not create a MessageProcessingPolicy (it has to be "submit" or "max" (default)), so incoming messages will get dropped, with an exception. If UNICAST3 is removed, messages will indeed get received in the right order, but the thread pool might reorder their delivery, so correct ordering is not guaranteed. The big problem here is that you're using JGroups in a client-server mode, whereas it is designed to be peer-to-peer. Various...

  • Bela Ban posted a comment on discussion javagroups-users

    On 15.09.21 12:45, Rainer Stransky wrote: I am using a small Notification application based on "JGroups V5.0.0 Final" in a small office Windows 10 network. All worked perfectly. A few weeks ago, a new security policy enabled the Windows 10 firewall and my notification application was no longer able to find the other instances. I have now changed the Windows 10 firewall settings. In the extended settings I changed the incoming rules for the used "OpenJDK Platform binary" to allow incoming connections....

  • Bela Ban posted a comment on discussion Help

    You need to set this attribute directly in the GossipRouter: GossipRouter g=new GossipRouter(...); g.maxLength(100_000); On 19.05.21 19:50, Michael Cirioli wrote: One more question - my understanding is that the gossip router is used for TCP communication, so setting max_length on TCP_NIO2 (for example) should ensure that no messages larger than max_length actually get routed, right? Also, I am not sure if you should expect to be able to set max_length for <TCPGOSSIP ...>, it fails for me with...

  • Bela Ban posted a comment on discussion Help

    On 19.05.21 17:48, Michael Cirioli wrote: Thanks for the quick turnaround! I've grabbed the latest (4.2.15) and I am now able to use the new attribute. My simple test shows that it is working as expected! Excellent! Bear in mind that the main use case was that a (possibly malicious) service connects to a JGroups port incidentally, and should not cause the service to allocate too much memory. Cheers, thanks again -mike cirioli On 5/19/2021 8:57:12 AM, Michael Cirioli mikecirioli@gmail.com...
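
The use case described above (a stray or malicious client must not make the service allocate too much memory) can be illustrated with a length check that runs before any buffer is allocated. This is only a minimal sketch in plain Java, assuming a 4-byte length prefix on the wire; `BoundedFrameReader`, `readFrame` and `frame` are hypothetical names and this is not how JGroups implements max_length internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper, not part of JGroups: reads one length-prefixed frame
// and refuses to allocate a buffer larger than maxLength.
class BoundedFrameReader {

    static byte[] readFrame(DataInputStream in, int maxLength) throws IOException {
        int len = in.readInt();              // length announced by the peer
        if (len < 0 || len > maxLength)      // check BEFORE allocating anything
            throw new IOException("frame length " + len + " exceeds limit " + maxLength);
        byte[] buf = new byte[len];
        in.readFully(buf);
        return buf;
    }

    // Test helper: builds a stream containing a length prefix followed by a payload.
    static DataInputStream frame(int declaredLength, byte[] payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(declaredLength);
        out.write(payload);
        return new DataInputStream(new ByteArrayInputStream(bos.toByteArray()));
    }

    public static void main(String[] args) throws IOException {
        byte[] ok = readFrame(frame(3, new byte[]{1, 2, 3}), 100_000);
        System.out.println("accepted frame of " + ok.length + " bytes");
        try {
            // The peer announces a huge frame but sends almost nothing
            readFrame(frame(Integer.MAX_VALUE, new byte[]{0}), 100_000);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the design is the order of operations: the announced length is validated first, so a bogus prefix costs a few bytes of reading rather than a multi-gigabyte allocation.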

  • Bela Ban posted a comment on discussion Help

    Note that I just released 4.2.15.Final. Should take ~1hr to sync to maven central. Cheers, On 19.05.21 15:27, Michael Cirioli wrote: Bela - lol, no worries! it happens to the best of us :) Thank you! -mike cirioli On 5/19/2021 2:17:23 AM, Bela Ban belaban@users.sourceforge.net wrote: Hi Michael, I'm embarrassed: max_length is only used by GossipRouter, but not by either TCP or TCP_NIO2! :-) I created [1] to change this. Cheers, [1] https://issues.redhat.com/browse/JGRP-2559...

  • Bela Ban posted a comment on discussion Help

    Hi Michael, I'm embarrassed: max_length is only used by GossipRouter, but not by either TCP or TCP_NIO2! :-) I created [1] to change this. Cheers, [1] https://issues.redhat.com/browse/JGRP-2559 On 18.05.21 23:56, Michael Cirioli wrote: I am using org.jgroups:jgroups:jar:4.2.14.Final, my configuration file looks like this excerpt: <config xmlns="urn:org:jgroups" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/...

  • Bela Ban posted a comment on discussion Help

    Do you have the correct version? You need at least 4.2.11. What does the error say? On 16.05.21 21:03, Michael Cirioli wrote: Which configuration sections accept that parameter? I assume tcp and nio, but my app doesn't start due to an error saying invalid attributes. Thanks -mike On Sun, May 16, 2021, 2:17 PM Bela Ban belaban@users.sourceforge.net wrote: The attribute is max_length Bela Ban -------- Original message -------- From: Michael Cirioli mikecirioli@users.sourceforge.net...

  • Bela Ban posted a comment on discussion Help

    The attribute is max_length Bela Ban -------- Original message -------- From: Michael Cirioli <mikecirioli@users.sourceforge.net> Date: Fri, May 14, 2021, 11:11 PM To: "[javagroups:discussion] " <18795@discussion.javagroups.p.re.sourceforge.net> Subject: [javagroups:discussion] Configuration example for how to cap max data read by TcpConnection or NioConnection From what I can tell, it's not listed in any .xsd and therefore my test app won't even start. Configuration example for how to cap max data read...

  • Bela Ban posted a comment on discussion Help

    Hi Subhash I don't support such an old version. IIRC, the issue you've run into has been fixed a while ago, don't remember in which version. Regards On 12.01.21 3:30 pm, Subhash C wrote: Hi Bela Ban, Thanks for your reply. First off, I highly recommend to upgrade to a newer version (3.6.7 is 4+ years old)... We have maintaining this version to support an old version of application If the particular packet with the new view is missed by any member, Will the coordinator resend the view again to that...

  • Bela Ban posted a comment on discussion Help

    First off, I highly recommend upgrading to a newer version (3.6.7 is 4+ years old)... Comments inline On 12.01.21 2:16 pm, Subhash C wrote: In our code, we have attached custom org.jgroups.ReceiverAdapter to the channel for the callback. private JChannel channel = new JChannel(is); channel.setReceiver(new JGroupsListener()); class JGroupsListener extends ReceiverAdapter { @Override public void viewAccepted(final View view) { viewChange(view); } @Override public void receive(final Message message) { processMessage(message);...

  • Bela Ban posted a comment on discussion Open Discussion

    Hi Cyndy removing use_fork_join_pool was a simplification; one less knob in the configuration and better maintainability of the code. This setting never yielded better performance, but only added complexity to the configuration. When you use virtual threads (use_fibers=true), the thread pools are ignored, and a new virtual thread is started each time. This is the recommended way to use virtual threads. Cheers On 30.10.20 10:14 pm, Cyndy Koobs wrote: I was wondering what the thought process was for...

  • Bela Ban posted a comment on discussion Open Discussion

    The best place is JIRA [1]. For an overview, see [2] and [3]. 95% of your changes will be to change 'new Message(...)' into 'new BytesMessage(...)' or 'new ObjectMessage(...)'. The transition from 4 to 5 should be trivial. If you run into problems, don't hesitate to post them here. Cheers, [1] https://issues.redhat.com/projects/JGRP?selectedItem=com.atlassian.jira.jira-projects-plugin:release-page&status=released-unreleased [2] http://belaban.blogspot.com/2020/01/first-alpha-of-jgroups-50.html [3]...

  • Bela Ban posted a comment on discussion Open Discussion

    I don't support 3.5; it is from 2014. BTW, your config is missing UNICAST2{3}. On 07.08.20 14:17, reeta Aggarwal wrote: we are using jgroups-3.5.0.Final.jar yes we are using TCPPING as below we are passing current local host in initial_hosts , below is configuration at run time : 04 Aug 2020 16:31:14,842 [main] DEBUG [JGroupsChannelProvider] initialize(): using the following protocol stack... 04 Aug 2020 16:31:14,842 [main] DEBUG [JGroupsChannelProvider] TCP(bind_addr=172.17.152.5;bind_port=6800):TCPPING(initial_hosts=172.17.152.7[6800];port_range=1;timeout=15000;num_initial_members=2):FD(timeout=7500;max_tries=2):VERIFY_SUSPECT(timeout=30000):pbcast.NAKACK2(use_mcast_xmit=false;discard_delivered_msgs=true):pbcast.STABLE(desired_avg_gossip=20000;stability_delay=2000):pbcast.GMS(join_timeout=5000;print_local_addr=false)...

  • Bela Ban posted a comment on discussion Open Discussion

    On 07.08.20 07:55, sneha thakur wrote: Hi, We are getting this issue [TransferQueueBundler,controlChannelSAS_CLUSTER,LABECSAP01-60365] WARN [TCP] JGRP000032: LABECSAP01-60365: no physical address for 766961e5-ebd6-5e5f-6e27-e9321d827688, dropping message Every node has a cache, mapping UUIDs (like the one above) to IP address:port pairs. The cache is populated by the discovery protocol. In your case, the cache didn't have an entry for 766961e5-ebd6-5e5f-6e27-e9321d827688, which means JGroups was unable...
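
The lookup described above (a cache from UUID to IP address:port, filled by discovery, with the message dropped when the entry is missing) can be sketched in plain Java as follows. `AddressCache` and its methods are hypothetical stand-ins for illustration, not JGroups classes, and the IP address and port are made up.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the logical-to-physical address cache; not JGroups code.
class AddressCache {
    private final Map<UUID, InetSocketAddress> cache = new ConcurrentHashMap<>();

    // Called when discovery learns where a member lives
    void add(UUID member, InetSocketAddress physical) {
        cache.put(member, physical);
    }

    Optional<InetSocketAddress> lookup(UUID member) {
        return Optional.ofNullable(cache.get(member));
    }

    // Returns true if the message could be sent, false if it had to be dropped
    boolean send(UUID dest, byte[] payload) {
        Optional<InetSocketAddress> target = lookup(dest);
        if (!target.isPresent()) {
            System.out.println("no physical address for " + dest + ", dropping message");
            return false;
        }
        System.out.println("sending " + payload.length + " bytes to " + target.get());
        return true;
    }

    public static void main(String[] args) {
        AddressCache cache = new AddressCache();
        UUID known = UUID.randomUUID();
        UUID unknown = UUID.fromString("766961e5-ebd6-5e5f-6e27-e9321d827688");
        cache.add(known, new InetSocketAddress("192.168.1.5", 7800)); // made-up address
        cache.send(known, new byte[10]);    // sent
        cache.send(unknown, new byte[10]);  // dropped: discovery never supplied a mapping
    }
}
```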

  • Bela Ban posted a comment on discussion javagroups-development

    I guess DNS_PING for simple deployments. If you use cross-site (RELAY2) deployments, you'd want to use TUNNEL/GossipRouter, as described in my blog. On 01.07.20 18:17, Abhijeet wrote: Thanks Bela for quick help. I have tried DNS_PING and its working for me. Just a short question, Any preference, between using TCPPING, TCPGOSSIP or DNS_PING on K8S? Suggestion to Deploy Jgroup in Kubernetes https://sourceforge.net/p/javagroups/discussion/134002/thread/b0a6db3645/?limit=25#4929 Sent from sourceforge.net...

  • Bela Ban posted a comment on discussion javagroups-development

    On 23.06.20 07:53, Abhijeet wrote: Currently we are using JGroup 4.0.10 in our project using UDP and PING protocol [multicasting]. We are planning to create the docker image of our project and will deploy in K8S (Not using openShift etc). I am very new to K8S, and dont have very much deep understanding on same. While going through Github and forums I found TCP with DNS_PING is viable option to deploy in K8S. So using Jgroup version 4.1.9.Final in our project. Below is Protocol list I am using Protocol[]...

  • Bela Ban posted a comment on discussion Help

    Works for me, with CENTRAL_LOCK2 though... haven't looked at CENTRAL_LOCK for a while. Have you tried with 4.1.8? On 27.11.19 13:06, Heiko Tappe wrote: I see a different behaviour. Client 1 obviously does not release the lock if a tryLock of client 2 is busy. There is no error. But I can tell from calling printLocks afterwards. Without a running tryLock of client 2 releasing the lock works as expected. Join problems https://sourceforge.net/p/javagroups/discussion/18795/thread/ffd1a67189/?limit=25#8111/6404/c432...

  • Bela Ban posted a comment on discussion Help

    On 27.11.19 12:42, Heiko Tappe wrote: Oh. I just noticed that releasing the lock on client 1 does not succeed if client 2 is trying to get a lock with timeout. Is that the expected behaviour? Client1 holds lock X Client2 does a trylock X 20000 // 20secs Client1 unlocks X within 20 secs Client2 will be able to acquire lock X Join problems https://sourceforge.net/p/javagroups/discussion/18795/thread/ffd1a67189/?limit=25#8111 Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/javagroups/discussion/18795/...
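
The expected sequence listed above (client 1 holds X, client 2 calls tryLock with a 20 s timeout, client 1 unlocks within that window, client 2 acquires X) can be reproduced locally with `java.util.concurrent.locks`. This is only a single-JVM analogue with two threads standing in for the two clients; it says nothing about how CENTRAL_LOCK behaves across a cluster, and `TryLockDemo` is a hypothetical name.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Local analogue only: two threads play the roles of client 1 and client 2.
class TryLockDemo {

    static boolean run() throws InterruptedException {
        ReentrantLock x = new ReentrantLock();
        AtomicBoolean acquired = new AtomicBoolean(false);

        x.lock();                                    // client 1 (this thread) holds lock X
        Thread client2 = new Thread(() -> {
            try {
                if (x.tryLock(20, TimeUnit.SECONDS)) {   // client 2: trylock X 20000
                    acquired.set(true);
                    x.unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        client2.start();

        while (!x.hasQueuedThreads())                // wait until client 2 is blocked in tryLock
            Thread.sleep(10);
        x.unlock();                                  // client 1 unlocks well within 20 secs
        client2.join();
        return acquired.get();                       // did client 2 get the lock?
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("client 2 acquired X: " + run());
    }
}
```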

  • Bela Ban posted a comment on discussion Help

    On 27.11.19 12:40, Heiko Tappe wrote: Hmm. Things improved. A lot :-) But I am not exactly sure why. In the last tests I was so focused on the log messages ("Could not join ...") that I didn't check the lock functionality itself. And surprise - it works now, as far as I can tell! But the warnings in the log still remain. Perhaps you should post the full logs of the 2 members' startup One thing I noticed was some strange socket binding with a multi cast address of "192.168.1.21". Maybe things got...

  • Bela Ban posted a comment on discussion Help

    Try setting receive_on_all_interfaces to false, and use 4.1.8.Final

  • Bela Ban posted a comment on discussion Help

    You probably have no route (netstat -nr will show you) from 192.168.x.x to 224.x.x.x. I suggest not using 224.x.x.x, as routers may even discard application traffic to these addresses. I suggest using something like 232.5.5.5 and perhaps adding a route to it: sudo route add -net 232.0.0.0/5 192.168.1.21 On 27.11.19 9:00 AM, Heiko Tappe wrote: Here is what I got... I explicitly defined UDP like this: <UDP mcast_port="${jgroups.udp.mcast_port:45588}" bind_addr="${jgroups.bind_addr:192.168.1.21}" mcast_addr="${jgroups.udp.mcast_addr:224.0.0.0}"...
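
A quick way to see why the configured mcast_addr of 224.0.0.0 is a poor choice is to classify the addresses with `java.net.InetAddress`. The sketch below relies only on the JDK: 224.0.0.0/24 is the link-local multicast block, which routers do not forward, whereas 232.5.5.5 is multicast but outside that block. `McastCheck` and `classify` are hypothetical names used for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Classifies an IP literal using only JDK checks; illustration, not JGroups code.
class McastCheck {

    static String classify(String literal) throws UnknownHostException {
        InetAddress a = InetAddress.getByName(literal);   // IP literal, so no DNS lookup
        if (!a.isMulticastAddress())
            return "not multicast";
        if (a.isMCLinkLocal())                            // 224.0.0.0/24 for IPv4
            return "link-local multicast";
        return "multicast outside the link-local block";
    }

    public static void main(String[] args) throws UnknownHostException {
        for (String addr : new String[]{"224.0.0.0", "232.5.5.5", "192.168.1.21"})
            System.out.println(addr + ": " + classify(addr));
    }
}
```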

  • Bela Ban posted a comment on discussion Help

    On 26.11.19 11:27 AM, Heiko Tappe wrote: Let's start with: I am a newbie to jgroups. So be patient with me ;-) always am... :-) What I try to achieve are distributed locks. Right now I do my first tests locally with one node. And some things seem to work already (a bit): I can get a lock and a second attempt in the same session (same lock service instance) to get a lock fails. But another instance of the lock service does not seem to "see" any locks held by the other "session". The problem might...

  • Bela Ban posted a comment on discussion javagroups-users

    Apparently, the member whose log is shown below didn't receive any heartbeats from all of the 4 members for 14 secs and excluded them (after the double-check by VERIFY_SUSPECT). On 02.10.19 16:06, ankit kumar garg wrote: HI , The members are leaving the view in running system. Attaching the logs : could not understand the root cause or fix. pls help 2019-09-21 05:18:58.479 TRACE org.jgroups.protocols.UNICAST3 - seliics02331-8367 --> DATA(seliics02333-52690: #1760889, conn_id=6) 2019-09-21 05:18:58.479...
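
The mechanism described above (no heartbeat from a member for longer than a timeout, so the member is suspected) can be sketched as a small timestamp table. This is a simplified illustration in plain Java, assuming a 14 s timeout to match the log; `HeartbeatTracker` is a hypothetical class, and the VERIFY_SUSPECT double-check is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified heartbeat bookkeeping; not the FD_ALL implementation.
class HeartbeatTracker {
    private final Map<String, Long> lastHeard = new TreeMap<>(); // member -> time of last heartbeat (ms)
    private final long timeoutMs;

    HeartbeatTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    void heartbeat(String member, long nowMs) {
        lastHeard.put(member, nowMs);
    }

    // Members from which nothing has been heard for at least timeoutMs, in name order
    List<String> suspects(long nowMs) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeard.entrySet())
            if (nowMs - e.getValue() >= timeoutMs)
                result.add(e.getKey());
        return result;
    }

    public static void main(String[] args) {
        HeartbeatTracker fd = new HeartbeatTracker(14_000);
        fd.heartbeat("A", 0);
        fd.heartbeat("B", 10_000);
        System.out.println(fd.suspects(14_000)); // A has been silent for 14 s, B for only 4 s
        System.out.println(fd.suspects(24_000)); // now both have been silent for at least 14 s
    }
}
```

Note that the tracker cannot tell a crashed member from one whose heartbeats were lost or delayed (long GC pause, network problem); all it sees is silence.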

  • Bela Ban posted a comment on discussion Help

    On 19.08.19 17:19, Maximilian Gerhard wrote: Hello, I am trying to find out if there was any message order bugs on jgroup version 3.5.1.Final. I know its an old version. I have a report with log files that indicate that messages didn't arrive in same order they were sent. Who's reporting this? If this is in application code, can you be sure that the application does not reorder messages (e.g. using a thread pool for processing)? I assume you require total order (not per-sender FIFO)? Note that SEQUENCER...

  • Bela Ban posted a comment on discussion Help

    The lock notifications are emitted at the coordinator (lock server) only. I didn't want to send messages to all members when a lock was created/locked/unlocked; too much overhead. If you need this kind of information, I suggest another channel (e.g. a ForkChannel) to disseminate this info.

  • Bela Ban posted a comment on discussion Open Discussion

    On 24.05.19 11:00, Maximilian Gerhard wrote: Distributed locking, hmm... have you seen my thoughts about this [1]? I don't think distributed locks are a good idea. I did read that part several times and thought about the impact for my application. Thx for the hint. At the end I came to the conclusion that the impact is not a problem in my use case. I need the distributed lock basically just as some kind of synchronization to prevent problems when the same user interaction is tried on 2 or more nodes...

  • Bela Ban posted a comment on discussion Open Discussion

    On 23.05.19 13:20, Maximilian Gerhard wrote: Hello, I am using jgroups since a while (currently 4.0.19-Final) in my application and so far everything is fine. Thanks for this awesome framework. You're welcome! I need to integrate a locking protocol now. Distributed locking, hmm... have you seen my thoughts about this [1]? I don't think distributed locks are a good idea. I am using the udp.xml with CENTRAL_LOCK2 at the top and started the LockServiceDemo on 2 different notebooks. So far everything...

  • Bela Ban posted a comment on discussion Open Discussion

    Version 3.5.1 is ~5 years old; I don't support such an old version. Please upgrade to a newer version! If you can reproduce this on a newer version, I'd be happy to look into this. Cheers,

  • Bela Ban posted a comment on discussion Open Discussion

    What do you mean? I see only a single merge, creating merge-view [192.168.11.103:7801:1332-22808|3] (view number 4), consisting of 2 members. This looks okay to me!

  • Bela Ban posted a comment on discussion Open Discussion

    Clicking on "start discarding" in either node should cause a split brain. This can be merged after clicking on "stop discarding". As an alternative to the GUI, you could use probe.sh to do this: probe.sh -addr HHH:PPP jmx=DISCARD.discard_all=true

  • Bela Ban posted a comment on discussion Open Discussion

    On 10.04.19 14:35, Jikku Joyce wrote: Is 4.1.0 released? As I could see, only upto 4.0.18 is there in mvnrepo. No, I still need to close 2 more issues to release 4.1.0. Will it work if I run on different nodes with TCP? Yes. Or use different IP addresses, e.g. IP aliases, if you need to run on the same box Will inject view work on different nodes with TCP? Yes, it should Is the below command the right way to inject view? op=INJECTVIEW.injectView["machinename:7800:1331-8579=machinename:7800:1331-8579;machinename:7800:1332-5780=machinename:7800:1332-5780;"]...

  • Bela Ban posted a comment on discussion Open Discussion

    Re showing the same member twice: this is an issue that has been fixed in 4.1.0 [1]. This only occurs for TCP and multiple members running on the same box. [1] https://issues.jboss.org/browse/JGRP-2336

  • Bela Ban posted a comment on discussion Open Discussion

    Also: 'netstat -na |grep 780': does it show both instances?

  • Bela Ban posted a comment on discussion Open Discussion

    What's the output of running 'probe.sh -addr 192.168.79.126 member-addrs'?

  • Bela Ban posted a comment on discussion Open Discussion

    On 28.03.19 08:15, Jikku Joyce wrote: Hi, Say I have a cluster of 3 nodes, N1, N2 and N3 and N1 was the coordinator. N1 left the cluster like a split brain scenario. Then N2 becomes the Coordinator and N1 is a coordinator in its own cluster. Then again if N1 joins the original cluster, then will N1 be the coordinator or N2 be the coordinator? Depends on the MembershipChangePolicy (pluggable) [1]. The default is to sort all members according to their UUIDs, and the first member in the set becomes new...
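
The default rule as described above (sort all members by UUID, take the first) can be sketched in a few lines. This assumes `java.util.UUID` and its natural ordering purely for illustration; JGroups has its own address classes whose comparison may differ, and `CoordinatorPick` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

// Sketch of "sort members, first one wins"; not the JGroups MembershipChangePolicy code.
class CoordinatorPick {

    static UUID coordinator(List<UUID> subgroupA, List<UUID> subgroupB) {
        List<UUID> merged = new ArrayList<>(subgroupA);
        for (UUID member : subgroupB)
            if (!merged.contains(member))
                merged.add(member);
        Collections.sort(merged);   // assumption: java.util.UUID's natural ordering
        return merged.get(0);       // first member in the sorted set
    }

    public static void main(String[] args) {
        UUID n1 = new UUID(0L, 3L), n2 = new UUID(0L, 1L), n3 = new UUID(0L, 2L);
        // N1 was split off on its own; N2 and N3 stayed together
        UUID coord = coordinator(Arrays.asList(n1), Arrays.asList(n2, n3));
        System.out.println("new coordinator is N2: " + coord.equals(n2));
    }
}
```

Under this rule the outcome depends only on the sort order of the merged membership, not on which node happened to be coordinator before the split.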

  • Bela Ban posted a comment on discussion Open Discussion

    Well, FRAG2 is not in your stack. Try STATE_TRANSFER instead. You cannot place FORK directly above the transport, it needs to be at the top of the stack!

  • Bela Ban posted a comment on discussion Open Discussion

    Your main mistake was to add FORK just above the transport (TCP). Instead, FORK needs to be placed at the top of the protocol stack: public class bla4 extends ReceiverAdapter { protected JChannel ch; protected void start() throws Exception { ch=new JChannel(); // uses udp.xml by default TP transport=ch.getProtocolStack().getTransport(); transport.setBindAddress(Util.getLocalhost()); ch.connect("demo"); ForkChannel jGroupsForkChannel=new ForkChannel(ch, "stackId", "channelId", true, ProtocolStack.Position.ABOVE,...

  • Bela Ban posted a comment on discussion Open Discussion

    On 25.03.19 13:17, Jikku Joyce wrote: Hi, I am using fork channel api to fork my main channel. I have set it as a receiver using my implementation of receiver adapter. I can send and receive messages but cannot use the inherited method "viewAccepted" which should theoretically get invoked when there is a change in the jgroups view. Take a look at the program below. I set a ReceiverAdapter and implement viewAccepted(), which is called. But I do not get control to this method when I add a node or remove...

  • Bela Ban posted a comment on discussion Help

    I doubt this is the root cause; killing a node and restarting it has worked for decade(s)... If you come up with a reproducer (including code, configuration and instructions to reproduce), I'll take a look. Note that if you remove UNICAST3, you will destroy ordering guarantees for unicast messages. [1] http://www.jgroups.org/manual4/index.html#UNICAST3
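
To make the ordering guarantee concrete, here is a minimal sketch of sequence-number-based in-order delivery for a single sender: messages that arrive early are buffered until the gap before them is filled. It is a generic illustration of the technique, assuming seqnos start at 1, and is not the UNICAST3 implementation; `InOrderDelivery` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Generic per-sender reorder buffer; illustration only, not UNICAST3 code.
class InOrderDelivery {
    private long next = 1;                                     // next seqno to deliver
    private final Map<Long, String> pending = new HashMap<>(); // early arrivals
    private final List<String> delivered = new ArrayList<>();

    void receive(long seqno, String payload) {
        if (seqno < next)
            return;                                            // duplicate of something already delivered
        pending.put(seqno, payload);
        String p;
        while ((p = pending.remove(next)) != null) {           // deliver as long as there is no gap
            delivered.add(p);
            next++;
        }
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        InOrderDelivery d = new InOrderDelivery();
        d.receive(2, "b");                 // arrives early, buffered
        d.receive(3, "c");                 // buffered as well
        System.out.println(d.delivered()); // nothing delivered yet
        d.receive(1, "a");                 // fills the gap
        System.out.println(d.delivered()); // a, b, c in order
    }
}
```

Without such a layer, whatever order the network or a thread pool happens to produce is the order the application sees.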

  • Bela Ban posted a comment on discussion Open Discussion

    On 22.02.19 12:03, Rashmi wrote: Hello Ban, Thanks a lot for your help so far !! Can you please help to clarify one more doubt.. We have these proeprties set: jgroups_cluster.property_string=TCP(bind_addr=172.20.242.100;bind_port=8241):TCPPING(initial_hosts=172.20.242.100[8241],172.20.242.100[8256],172.20.242.105[8241],172.20.242.105[8256],172.20.242.106[8241],172.20.242.106[8256];port_range=0;timeout=5000;num_initial_members=2):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=48;level=ERROR):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK2(use_mcast_xmit=false;retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):UNICAST3:pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)...

  • Bela Ban posted a comment on discussion Open Discussion

    Possibly, but what are the backslashes (\) doing in the config? I suggest using an XML config; the plain-string config is not recommended anymore. On 11.02.19 12:34, reeta aggarwal wrote: Hi i have updated to following configuration: TCP(bind_port{1};max_bundle_size=256k;max_bundle_timeout=30;use_send_queues=true;sock_conn_timeout=300;thread_pool.enabled=true;thread_pool.min_threads=2;thread_pool.max_threads=8;oob_thread_pool.enabled=true;oob_thread_pool.min_threads=1;oob_thread_pool.max_threads=8):\ TCPPING(async_discovery=true;initial_hosts={2};port_range=1;timeout=3000;num_initial_members=2):\...

  • Bela Ban posted a comment on discussion Open Discussion

    These configs are weird, where did you get them from? I suggest either upgrading to a newer version and/or using at least a config shipped with the version you currently use (and then making modifications, e.g. to TCP/TCPPING). Some observations:
    - Having both FD and FD_ALL is redundant
    - FD_SOCK is missing
    - The thread pools in TCP are disabled! Why?
    - UNICAST3 is missing
    - FRAG2 is in the wrong place
    - There's no merging protocol
    etc etc etc
    On 11.02.19 10:09, reeta aggarwal wrote: i m using two channels control...

  • Bela Ban posted a comment on discussion Open Discussion

    On 11.02.19 07:59, reeta aggarwal wrote: Hi, i am using jgroups jgroups-3.5.0.Final.jar . when i am rebooting by secondary node . my system gets shutdown after some time. there are few messages as shown below in my server logs .there is ambious address coming up after reboot which is causing the issue. DEBUG [NAKACK2] LABECSAP01-60365: removed 766961e5-ebd6-5e5f-6e27-e9321d827688 from xmit_table (not member anymore) This means you received a message from a node which was expelled from the cluster,...

  • Bela Ban posted a comment on discussion Open Discussion

    On 07.02.19 08:59, Rashmi wrote: We are using Jgroup 3.4.1. Ouch! :-) Customer is using 6 nodes in the network and after restart, immediately within 40s, receiving the message saying haven't received a heartbeat from 2432c2fa-acda-c0c0-3f88-469b7781f968 for 41088 ms, adding it to suspect list So this must be from FD_ALL (btw: remove FD, you don't need FD_ALL and FD!). .. Cluster communication is broken with this error. Customer is using Windows 2012 R2 with IPV6 underlying network and want cluster...

  • Bela Ban posted a comment on discussion Open Discussion

    On 06.02.19 05:26, Puneet Srivastava wrote: Hi, Thanks for your response. We will try to open the case. It should be with red hat right? Correct. If you have an EAP subscription, you should be able to file a case with Red Hat online. Is the fix was said to be available in 3.2.5 and 3.3. Is it available in 3.2.12 also as it is higher version of 3.2.5? Yes, but you're assuming that JGRP-1554 is the root cause of your problem, which I don't know. Thanks Puneet On Tue, Feb 5, 2019, 6:02 PM Bela Ban <belaban@users.sourceforge.net...

  • Bela Ban posted a comment on discussion Open Discussion

    This version is 5+ years old; I don't support such old versions, but since you have EAP I suggest opening a case. On 02.02.19 13:49, Puneet Srivastava wrote: Hi, We are Using EAP6.2.4 Which uses internally JGROUP 3.2.12. We have been facing issue in server start up. On some occasion the servers fails to join the cluster during and server never starts. Below is the message keeps printing the logs. similar issue is of the link http://jgroups.1086181.n5.nabble.com/Cluster-coordinator-problem-td6586.html...

  • Bela Ban posted a comment on discussion Open Discussion

    JGroups does work correctly with IPv6, but your routing table has to be set up correctly. On 21.01.19 16:08, karan arora wrote: Hi, I didn't quite follow the conclusion on the thread. What i understood from this thread is that the problem is with the network configuration and not with jgroups. Jgroups should work correctly with 'preferIPv4Stack=false' and should work fine with IPv6 stack as well. is that the correct understanding here or jgroups will not be able to work with IPv6 stack. JGRP000200:...

  • Bela Ban posted a comment on discussion Open Discussion

    Correct. On 31/12/18 7:18 AM, sandeep wrote: Thank you for the input, i will take a look into it. Does that mean, i need to use -Djava.net.preferIPv4Stack=true irrespective if IPV4 or IPV6. If the configuration is set properly it works in both IPV4 or IPV6 with the option set ? JGRP000200: failed sending discovery request: java.io.IOException: Network is unreachable https://sourceforge.net/p/javagroups/discussion/18794/thread/c32a72f3e7/?limit=25#1b86 Sent from sourceforge.net because you indicated...

  • Bela Ban posted a comment on discussion Open Discussion

    You probably have your routing table set up incorrectly; MPING.mcast_addr (default: 230.5.6.7 in IPv4) needs to have either a route in your routing table, or a default route (0.0.0.0). In some cases, the combination of network interface (e.g. loopback) and mcast route doesn't match, see [1] for details [1] https://github.com/belaban/JGroups/wiki/Multicast-routing-on-Mac-OS On 17.12.18 04:06, sandeep wrote: As part of hibernate upgrade from 3 to 5, I had to upgrade infinispan to from 4 to 8 and jgroups...

  • Bela Ban posted a comment on discussion Help

    In the server config, you're missing STABLE. This will lead to OOMEs for multicast messages over time. Why have you removed it? Other than that, your env seems quite complicated... I suggest reducing it to the smallest possible setup that reproduces an OOME and re-posting. On 12.11.18 11:09, Valerii Pekarskyi wrote: Thanks for the answer! I am using JGroups 4.0.15.Final, and attached cluster configs to the first post in the thread. Did no customization in "client" cluster, and "server" cluster has UDP...

  • Bela Ban posted a comment on discussion Help

    A jgroups-temp-thread is created when the thread pool is exhausted; so when looking at your config I assume your thread pool is too small...
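
The behaviour mentioned above (a temporary thread is spawned when the pool is exhausted) can be imitated with a plain `ThreadPoolExecutor` and a rejection handler. The sketch below uses a deliberately tiny pool of one thread with no queue; the class name, thread name and pool sizes are assumptions made for the demo and do not reflect JGroups' internal code or its defaults.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Imitation of "pool exhausted -> run the task on a temporary thread"; not JGroups code.
class TempThreadDemo {

    static int run() throws InterruptedException {
        AtomicInteger tempThreads = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 30, TimeUnit.SECONDS,          // deliberately too small: a single thread
                new SynchronousQueue<>(),            // no queueing, tasks are handed off directly
                (task, executor) -> {                // called when the pool cannot take the task
                    int n = tempThreads.incrementAndGet();
                    new Thread(task, "temp-thread-" + n).start();
                });

        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(2);
        Runnable slow = () -> {
            try {
                release.await();                     // keep the thread busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            done.countDown();
        };

        pool.execute(slow);                          // occupies the only pool thread
        pool.execute(slow);                          // pool exhausted, goes to a temporary thread
        release.countDown();
        done.await();
        pool.shutdown();
        return tempThreads.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("temporary threads created: " + run());
    }
}
```

Seeing such threads regularly is therefore a hint that the pool's maximum size is too low for the load.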

  • Bela Ban posted a comment on discussion Help

    Which version of JGroups and can you post your config? On 09/11/18 12:35 PM, Valerii Pekarskyi wrote: Hi there! Last several days I struggle with sporadic failures of instances running inside the JGroups cluster. I believe there is some cluster misconfiguration, and I try to find correct configuration, but with no much luck yet. May I ask for some advice? There are two types of failures I see in logs. There is a positive side that failed JVM generates heap dump that allows better diagnosis, so there...

  • Bela Ban posted a comment on discussion javagroups-users

    This has been fixed and merged onto master. Please check out master and see if this fixes your problem. Cheers

  • Bela Ban posted a comment on discussion javagroups-users

    This makes a lot of sense; I've created [1] to track this. This is the second item on my todo list, and will be fixed in 4.0.16. Cheers, [1] https://issues.jboss.org/browse/JGRP-2305

  • Bela Ban posted a comment on discussion Help

    Yes, this is possible, but perhaps you should consider using something more dynamic (such as TCPGOSSIP) if your membership changes frequently. For the existing nodes, you can grab TCPPING as follows: JChannel ch; TCPPING ping=ch.getProtocolStack().findProtocol(TCPPING.class); ping.setInitialHosts(...) OR ping.getInitialHosts().add(...) This could be done via a message that's sent to all existing nodes, at the application level. Note that you should also modify the XML config, if a node is restarted...

  • Bela Ban posted a comment on discussion Help

    Try a standalone JGroups cluster with the config below. I don't think sock_conn_timeout should have an effect, especially in discovery where you defined async_discovery to be true. Perhaps reduce port_range, as this will lead to (port_range * initial_hosts) connection attempts. Cache synchronization is Infinispan-specific; I suggest asking this question on the Infinispan mailing list. On 18/09/18 22:17, Suresh wrote: We have multiple nodes behind load balancer and each nodes maintain own cache if the cache...

  • Bela Ban posted a comment on discussion Help

    What cache sync? Do you have a config you could post? Also post a stack trace of the exception you're seeing. On 17/09/18 16:29, Suresh wrote: Hi, In our jgroups 4.x configuration we have sock_conn_timeout="300" then cache sync is working fine. When we increase more nodes then SocketTimeout exception is coming. Then we tried to change sock_conn_timeout="2000" then cache sync is not working. If my understanding is correct then this cache sync should work for 2000 also because it has more timeout value....

  • Bela Ban posted a comment on discussion Help

    No, these values look good. You never even exceed the min number of threads (10). The thread pool is therefore sized correctly.

  • Bela Ban posted a comment on discussion Help

    I doubt this is the root cause; as a matter of fact, TCP is much more battle-tested than TCP_NIO2. You could run probe.sh to look at thread pool sizes, to see if your pool is too small.

  • Bela Ban posted a comment on discussion Help

    Perhaps your threads compete for the same keys? I don't know; as I said, with the little and generic information you posted, it is almost impossible to know... Note that your config is missing UNICAST3... but that's certainly not the cause of the slowness.

  • Bela Ban posted a comment on discussion Help

    Hard to tell what causes this slowness... My suggestion is to run this under JMC or some other profiler, to find out what's causing this. If this is caused by Infinispan (e.g. a click on an item results in a get()), then it may be Infinispan or JGroups. In that case, I suggest writing a simple standalone application which mimics the way Infinispan is used (e.g. gets and puts), and uses your existing config; then I can take a look. On 15/08/18 10:28, Suresh wrote: We have recently upgraded our application...

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    On 07/08/18 10:20, Joseph Leonard wrote: JGroups version We are using the JGroups 3.6.2.Final version and this issue is specifically related to the |CENTRAL_EXECUTOR| and |Executing| protocols. I appreciate this code base is now a couple of years behind trunk but from reviewing the trunk in GitHub (and the circa 10 subsequent commits across these classes) it appears this issue is still present. OK Issue description During cluster instability, which is causing a number of view changes, we have been...

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    This is not the code I wanted to see! I wanted to see the code you use in receive(Message) to de-serialize the object from the message's byte array, and the code you use to serialize an object to a byte array that's then passed to the constructor of the Message. BTW: the code you show above does not take a possible offset into account, ie. if the message had a byte array of 64, but the data was at offset=16 and length=54, your code would be incorrect!
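
    A minimal sketch of what offset-aware (de)serialization could look like, using plain Java serialization. The class and method names are made up for illustration; in a real receive(Message) the three arguments would come from the message's byte array, msg.getOffset() and msg.getLength().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

final class OffsetAwareSerialization {
    /** Serializes an object into a fresh byte array (offset 0, full length). */
    static byte[] toByteArray(Object obj) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(obj);
        }
        return out.toByteArray();
    }

    /** Reads exactly the [offset, offset+length) window instead of the whole array. */
    static Object fromByteArray(byte[] buf, int offset, int length)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(buf, offset, length))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = toByteArray("hello");
        int offset = 16; // the data does not start at index 0
        byte[] buffer = new byte[offset + payload.length + 8];
        System.arraycopy(payload, 0, buffer, offset, payload.length);
        System.out.println(fromByteArray(buffer, offset, payload.length)); // hello
    }
}
```

    Reading the same buffer from index 0 would hit the padding bytes first and fail with a StreamCorruptedException (invalid stream header).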

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    How do you serialize and un-serialize your data to and from a byte array? Do you have some sample code?

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    Note that JGroups does support IPv6, but then all addresses have to be either IPv4 or IPv6; you can't mix-n-match. Your config code looks fine

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    You're binding to 127.0.0.1 ('physical address=127.0.0.1:10601'); this will only work for processes on the same box. What's your discovery protocol? Post the config, please. You have a mixture of 3.x and 4.x members: "packet from fe80:0:0:0:45ed:5b40:960d:cd9f%12:10600 has different version (3.4.3) than ours (4.0.10); packet is discarded". You also mix IPv4 and IPv6: I suggest using -Djava.net.preferIPv4Stack=true to start your processes.
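
    A quick way to spot the IPv4/IPv6 mix before starting the cluster is to parse the addresses you intend to use and check their families. This is only a sketch with a hypothetical helper; it takes literal IP addresses (no scope id) so that no DNS lookup is involved.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

final class AddressFamilyCheck {
    /** Returns true if the list contains both IPv4 and IPv6 literals. */
    static boolean mixesFamilies(List<String> literals) throws UnknownHostException {
        boolean sawV4 = false;
        boolean sawV6 = false;
        for (String literal : literals) {
            InetAddress addr = InetAddress.getByName(literal); // literal IPs are parsed, not resolved
            if (addr instanceof Inet4Address) {
                sawV4 = true;
            } else if (addr instanceof Inet6Address) {
                sawV6 = true;
            }
        }
        return sawV4 && sawV6;
    }

    public static void main(String[] args) throws UnknownHostException {
        List<String> members = Arrays.asList("127.0.0.1", "fe80:0:0:0:45ed:5b40:960d:cd9f");
        System.out.println("mixed families: " + mixesFamilies(members));
    }
}
```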

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    Perhaps you're binding to the wrong network interface in dev, but the correct one in prod. Can you post your config? Also post the logs for all 4 members. I'm interested in org.jgroups.protocols.pbcast.GMS and the discovery protocol at TRACE level.

  • Bela Ban Bela Ban modified a comment on discussion javagroups-users

    What about offset/length? Have you read my previous comment?

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    What about offset/length? Have you read my prev comment?

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    Use the BAOS constructor that takes offset and length into account. You cannot ignore those! Sent from my Samsung device -------- Original message -------- From: krishna karthik karthikk@users.sourceforge.net Date: 7/24/18 12:17 PM (GMT+01:00) To: "[javagroups:discussion]" 130427@discussion.javagroups.p.re.sourceforge.net Subject: [javagroups:discussion] StreamCorruptedException: invalid stream header Hi The code is just serializing the data. Not taking into consideration the msg.getOffset() and...

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    What does at fromByteArray() do? com.karthik.rm.cluster.router.jgroups.serialize.SerializableByteArray.fromByteArray(SerializableByteArray.java:109) Does it read the byte array starting at offset (msg.getOffset()) and take msg.getLength() into account?

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    Do the hosts connect to the GR at all? You can start the gossiprouter with JGroups/bin/gossiprouter.sh -dump_msgs true -jmx true, then connect to it and dump the routing table, to see who's connected. Also, if you enable TRACE level logging for org.jgroups.protocols.TCP/TCPGOSSIP, you'll see the actual bind addresses. You may also want to use probe.sh to inspect TCP.bind_addr/bind_port/external_addr/external_port and TCPGOSSIP.initial_hosts.

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    Right, I traded compatibility for better maintainability in {A}SYM_ENCRYPT. In other words, if you use ASYM_ENCRYPT, it won't be backwards compatible. My bad...

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    No, this is wrong!
    In TCP:
    * external_addr="MY_PUBLIC_IP"
    * external_port: leave this empty (so it will be 7801, same as bind_port)
    In TCPGOSSIP:
    * initial_hosts="GOSSIPROUTER[12000]" // possibly list multiple GRs here

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    List the public IPs in initial_hosts

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    You need to set bind_addr="local_IP" and external_addr="public_IP"

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    Can you ping 14.142.204.195 from the other box? Perhaps ICMP is disabled, or dropped by the VPN, so try using a tool such as netcat to connect to a service running on 14.142.204.195 instead. I'm afraid I won't be able to help you much here; if nc / ping don't work, JGroups will definitely not work, either! You should contact your networking support org to resolve this!
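
    If netcat isn't available on the box, the same check can be done from Java with a plain socket and a connect timeout. This is just a sketch; the default host and port in main are placeholders, so pass the address and port of a service that is actually listening on the other box.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class TcpReachability {
    /** Tries to open a TCP connection; false on refusal, timeout or unknown host. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.1";           // placeholder host
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7800;    // placeholder port
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

    If this returns false between the two networks, a TCP-based JGroups stack will not be able to connect either.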

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    No. And TCP won't work either, if you cannot establish a TCP connection from any of the boxes to any other to the 14.x.x.x address.

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    Why do you use external_addr? Can't you ping 192.168.225.170 (for example) directly from the other network? You have both MPING and TCPPING in your stack; this won't work and only MPING will be used here. If you use external_addr, then you need to list the 14.x.x.x addresses in TCPPING.initial_hosts.

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    What's an 'external network'? If members can ping each other, then things will work. What config do you use?

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    The reason is that you're calling dumpStats() after the channel has been disconnected, and 'bundler' was set to null. I fixed this on master (commit 42bcc3b13f1fe677d00e135798e4ed38df82b972), but calling dumpStats() should be done on a connected channel. Cheers, On 25/05/18 06:10, alka pandey wrote: Hi, can you please let us know about these warning/error messages? While starting and stopping the application, I am getting these errors. (Upgrading my jgroup version 3.4.3 Version 4.0.10, This...

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    These protocols don't open any ports. Protocols that do include FD_SOCK, UDP/TCP/TCP_NIO/TCP_NIO2 and STATE_SOCK. Not sure if this list is exhaustive though...

  • Bela Ban Bela Ban posted a comment on discussion javagroups-development

    On 13/03/18 06:59, Rajat wrote: Hi, I am upgrading 2 node cluster one by one from 2.6.15 to 3.5.0, as i upgrade First node it stops talking to other node on 2.6.15 version. Yes, these versions are not binary-compatible, therefore they cannot join the same cluster and 'understand' each other. Due to this all of request that comes on Second node got failed, Can you please provide a RCA for this JGroup Upgrade failure so that same can be communicated to Customer by us. What's an RCA? I've started...

  • Bela Ban Bela Ban posted a comment on discussion javagroups-development

    What you want to do is currently not possible, as major/minor versions aren't backward compatible (only patch versions). There's work in progress which will fix this, see [1]. [1] https://github.com/jgroups-extras/RollingUpgrades On 08/02/18 12:30, Chetana Dixit wrote: Hi, I am upgrading 3 node cluster one by one from 3.4.3 to 4.0.10, as i upgrade First node it stops talking to other 2 nodes on 3.4.3 version. As i upgrade second node also, first and second node start communicating but third node...
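
    The compatibility rule above (same major and minor version, patch level may differ) can be written down as a small check. This only restates the rule from the comment with a made-up helper; it is not the check JGroups itself performs.

```java
final class VersionCompatibility {
    /** True if both version strings share major and minor, e.g. 4.0.8 and 4.0.10. */
    static boolean sameMajorMinor(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        if (x.length < 2 || y.length < 2) {
            return false; // not enough components to compare
        }
        return Integer.parseInt(x[0]) == Integer.parseInt(y[0])
            && Integer.parseInt(x[1]) == Integer.parseInt(y[1]);
    }

    public static void main(String[] args) {
        System.out.println(sameMajorMinor("3.4.3", "4.0.10")); // false
        System.out.println(sameMajorMinor("4.0.8", "4.0.10")); // true
    }
}
```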

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    I suggest experiment enabling/disabling remove_old_coords_on_view_change and remove_all_data_on_view_change, as described in [1] [1] http://www.jgroups.org/manual4/index.html#_jdbc_ping On 11/01/18 19:56, Saravanan wrote: Hi, I have tried upgrading our jgroups cluster from version 3.2.9 to 4.0.8. After upgrading I am facing two issues: 1. The entries that are added to JGROUPSPING table are not cleared even after the node leaves the cluster (killed using -9). Because of that the number of entries...

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    On 04/10/17 03:51, jp$ wrote: Hey Bela So the issue seems to have been resolved. It's under testing for now. I just wanted to clarify a few more nuances about our setup which others might find useful and I want to see if there's a better configuration than what I have currently 1. I did not mention this earlier but in my setup we have big physical machines on which we are running multiple JVMs each of which runs our application. Should that require a specific change to get better throughput? If you...

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    This means the cluster hasn't formed. On 21/09/17 03:02, jp$ wrote: The warnings which we get are 2017-09-20 20:03:09,127 WARN [Incoming-2,CNC-dr,node-608] [] () [org.jgroups.protocols.pbcast.NAKACK2] JGRP000011: node-608: dropped message 6640 from non-member node-216 (view=[node-317|11220] (86) [node-317, node-600, node-601, node-602, node-603, node-605, node-604, node-606, node-608, node-607, node-609, node-611, node-613, node-614, node-610, node-615, node-612, node-616, node-500, node-501 ...])...

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    On 21/09/17 02:33, jp$ wrote: Hello - We are using a commerce platform which internally uses jgroups for its clustering needs. We have had an ongoing issue where in nodes get out of the cluster, we have not been able to establish a cause for the same however the hypothesis which we have is that whenever the CPU utilization on the nodes spikes it causes nodes to step out of the cluster and they do not join back in. We are running JGroups 3.1 and considering a bunch of options 1. Upgrading to Jgroups...

  • Bela Ban Bela Ban posted a comment on discussion Help

    On 08/08/17 21:42, James Ahlborn wrote: Hard to say without more info, e.g. JGroups version, configuration etc. (Note that I won't support versions < 3.6 on a community basis) We are using a largely vanilla version of the udp.xml with some minor values tweaked. We using an older version, 3.4.0. The thing is that 3.4.0 is from 2013, and there have been many improvements since then, e.g. (only showing 3.4.x and 3.5.x versions, not even 3.6.x): - https://issues.jboss.org/browse/JGRP-1724 (3.4.1) - https://issues.jboss.org/browse/JGRP-1752...

  • Bela Ban Bela Ban posted a comment on discussion Help

    Hi James, On 07/08/17 23:15, James Ahlborn wrote: We have a cluster with 12 nodes running a udp stack. Lately, we've been running into stability issues with the cluster. Our workloads are increasing on the nodes, and we seem to be increasingly encountering scenarios where the nodes "suspect" other nodes and then try to change the view, which causes them to end up fighting sometimes between the old coordinator (still around) and the new. Additionally, when we rolling restart the cluster, we sometimes...

  • Bela Ban Bela Ban posted a comment on discussion javagroups-users

    I have no idea, googling for this does show a few people had the same issue but I didn't find any answers: [1] [1] https://github.com/netty/netty/issues/1258 On 17/07/17 23:12, Alireza wrote: java.io.IOException: Operation not permitted -- Bela Ban | http://www.jgroups.org

  • Bela Ban Bela Ban posted a comment on discussion Help

    On 18/05/17 16:55, Rich Coe wrote: I think this is happening when several processes are broadcasting messages and this process gets a storm of messages all at once. That's possible as it might deplete the thread pools. Try increasing the max thread pool size: thread_pool.max_threads="200". You can monitor the thread pool sizes via probe or JMX, and you can also see how many times messages were rejected due to full pools or new threads spawned. Spawning new threads is a second line of defense and...

  • Bela Ban Bela Ban posted a comment on discussion Help

    Looks like both regular and internal thread pools reject the message because they're full, so a new thread is spawned. This fails because the new thread cannot be created, because (I guess) you have too many threads started. What's your configuration (especially interested in the thread pool config)? On 18/05/17 14:25, Rich Coe wrote: I'm trying to diagnose an OutOfMemory error at startup logged by JGroups. I think this is a symptom of a larger issue, and I'm looking for hints of what causes it and...
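
    The reject-then-spawn behaviour described above can be imitated with a plain java.util.concurrent pool. The sketch below is not JGroups code; the class name and the tiny pool size are made up to show the mechanism: once every pool thread is busy and there is no queue, the rejection handler starts a brand-new thread for the task.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SpawnOnRejectDemo {
    final AtomicInteger spawned = new AtomicInteger(); // how often the fallback was used
    final ThreadPoolExecutor pool;

    SpawnOnRejectDemo(int maxThreads) {
        // No queue (SynchronousQueue): a task is rejected as soon as all threads are busy.
        pool = new ThreadPoolExecutor(0, maxThreads, 30, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                (task, executor) -> {
                    int n = spawned.incrementAndGet();
                    // Second line of defense: run the task on a new thread. If the OS cannot
                    // create another thread, start() fails with an OutOfMemoryError.
                    new Thread(task, "spawned-" + n).start();
                });
    }

    public static void main(String[] args) throws InterruptedException {
        SpawnOnRejectDemo demo = new SpawnOnRejectDemo(1);
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await(); // keep the thread busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        demo.pool.execute(blocking); // occupies the only pool thread
        demo.pool.execute(blocking); // rejected, handled by a spawned thread
        System.out.println("spawned threads: " + demo.spawned.get());
        release.countDown();
        demo.pool.shutdown();
    }
}
```

    A steadily growing spawn count is the signal that the pool's maximum is too small for the load.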

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    Yes, reincarnation was fixed in 2.8: https://issues.jboss.org/browse/JGRP-130

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    JGroups 3.6.x runs with JDK 7 or higher

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    Why don't you ask the makers of JRockit? I've never used it

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    You're probably running a version compiled for Java 8 with Java 7...
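
    To verify this, you can read the class-file header of one of the classes in question: after the 0xCAFEBABE magic number come the minor and major version, where major 51 means Java 7 and 52 means Java 8. A small sketch (the helper class is made up; pass it any class from the jar you want to check):

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ClassFileVersion {
    /** Returns the class-file major version of the given class (51 = Java 7, 52 = Java 8). */
    static int majorVersion(Class<?> type) throws IOException {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            if (data.readInt() != 0xCAFEBABE) {
                throw new IOException("not a class file: " + resource);
            }
            data.readUnsignedShort();        // minor version, not needed here
            return data.readUnsignedShort(); // major version
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("runtime: " + System.getProperty("java.version"));
        System.out.println("class-file major version of String: " + majorVersion(String.class));
    }
}
```

    A class with major version 52 will not load on a Java 7 runtime.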

  • Bela Ban Bela Ban posted a comment on discussion Open Discussion

    Backport SNIFF. It's half a page of code, and such a backport should be trivial....
