
Migration from JGroups 2.6.x to 3.4.x

Created: 2014-02-11
Updated: 2014-02-13
  • Justin Cranford

    Justin Cranford - 2014-02-11

    I just upgraded HA-JDBC from 2.0.15 to 3.0.0. It has a dependency on JGroups, so I am trying to migrate my JGroups config from 2.6.15 to 3.4.2.

    I am getting a port bind exception during JGroups startup. I am not sure how to resolve it, and I did not find anything useful in other discussions regarding my specific use case. My port configuration seems to be the same as what I used with JGroups 2.6.15, so I am not sure why 3.4.2 is giving me a different result.

    -------------------------------------------------------------------
    GMS: address=myhost-42639, cluster=mycluster.lock, physical address=10.20.0.131:7900
    -------------------------------------------------------------------
    java.sql.SQLException: No available port to bind to (start_port=7900)
        at net.sf.hajdbc.sql.SQLExceptionFactory.createException(SQLExceptionFactory.java:51)
        at net.sf.hajdbc.sql.SQLExceptionFactory.createException(SQLExceptionFactory.java:35)
        at net.sf.hajdbc.AbstractExceptionFactory.createException(AbstractExceptionFactory.java:62)
        at net.sf.hajdbc.util.concurrent.LifecycleRegistry.get(LifecycleRegistry.java:95)
        at net.sf.hajdbc.util.concurrent.LifecycleRegistry.get(LifecycleRegistry.java:34)
        at net.sf.hajdbc.sql.CommonDataSource.getProxy(CommonDataSource.java:85)
    Caused by: java.net.BindException: No available port to bind to (start_port=7900)
        at org.jgroups.blocks.ConnectionTableNIO.createServerSocket(ConnectionTableNIO.java:614)
        at org.jgroups.blocks.ConnectionTableNIO.start(ConnectionTableNIO.java:304)
        at org.jgroups.protocols.TCP_NIO.getConnectionTable(TCP_NIO.java:54)
        at org.jgroups.protocols.TCP_NIO.start(TCP_NIO.java:69)
        at org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:952)
        at org.jgroups.JChannel.startStack(JChannel.java:864)
        at org.jgroups.JChannel._preConnect(JChannel.java:527)
        at org.jgroups.JChannel.connect(JChannel.java:321)
        at org.jgroups.JChannel.connect(JChannel.java:297)
        at net.sf.hajdbc.distributed.jgroups.JGroupsCommandDispatcher.start(JGroupsCommandDispatcher.java:104)
        at net.sf.hajdbc.state.distributed.DistributedStateManager.start(DistributedStateManager.java:174)
        at net.sf.hajdbc.sql.DatabaseClusterImpl.start(DatabaseClusterImpl.java:684)
        at net.sf.hajdbc.util.concurrent.LifecycleRegistry.get(LifecycleRegistry.java:76)
        ... 30 more
    Caused by: java.net.BindException: Address already in use: bind
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:444)
        at sun.nio.ch.Net.bind(Net.java:436)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.jgroups.blocks.ConnectionTableNIO.createServerSocket(ConnectionTableNIO.java:608)
        ... 42 more
    


    Note that I am setting these system properties:

    java.net.preferIPv4Stack=true
    jgroups.bind_address=10.20.0.131
    jgroups.tcpping.initial_hosts=10.20.0.131[7900]
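    For reference, these would be passed as -D flags on the JVM command line (a sketch; the jar and main class below are placeholders, and the brackets are quoted so the shell does not glob them):

```shell
java -Djava.net.preferIPv4Stack=true \
     -Djgroups.bind_address=10.20.0.131 \
     -Djgroups.tcpping.initial_hosts='10.20.0.131[7900]' \
     -cp app.jar com.example.Main
```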
    


    Here is my JGroups 3.4.2 config. If you could help me identify what is wrong with it, that would be much appreciated.

    <config xmlns="urn:org:jgroups"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.4.xsd">
        <TCP_NIO
                bind_port="7900"
                port_range="0"
                loopback="true"
                enable_diagnostics="true"
                recv_buf_size="${tcp.recv_buf_size:5M}"
                send_buf_size="${tcp.send_buf_size:640K}"
                tcp_nodelay="true"
                linger="-1"
                use_send_queues="true"
                sock_conn_timeout="2000"
                peer_addr_read_timeout="2000"
                reader_threads="3"
                writer_threads="3"
                max_bundle_size="64000"
                max_bundle_timeout="3"
    
                timer_type="new3"
                timer.min_threads="2"
                timer.max_threads="8"
                timer.keep_alive_time="3000"
                timer.queue_max_size="500"
    
                thread_pool.enabled="true"
                thread_pool.min_threads="2"
                thread_pool.max_threads="8"
                thread_pool.keep_alive_time="5000"
                thread_pool.queue_enabled="false"
                thread_pool.queue_max_size="100"
                thread_pool.rejection_policy="discard"
    
                oob_thread_pool.enabled="true"
                oob_thread_pool.min_threads="2"
                oob_thread_pool.max_threads="8"
                oob_thread_pool.keep_alive_time="5000"
                oob_thread_pool.queue_enabled="false"
                oob_thread_pool.queue_max_size="100"
                oob_thread_pool.rejection_policy="discard"
        />
        <TCPPING timeout="2000" port_range="0" num_initial_members="1" initial_hosts="${jgroups.tcpping.initial_hosts:127.0.0.1[7900]}"/>
        <MERGE2 max_interval="100000" min_interval="20000"/>
        <FD_SOCK start_port="7901" port_range="0" keep_alive="false" num_tries="3" sock_conn_timeout="2000"/>
        <FD timeout="2000" max_tries="3"/>
        <VERIFY_SUSPECT timeout="2000"/>
        <pbcast.NAKACK2 use_mcast_xmit="false" discard_delivered_msgs="true"/>
        <UNICAST3/>
        <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="4M"/>
        <pbcast.GMS print_local_addr="true" join_timeout="60000" merge_timeout="60000" view_bundling="true" print_physical_addrs="true"/>
        <MFC max_credits="2M" min_threshold="0.4"/>
        <FRAG2 frag_size="60K"/>
        <pbcast.STATE_TRANSFER/>
    </config>
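    The ${name:default} tokens in the config (e.g. on initial_hosts) are expanded against JVM system properties, falling back to the text after the colon when the property is unset. JGroups performs this substitution internally; the sketch below only illustrates the semantics, and the class name and regex are mine, not JGroups code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropSubstitution {
    // Matches ${name} or ${name:default}, as used in JGroups config values.
    private static final Pattern TOKEN =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    // Replace each token with the system property of that name,
    // falling back to the default after the colon (or "" if none).
    public static String resolve(String value) {
        Matcher m = TOKEN.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String prop = System.getProperty(m.group(1));
            String def = m.group(2) == null ? "" : m.group(2);
            m.appendReplacement(out,
                    Matcher.quoteReplacement(prop != null ? prop : def));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // When jgroups.tcpping.initial_hosts is not set, the default wins.
        System.out.println(
                resolve("${jgroups.tcpping.initial_hosts:127.0.0.1[7900]}"));
        // With -Djgroups.tcpping.initial_hosts=10.20.0.131[7900] on the
        // command line, the property value would be used instead.
    }
}
```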
    
  • Justin Cranford

    Justin Cranford - 2014-02-11

    Here is my JGroups 2.6.15 configuration file for reference. Not sure if this helps:

    <protocol_stacks>
        <stack name="tcp-sync" description="TCP and TCP PING when IP multicasting not available (ex: routers discard multicast) or PING not available (ex: firewall).">
            <config>
                <TCP start_port="7900"
                     enable_diagnostics="false"
                     loopback="true"
                     discard_incompatible_packets="true"
                     recv_buf_size="20000000" send_buf_size="640000"
                     enable_bundling="true" max_bundle_size="64000" max_bundle_timeout="3"
                     use_incoming_packet_handler="true"
                     use_send_queues="false"
                     sock_conn_timeout="1000"
                     skip_suspected_members="true"
                     thread_pool.enabled="true" thread_pool.keep_alive_time="30000"
                     thread_pool.min_threads="5" thread_pool.max_threads="25"
                     thread_pool.queue_enabled="false" thread_pool.queue_max_size="100"
                     thread_pool.rejection_policy="run"
                     oob_thread_pool.enabled="true" oob_thread_pool.keep_alive_time="30000"
                     oob_thread_pool.min_threads="1" oob_thread_pool.max_threads="8"
                     oob_thread_pool.queue_enabled="false" oob_thread_pool.queue_max_size="100"
                     oob_thread_pool.rejection_policy="run"
                />
                <TCPPING timeout="4000" port_range="1" num_initial_members="2" initial_hosts="${jgroups.tcpping.initial_hosts:127.0.0.1[7900]}"/>
                <MERGE2 max_interval="100000" min_interval="20000"/>
                <FD_SOCK start_port="7901" keep_alive="false"/>
                <FD timeout="4000" max_tries="3" shun="false"/>
                <VERIFY_SUSPECT timeout="1500"/>
                <BARRIER/>
                <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0" retransmit_timeout="300,600,1200,2400" discard_delivered_msgs="false"/>
                <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="400000"/>
                <VIEW_SYNC avg_send_interval="60000"/>
                <pbcast.GMS print_local_addr="true" join_timeout="60000" merge_timeout="60000" shun="false" view_bundling="true"/>
                <pbcast.STREAMING_STATE_TRANSFER/>
                <pbcast.FLUSH timeout="0"/>
            </config>
        </stack>
    
        <stack name="udp-sync" description="UDP multicast and unicast, plus PING.">
            <config>
                <UDP
                     enable_diagnostics="false"
                     loopback="false"
                     discard_incompatible_packets="true"
                     ip_mcast="true" ip_ttl="${jgroups.udp.ip_ttl:5}" tos="4"
                     mcast_addr="${jgroups.udp.mcast_addr:229.10.10.10}"
                     mcast_port="${jgroups.udp.mcast_port:7600}"
                     ucast_recv_buf_size="20000000" ucast_send_buf_size="640000"
                     mcast_recv_buf_size="25000000" mcast_send_buf_size="640000"
                     enable_bundling="true" max_bundle_size="64000" max_bundle_timeout="3"
                     use_incoming_packet_handler="true"
                     use_concurrent_stack="true"
                     thread_pool.enabled="true" thread_pool.keep_alive_time="30000"
                     thread_pool.min_threads="5" thread_pool.max_threads="25"
                     thread_pool.queue_enabled="false" thread_pool.queue_max_size="100"
                     thread_pool.rejection_policy="Run"
                     oob_thread_pool.enabled="true" oob_thread_pool.keep_alive_time="30000"
                     oob_thread_pool.min_threads="1" oob_thread_pool.max_threads="8"
                     oob_thread_pool.queue_enabled="false" oob_thread_pool.queue_max_size="100"
                     oob_thread_pool.rejection_policy="Run"/>                    
                <PING timeout="4000" num_initial_members="2"/>
                <MERGE2 max_interval="100000" min_interval="20000"/>
                <FD_SOCK start_port="7901"/>
                <FD timeout="4000" max_tries="3" shun="false"/>
                <VERIFY_SUSPECT timeout="1500"/>
                <BARRIER/>
                <pbcast.NAKACK use_mcast_xmit="true" gc_lag="0" retransmit_timeout="300,600,1200,2400" discard_delivered_msgs="true"/>
                <UNICAST timeout="300,600,1200,2400,3600"/>
                <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="400000"/>
                <VIEW_SYNC avg_send_interval="60000"/>
                <pbcast.GMS print_local_addr="true" join_timeout="60000" merge_timeout="60000" shun="false" view_bundling="true"/>
                <FRAG2 frag_size="60000"/>
                <pbcast.STREAMING_STATE_TRANSFER/>
                <pbcast.FLUSH timeout="0"/>
            </config>
        </stack>
    </protocol_stacks>
    
  • Bela Ban

    Bela Ban - 2014-02-12

    You should use the JGroups config shipped with HA-JDBC and make changes to it, rather than trying to port an old and outdated config. A couple of comments:

    • TCP_NIO is not supported; you're on your own using it. I suggest TCP as an alternative, or UDP if you can do IP multicasting.
    • port_range="0" means that if port 7900 is taken, the process will not be able to start. This would produce the error you saw, e.g. when you're trying to start 2 instances on the same box.
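    Concretely, a non-zero port_range lets a second instance on the same host fall back to the next port instead of failing (a sketch using the TCP transport suggested above, with the poster's bind_port):

```xml
<!-- tries 7900 first, then 7901, then 7902 before giving up -->
<TCP bind_port="7900" port_range="2"/>
```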

  • Justin Cranford

    Justin Cranford - 2014-02-13

    The root cause is port_range="0" on FD_SOCK: that value works on TCP_NIO, but not on FD_SOCK.

    • TCP_NIO port_range="1" and FD_SOCK port_range="1": works
    • TCP_NIO port_range="0" and FD_SOCK port_range="1": works
    • TCP_NIO port_range="1" and FD_SOCK port_range="0": triggers the exception
    • TCP_NIO port_range="0" and FD_SOCK port_range="0": triggers the exception

    Using port_range="1" in FD_SOCK works, and it binds to 7901 as expected.
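    In other words, the only change needed from the 3.4.2 config posted above is the port_range value on the FD_SOCK line:

```xml
<FD_SOCK start_port="7901" port_range="1" keep_alive="false" num_tries="3" sock_conn_timeout="2000"/>
```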

    TCP 10.20.0.131:7900 0.0.0.0:0 LISTENING
    TCP 10.20.0.131:7901 0.0.0.0:0 LISTENING

    The protocol documentation (http://www.jgroups.org/manual/html/protlist.html) says port_range="0" is valid. I should get the same 7901 binding for FD_SOCK whether I use port_range="0" or port_range="1", correct?


    Thanks for the comment about TCP_NIO; I will switch to TCP. My comment about porting my configuration just means reusing the same ports and timeouts from my old config. I have to use TCP because multicast is not available on my network, and I have to keep these ports to avoid reconfiguring my existing firewall rules.
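    As a sketch, the transport swap keeps bind_port and port_range as they are; the attributes below are carried over from the TCP_NIO element above, while the NIO-specific ones (reader_threads, writer_threads) have no counterpart on plain TCP and would simply be dropped (an assumption; check the TCP protocol documentation for the remaining attributes):

```xml
<TCP bind_port="7900"
     port_range="0"
     sock_conn_timeout="2000"
     peer_addr_read_timeout="2000"
     tcp_nodelay="true"/>
```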

