I am working on a JGroups project where we transfer a pretty large state when a node joins the group. It's about 1.25 GB of memory and takes >2 min. to stream. I'd like to configure JGroups so that it queues messages for the nodes involved in the state transfer but lets messages flow as normal to the other nodes.
For example, let's say we have the following nodes in the view:
A - Coordinator
B - Running node
C - Running node
D - Running node
E - Starting node
So when E joins the view and starts state transfer, I'd like A and E to stop "accepting" messages and queue them for when state transfer completes. B, C, & D should continue "accepting" messages and process them as normal.
Then when state transfer completes, A and E would process the queued messages that came in during state transfer and start "accepting" messages as normal.
Can JGroups be configured to "hold" and queue messages for nodes during an event like state transfer?
JGroups version: 3.6.3.Final
My protocol stack is below. Thanks for your input!
Seems my email wasn't received.. here it goes again...:
By default, JGroups does queue incoming messages on state transfer on the state provider (A) when using a subclass of StreamingStateTransfer (STATE, STATE_SOCK) and BARRIER. Not sure about the state requester (E), I'd have to check the code.
This is done by closing BARRIER. When closing, BARRIER waits until all incoming threads have returned and allows no new threads to enter, but queues their messages.
When the state has been transferred, all queued messages will be sent up by BARRIER.
Note that queueing messages on A means that A won't be able to do certain things, e.g. admitting new members (joins) and handling STABLE requests. If your state is large, perhaps pick someone else, not the coordinator, as state provider.
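To make the "pick someone else" idea concrete, here is a minimal sketch in plain Java. It assumes the usual JGroups convention that the first member of the view is the coordinator. StateProviderPicker and pickProvider are hypothetical names, not JGroups classes, and the members are plain strings here rather than Address objects. In a real application the result would presumably be passed as the target argument of JChannel.getState(Address target, long timeout).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper (not part of JGroups): choose a state provider other than the coordinator. */
final class StateProviderPicker {
    private StateProviderPicker() {}

    /**
     * @param members view members in view order; index 0 is assumed to be the coordinator
     * @param self    the joining member, which cannot provide state to itself
     * @return the first member that is neither the coordinator nor self, or empty if there is none
     */
    static <A> Optional<A> pickProvider(List<A> members, A self) {
        for (int i = 1; i < members.size(); i++) {   // start at 1 to skip the coordinator
            A candidate = members.get(i);
            if (!candidate.equals(self)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();                      // caller decides how to fall back
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("A", "B", "C", "D", "E");
        System.out.println(pickProvider(view, "E"));
    }
}
```

With the view from the question (A, B, C, D, E) and E joining, this picks B. If it returns empty, e.g. when only A and E are in the view, the caller has to fall back to the coordinator.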
You could also ship your state by means other than JChannel.getState(), e.g. on the application level: join your new member, but leave it non-operational until state has been transferred. State could be transferred by means of messages or RPCs, and - when complete - make the member operational.
Thanks for the quick reply!
So the BARRIER only applies to one node (the state provider) when it is closed? What about the requester? Regardless, it's not a "stop the world" barrier on all nodes, right?
Agreed that queuing up messages on the coordinator for a long time is not desirable.
Could you point me at a doc/example that shows how to join a member but leave it non-operational?
Many thanks.
Yes, BARRIER is only closed on the state provider.
No, this is not stop-the-world. If you wanted that, use FLUSH.
The non-operational member is something at the application level, e.g. a member doesn't process client requests etc. It's not something JGroups does.
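As a rough illustration of such an application-level switch, here is a minimal sketch in plain Java. OperationalGate is a hypothetical class, not something JGroups provides. Queueing the requests (rather than rejecting them) is an assumption taken from the original question. The idea is that the receiver hands every incoming request to the gate and calls markOperational() once the application has finished transferring its state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical application-level gate (not a JGroups class). */
final class OperationalGate<T> {
    private final Queue<T> pending = new ArrayDeque<>(); // unbounded; requests must be non-null
    private final Consumer<T> handler;
    private boolean operational = false;

    OperationalGate(Consumer<T> handler) {
        this.handler = handler;
    }

    /** Called for every incoming request; queues it until the member is operational. */
    synchronized void onRequest(T request) {
        if (operational) {
            handler.accept(request);
        } else {
            pending.add(request);
        }
    }

    /** Called once the application-level state transfer is complete. */
    synchronized void markOperational() {
        T queued;
        while ((queued = pending.poll()) != null) {
            handler.accept(queued);   // replay in arrival order before accepting new requests
        }
        operational = true;
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        List<String> processed = new ArrayList<>();
        OperationalGate<String> gate = new OperationalGate<>(processed::add);
        gate.onRequest("m1");          // queued, member is not operational yet
        gate.markOperational();        // replays m1
        gate.onRequest("m2");          // handled directly
        System.out.println(processed);
    }
}
```

The handler runs while the lock is held, which keeps the ordering simple but blocks callers during the replay. The queue is also unbounded, so for a transfer that takes minutes you would probably want a size limit or some back-pressure.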
Last edit: Steve Schick 2015-05-26
Sounds good. Thanks for the help!