I have a problem using RMI as the communication method in a cluster.
RMI hangs when the network connection is lost. This makes it hard for us
to detect network problems and fail over to a different machine.
I understand that there is no simple way to make RMI fail quickly
without also limiting execution time. We need to allow long-running
operations and still detect disconnections and crashes quickly.
I could build a polling mechanism and try to interrupt() threads that
are stuck on a remote call that will never return. But we use RMI very
conservatively: we pass objects strictly by value and use a generic
interface so that our generic failover logic always works. In effect we
have a single interface with a single call that never returns an object
by reference. So I'm thinking maybe we can replace RMI altogether
instead of building an elaborate solution around it.
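
To make the polling idea concrete, here is a rough sketch of what I have in mind. It is plain java.util.concurrent, not RMI or JGroups code, and the names GuardedCall, PeerLostException, peerAlive and pollMillis are all made up for illustration. The remote call runs on a worker thread with no overall time limit, and the caller only gives up when a separate liveness check (for example a cheap ping to the peer) fails. Note that interrupt() does not necessarily unblock a thread stuck in a blocking socket read, so in the worst case the worker thread is simply abandoned.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical helper: guards a long-running call with a liveness check
// instead of a fixed timeout on the call itself.
final class GuardedCall {

    /** Thrown when the peer stops answering the liveness check mid-call. */
    static class PeerLostException extends Exception {
        PeerLostException(String message) { super(message); }
    }

    // Daemon threads, so an abandoned (still blocked) call cannot keep the JVM alive.
    private static final ExecutorService WORKERS = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "guarded-call");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs call on a worker thread and waits for it with no overall time limit.
     * Every pollMillis without a result, peerAlive is consulted; once it
     * reports false, the call is cancelled and PeerLostException is thrown.
     */
    static <T> T invoke(Callable<T> call, BooleanSupplier peerAlive, long pollMillis)
            throws Exception {
        Future<T> future = WORKERS.submit(call);
        while (true) {
            try {
                return future.get(pollMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException stillRunning) {
                if (!peerAlive.getAsBoolean()) {
                    // Best effort: interrupts the worker, but may not unblock socket I/O.
                    future.cancel(true);
                    throw new PeerLostException("peer stopped responding during call");
                }
            } catch (ExecutionException e) {
                // Rethrow the call's own failure rather than the wrapper.
                Throwable cause = e.getCause();
                if (cause instanceof Exception) {
                    throw (Exception) cause;
                }
                throw e;
            }
        }
    }
}
```

A caller would wrap the single generic remote call in GuardedCall.invoke(...) and treat PeerLostException as the trigger for failover. This is exactly the kind of extra machinery I would rather not maintain if a library already does it.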
I'm not familiar with JavaGroups but it looks like it may be simplest to
just retire RMI and replace it with RpcDispatcher.
JGroups claims to detect crashes in the cluster, and it seems like this
feature is just what we are missing in RMI. Does it guarantee that
RpcDispatcher will not hang when a crash is detected? Does JGroups have
known problems? Is it proven to be robust in enterprise/ASP
environments? Is it an evolving project? In short, how good a choice is
it as the communication layer for a robust distributed application,
instead of RMI?
Thanks in advance,