We use JGroups with OSCache to send cache invalidation messages in a cluster of 10 hosts on a medium-scale website. Our setup has been working in production for close to 6 months now, but lately we've seen a new problem.
We've noticed that messages don't get sent (or maybe they are sent but not received) at all in one direction between 2 hosts. So A does not receive any messages from B, but B does receive messages from A. Apart from the lost messages from B --> A, every host receives messages from every other host in the cluster just fine.
On the face of it, the cluster itself looks healthy and every host contains a view with all 10 hosts in it.
Has anybody seen this issue before? If you need more information to understand the problem better, I would be happy to provide it. As the logs don't seem to offer more detailed information on where the failure might be, I would also appreciate any leads on how to go about debugging this issue.
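In case it helps frame the problem, here is a rough sketch of how we think about the symptom: if every host logged which peers it has actually received a message from, the broken direction would show up as a missing entry in a "who hears whom" table. The class and method names below (DeliveryMatrix, missingLinks) are made up for illustration, the input map is assumed to be collected by hand from each host's logs, and none of this touches the JGroups or OSCache APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of JGroups or OSCache.
class DeliveryMatrix {

    // heardFrom.get(receiver) is the set of senders whose messages the receiver logged.
    // Returns every directed pair "sender -> receiver" for which nothing was logged.
    static List<String> missingLinks(List<String> hosts, Map<String, Set<String>> heardFrom) {
        List<String> missing = new ArrayList<>();
        for (String receiver : hosts) {
            Set<String> senders = heardFrom.getOrDefault(receiver, Set.of());
            for (String sender : hosts) {
                if (!sender.equals(receiver) && !senders.contains(sender)) {
                    missing.add(sender + " -> " + receiver);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Three hosts stand in for our ten; A never hears B, everything else is fine.
        List<String> hosts = List.of("A", "B", "C");
        Map<String, Set<String>> heardFrom = Map.of(
                "A", Set.of("C"),
                "B", Set.of("A", "C"),
                "C", Set.of("A", "B"));
        System.out.println(missingLinks(hosts, heardFrom)); // prints [B -> A]
    }
}
```

In our case the table would have exactly one gap (B -> A) even though the view on every host lists all 10 members, which is what makes this hard to spot.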