From: Colin D. <co...@ma...> - 2018-11-01 12:37:24
We do something a little different to handle this problem: an in-memory cache of sequence numbers backed by a shared database, with the in-memory cache shared between machines by means of a cluster mechanism. One of the implementations is Hazelcast, but others would do just as well.

However, we don't worry too much about the seq nums being completely synced, because we write them asynchronously rather than synchronously. Our theory is that the failure case is, statistically, in the great minority, so we don't want to punish successful operation.

On node failure, i.e. when we lose the primary machine/process and the backup takes over, the backup replays a certain number of messages to make sure that nothing is lost. It does this by calculating the last message it can guarantee was received and completely processed, and resetting the seq num to that point before starting the session, which forces a seq num replay from the counterparty. For this to work, the business logic has to be able to test an incoming message to see whether it has already been processed, and skip it if it has.

On 10/31/18 4:23 PM, Robert Nicholson wrote:
> QuickFIX/J Documentation: http://www.quickfixj.org/documentation/
> QuickFIX/J Support: http://www.quickfixj.org/support/
>
> How do folks handle seqnum management in the context of secondary
> machines needing to know where the primary session finished when
> the secondary is activated?
>
> I've seen a couple of approaches, such as asynchronous writes to a
> shared database.
>
> But what you don't want is just a synchronous write to the database,
> as that would kill performance.
>
> Has anybody successfully done this by rsyncing seqnum files between
> servers?
>
> _______________________________________________
> Quickfixj-users mailing list
> Qui...@li...
> https://lists.sourceforge.net/lists/listinfo/quickfixj-users
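For what it's worth, the "test whether it's already been processed, and skip it" step can be sketched roughly like this. This is a minimal, hypothetical illustration, not our actual code: the class name, the `process` method, and the use of a business-level message ID (e.g. ClOrdID) as the dedup key are all assumptions, and in a real deployment the set of processed IDs would live in the shared cluster cache (Hazelcast or similar) rather than in a local JVM collection.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the idempotency check described above: before
// applying business logic to a replayed message, test whether it was
// already processed before the failover, and skip it if so.
public class IdempotentProcessor {
    // In production this would be a cluster-shared structure (e.g. a
    // Hazelcast set/map), not a local ConcurrentHashMap key set.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();
    private final AtomicInteger applied = new AtomicInteger();

    /** Returns true if the message was applied, false if skipped as a duplicate. */
    public boolean process(String businessMsgId) {
        // Set.add() is atomic here: only the first caller for a given id wins.
        if (!processed.add(businessMsgId)) {
            return false; // already handled before failover; skip on replay
        }
        applied.incrementAndGet(); // stand-in for the real business logic
        return true;
    }

    public int appliedCount() {
        return applied.get();
    }
}
```

The point is that correctness no longer depends on the cached seq nums being perfectly in sync at the moment of failure; the dedup check absorbs any overlap introduced by rewinding the sequence number.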