[Quickfix-developers] RFC: transaction, speed upgrades

Hello,

I am making some changes to QuickFIX and would like feedback on the 
plans.  I'm not sure how best to get the changes into the source tree 
but that is my goal since I think they'll be generally useful and I'd 
like to not patch future versions...  :-)  I'm hoping to get some 
consensus on the work to be done so that my chances of getting the 
changes incorporated are greater... and to benefit from the combined 
wisdom of the group.  I am relatively new to QuickFIX so please bear 
with me!  I would need a volunteer to get the Java wrappers done since 
we are C++ only...

First: I'd like to modify MySQLStore so that it uses the prepared 
statements available in MySQL 4.1.1+.  I'm hoping for a significant 
performance improvement.

Second: I'd like to provide the option to store the received messages 
in the 'messages' table in case the developer needs to look at those 
messages for some other reason.  This would be less error prone than 
remembering to manually persist received messages.  For reasons that 
will become clear soon, this would happen at dequeue from the 
ThreadedSocket* objects.

The third proposal needs some motivation:

In some applications the validity of sending a FIX message depends on 
the success of sending another FIX message.  It would therefore be nice 
to have an atomic unit of work that involves the processing and sending 
of an arbitrary number of messages.  One simple example: the developer 
wants to send an execution report AND a drop copy somewhere else.  If 
the execution report doesn't happen then the drop copy should not 
either (and vice versa).

Proposed solution (using MySQL 4.1.1+ with BDB or InnoDB tables):

a) Add begin() and commit() methods to MySQLStore (and perhaps no-op 
versions in MessageStore) that return handle(s) useful for performing 
other MySQL statements in the same transaction.  The generalized 
transaction looks like this:

	begin transaction
	receive message from queue, persisting changes to next incoming
	  seqnum (can't take place in socket reader thread-- does it now?)
	business logic resulting in message send(s) and potentially in
	  other persistence within the same transaction
	commit transaction

The problem here is that we cannot actually send messages until we get 
past commit-- otherwise we might advertise something that didn't 
actually happen!  More on this follows... Also, the user would call 
begin() and commit() manually if the enclosed send(s) and other 
persistence are not in response to a received message.
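
To make (a) a bit more concrete, here is a rough sketch of the shape it 
might take.  It is not a patch: StoreSketch, TransactionalStoreSketch and 
TransactionGuard are made-up names, the store is an in-memory stand-in 
rather than MySQL, the handle(s) returned by begin() are left out, and 
rollback() is an extra assumption that the proposal above does not mention.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for MessageStore: begin()/commit()/rollback() are
// no-ops by default, so existing stores would keep working unchanged.
class StoreSketch {
public:
    virtual ~StoreSketch() = default;
    virtual void begin() {}
    virtual void commit() {}
    virtual void rollback() {}
    virtual void set(int seqNum, const std::string& msg) = 0;
};

// In-memory stand-in for a transactional store such as MySQLStore.
class TransactionalStoreSketch : public StoreSketch {
public:
    void begin() override {
        assert(!m_inTxn);  // nested transactions are not modeled here
        m_inTxn = true;
    }
    void commit() override {
        assert(m_inTxn);
        for (auto& row : m_pending) m_committed.push_back(std::move(row));
        m_pending.clear();
        m_inTxn = false;
    }
    void rollback() override {
        m_pending.clear();  // nothing written inside the transaction survives
        m_inTxn = false;
    }
    void set(int seqNum, const std::string& msg) override {
        // Inside a transaction, writes are only staged until commit().
        auto& target = m_inTxn ? m_pending : m_committed;
        target.emplace_back(seqNum, msg);
    }
    std::size_t committedCount() const { return m_committed.size(); }

private:
    using Row = std::pair<int, std::string>;
    bool m_inTxn = false;
    std::vector<Row> m_pending;
    std::vector<Row> m_committed;
};

// RAII helper: rolls back unless commit() was called explicitly.
class TransactionGuard {
public:
    explicit TransactionGuard(StoreSketch& store) : m_store(store) {
        m_store.begin();
    }
    ~TransactionGuard() {
        if (!m_done) m_store.rollback();
    }
    TransactionGuard(const TransactionGuard&) = delete;
    TransactionGuard& operator=(const TransactionGuard&) = delete;

    void commit() {
        m_store.commit();
        m_done = true;
    }

private:
    StoreSketch& m_store;
    bool m_done = false;
};
```

With something like this, the execution report and the drop copy from the 
example above would both be stored inside one TransactionGuard scope, and 
neither would be persisted if the scope exits before commit().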

b) static Utility::socket_send() is re-implemented to place bytes on a 
send queue instead of actually sending on the socket-- and there is 
YAWT (yet another worker thread) draining this queue onto the socket.  
There are two reasons for this-- first, bandwidth utilization may 
increase if business logic can run concurrently (our in-house engine 
has this).  Second, this provides a mechanism for stalling the actual 
send until post-commit (more in the next point).  If the worker thread 
encounters a write error then the queue is marked bad/emptied and 
socket_send() returns the error encountered by the write instead of 
performing the next enqueue.  The enqueued messages are lost but will 
be resent at next session startup (gap fill).  BTW, I am aware that 
socket_send() is static-- the socket handle would be used with a global 
map to obtain the queue associated with the socket.  Hmmm... since there is 
really no reason to do this with the non-threaded incarnation of 
QuickFIX perhaps the right place for this is a re-implementation of 
ThreadedSocketConnection::send().
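
A minimal sketch of the send queue in (b), assuming standard C++ threads 
in place of whatever threading primitives QuickFIX would actually use.  
SendQueueSketch, its Writer callback (standing in for the real socket 
write) and waitIdle() are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

class SendQueueSketch {
public:
    // Writer stands in for the socket write; it returns false on error.
    using Writer = std::function<bool(const std::string&)>;

    explicit SendQueueSketch(Writer writer)
        : m_writer(std::move(writer)), m_worker([this] { drain(); }) {}

    ~SendQueueSketch() {
        assert(m_worker.joinable());
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stop = true;
        }
        m_cv.notify_all();
        m_worker.join();
    }

    // Enqueue bytes for the worker; once a write has failed this reports
    // the failure instead of performing the next enqueue.
    bool send(const std::string& bytes) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_bad) return false;
        m_queue.push_back(bytes);
        m_cv.notify_all();
        return true;
    }

    // Block until the queue is fully drained or has been marked bad.
    void waitIdle() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] {
            return m_bad || (m_queue.empty() && !m_writing);
        });
    }

private:
    void drain() {
        std::unique_lock<std::mutex> lock(m_mutex);
        for (;;) {
            m_cv.wait(lock, [this] { return m_stop || !m_queue.empty(); });
            if (m_stop) return;
            std::string bytes = std::move(m_queue.front());
            m_queue.pop_front();
            m_writing = true;
            lock.unlock();             // do the slow write outside the lock
            bool ok = m_writer(bytes);
            lock.lock();
            m_writing = false;
            if (!ok) {                 // write error: mark bad and empty
                m_bad = true;
                m_queue.clear();
            }
            m_cv.notify_all();
            if (m_bad) return;
        }
    }

    Writer m_writer;
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::deque<std::string> m_queue;
    bool m_stop = false;
    bool m_bad = false;
    bool m_writing = false;
    std::thread m_worker;  // declared last so it starts after the rest
};
```

The sketch keeps one queue per object rather than a global map keyed by 
socket handle, which is closer to the ThreadedSocketConnection::send() 
variant than to the static socket_send() one.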

c) When beginning the transaction, a barrier is placed on the write 
queues of each active session (whether connected or not).  The barrier 
means "don't send past this point, wait here".  All such barriers are 
removed just after the transaction is committed.  This delays the send 
of each message generated from within the transaction until after the 
message and seqnum increment have been persisted -- a guarantee that 
they are available for servicing future resend requests.
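
The barrier in (c) could be sketched roughly as follows.  This is a 
single-threaded illustration with invented names (BarrierQueueSketch, 
drain(), discardPastBarrier()); in the real thing drain() would be the 
worker thread's job, and discarding on a failed transaction is my 
assumption about what "not sent" would mean in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

class BarrierQueueSketch {
public:
    void enqueue(std::string bytes) { m_queue.push_back(std::move(bytes)); }

    // "Don't send past this point, wait here": messages already queued may
    // still go out, anything enqueued from now on is held back.
    void placeBarrier() {
        assert(!m_barrier);  // one barrier at a time in this sketch
        m_barrier = true;
        m_beforeBarrier = m_queue.size();
    }

    // Called just after the transaction is committed.
    void removeBarrier() { m_barrier = false; }

    // Called if the transaction fails: drop what was queued behind the
    // barrier, then lift the barrier.
    void discardPastBarrier() {
        assert(m_barrier);
        m_queue.resize(m_beforeBarrier);
        m_barrier = false;
    }

    // Move every sendable message onto `wire` (a stand-in for the socket)
    // and return how many were released.
    std::size_t drain(std::vector<std::string>& wire) {
        std::size_t n = m_barrier ? m_beforeBarrier : m_queue.size();
        for (std::size_t i = 0; i < n; ++i) {
            wire.push_back(std::move(m_queue.front()));
            m_queue.pop_front();
        }
        if (m_barrier) m_beforeBarrier = 0;
        return n;
    }

private:
    std::deque<std::string> m_queue;
    bool m_barrier = false;
    std::size_t m_beforeBarrier = 0;
};
```

The intended order of calls would be placeBarrier() on every session's 
queue at begin, enqueue() from the business logic, then removeBarrier() on 
every queue once the commit has succeeded.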

High-performance, reliable routing-type applications stand to benefit 
the most from these changes.  If a process/server dies in the middle of 
a transaction then the incoming message being acted on is effectively 
"un-received" and any messages produced are not sent.  Of course 
changes would be made in such a way as to not impact current QuickFIX 
users.

It would be nice if "someone" wanted to do all of this for BDBStore 
(imaginary Berkeley DB transactional implementation of MessageStore) in 
the cases where MySQL is not practical-- like a desktop application.

Comments? Concerns?

Thanks,
John