On 8/9/05, Joerg Thoennes <Joe...@ma...> wrote:
> Now I have the following issue and wonder whether this patch fixes it. Lo=
oking at the
> source code I am not absolutely sure:
>=20
> After a connection failure the client connects to the QF server, and they=
start resend
> processing, ie QF requests to resend a couple of message. Now the followi=
ng happens:
>=20
> client resends message N
> QF processes the queued message N
> client send SequenceReset to N+2 (skipping N+2)
> QF processes the queued message N+1, ignoring the SequenceReset
> client resends message N+3
> QF detects that it is missing messsage N+2
> QF send ResendRequest N+1 to INFINITY
>=20
> Now we get into a loop.
Ouch. This looks like a slight variation on another issue I raised
w/r/t reconnects and resend processing. I've never seen the infinite
loop you illustrate above but I am sure it could happen.
Here's the end of a thread from April where I raised this issue:
http://sourceforge.net/mailarchive/message.php?msg_id=3D11573843
Basically, the problem is that QF is queueing ADMIN-type messages,
which may be skipped over in the process of dealing with
ResendRequests.
Because QuickFIX and at least one other FIX engine can end up
"invalidating" one of these enqueued messages courtesy of a
SequenceResetGapFill, it is dangerous for QuickFIX to hold onto copies
of them as they were originally sent.
I am doubtful that many (any?) engines would GapFill over messages
other than ADMIN-type messages, but is there anything in the specs
that makes it invalid to do so? If not, the defensive approach is to
just never enqueue *anything*. Just let the counterparty resend the
message (or GapFill over it) as part of the normal ResendRequest
processing.
Making this change should be simple, but it will require a
corresponding change to the code in Session.cpp that throttles
ResendRequests. It makes use of the fact that a message came from the
Session's queue to know that it has reached the end of a
ResendRequest. So if messages are no longer enqueued there need to be
a different way to detect the end-of-resend-request processing
condition. This shouldn't be too hard to do.
> Your patch line
>=20
> if (!queued)
> nextQueued();
>=20
> forces QF to consider the next incoming message instead the next message =
from the queue.
>=20
> Therefore, I guess your patch also addresses this kind of behaviour. If y=
ou could make a
> *quick* check, this would be very kind of you. Otherwise, I will investig=
ate further.
I don't think the ResendRequest throttling will fix this. The "if"
you highlight it just to keep from stack overflow. If messages are
coming from the queue, they'll keep coming from the queue because
::next was being called by ::nextQueued (which passes the queued
argument =3D true). This just prevents a deep recursion of next ->
nextQueued -> next -> nextQueued -> next -> etc.
The correct fix for this is to either get rid of the internal queue in
the Session or at least not put ADMIN messages onto it. I think the
former approach is safer, and ends up simplifying Session as a result.
I wonder, how does QFJ handle gaps? I hope it doesn't have this same bug.
--=20
Caleb Epstein
caleb dot epstein at gmail dot com
|