Thread: [Quickfix-developers] Patch: prevent overlapping resend requests, next/nextQueued stack overflow
Brought to you by:
orenmnero
From: Caleb E. <cal...@gm...> - 2005-02-28 16:46:51
|
Attached is a patch which addresses the two problems I reported on Friday of last week. It unrolls the recusion between Session::next and ::nextQueued by passing an extra parameter to the one-argument versions of Session::next that indicates whether the message is coming from nextQueued. This changes the public interface to Session (though the extra argument it defaults) so it may not be the cleanest solution to the problem. The multiple overlapping resend request issue is addressed more or less as I documented in the earlier email. SessionState keeps track of the details of any outstanding resend request and this data is reset when the session disconnects or a queued message is successfully processed. After these changes all of the acceptance tests pass but the unit test called 10_MsgSeqNumGreater fails for each FIX version. The input file for that test bears the comment "ERROR, Why are we sending this twice?". I think my patch fixes this bug in the engine, so the testcase needs to be adjusted :-) I'm attaching the results of the runat program. -- Caleb Epstein caleb dot epstein at gmail dot com |
[Quickfix-developers] Re: Patch: prevent overlapping resend requests, next/nextQueued stack overflow
From: Caleb E. <cal...@gm...> - 2005-02-28 17:03:52
Attachments:
quickfix-cvs-20050225.diff.txt
|
On Mon, 28 Feb 2005 11:46:47 -0500, Caleb Epstein <cal...@gm...> wrote: > Attached is a patch which addresses the two problems I reported on > Friday of last week. Here's another version of the patch which includes fixes for the broken test cases. Now all unit and acceptance tests pass. -- Caleb Epstein caleb dot epstein at gmail dot com |
Re: [Quickfix-developers] Patch: prevent overlapping resend requests, next/nextQueued stack overflow
From: Caleb E. <cal...@gm...> - 2005-08-09 18:07:59
|
On 8/9/05, Joerg Thoennes <Joe...@ma...> wrote: > Now I have the following issue and wonder whether this patch fixes it. Lo= oking at the > source code I am not absolutely sure: >=20 > After a connection failure the client connects to the QF server, and they= start resend > processing, ie QF requests to resend a couple of message. Now the followi= ng happens: >=20 > client resends message N > QF processes the queued message N > client send SequenceReset to N+2 (skipping N+2) > QF processes the queued message N+1, ignoring the SequenceReset > client resends message N+3 > QF detects that it is missing messsage N+2 > QF send ResendRequest N+1 to INFINITY >=20 > Now we get into a loop. Ouch. This looks like a slight variation on another issue I raised w/r/t reconnects and resend processing. I've never seen the infinite loop you illustrate above but I am sure it could happen. Here's the end of a thread from April where I raised this issue: http://sourceforge.net/mailarchive/message.php?msg_id=3D11573843 Basically, the problem is that QF is queueing ADMIN-type messages, which may be skipped over in the process of dealing with ResendRequests. Because QuickFIX and at least one other FIX engine can end up "invalidating" one of these enqueued messages courtesy of a SequenceResetGapFill, it is dangerous for QuickFIX to hold onto copies of them as they were originally sent. I am doubtful that many (any?) engines would GapFill over messages other than ADMIN-type messages, but is there anything in the specs that makes it invalid to do so? If not, the defensive approach is to just never enqueue *anything*. Just let the counterparty resend the message (or GapFill over it) as part of the normal ResendRequest processing. Making this change should be simple, but it will require a corresponding change to the code in Session.cpp that throttles ResendRequests. It makes use of the fact that a message came from the Session's queue to know that it has reached the end of a ResendRequest. So if messages are no longer enqueued there need to be a different way to detect the end-of-resend-request processing condition. This shouldn't be too hard to do. > Your patch line >=20 > if (!queued) > nextQueued(); >=20 > forces QF to consider the next incoming message instead the next message = from the queue. >=20 > Therefore, I guess your patch also addresses this kind of behaviour. If y= ou could make a > *quick* check, this would be very kind of you. Otherwise, I will investig= ate further. I don't think the ResendRequest throttling will fix this. The "if" you highlight it just to keep from stack overflow. If messages are coming from the queue, they'll keep coming from the queue because ::next was being called by ::nextQueued (which passes the queued argument =3D true). This just prevents a deep recursion of next -> nextQueued -> next -> nextQueued -> next -> etc. The correct fix for this is to either get rid of the internal queue in the Session or at least not put ADMIN messages onto it. I think the former approach is safer, and ends up simplifying Session as a result. I wonder, how does QFJ handle gaps? I hope it doesn't have this same bug. --=20 Caleb Epstein caleb dot epstein at gmail dot com |