RE: [Quickfix-users] Session resync problem

Coalescing multiple overlapping resend requests into a single resend
should definitely help in Sean's case where there were only 125 messages
to resend.
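
To make the coalescing idea concrete, here is a minimal sketch of merging overlapping resend ranges before servicing them. It is not QuickFIX code: SeqRange and coalesce are hypothetical names, and it assumes every EndSeqNo has already been resolved to a concrete sequence number (no "0 means everything" ranges).

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical type: an inclusive [begin, end] range of sequence numbers.
using SeqRange = std::pair<int, int>;

// Merge overlapping or adjacent ranges so each message is resent at most once.
std::vector<SeqRange> coalesce(std::vector<SeqRange> ranges) {
    std::sort(ranges.begin(), ranges.end());
    std::vector<SeqRange> merged;
    for (const SeqRange& r : ranges) {
        assert(r.first <= r.second);  // assumption: ranges are well formed
        if (!merged.empty() && r.first <= merged.back().second + 1) {
            // Overlaps or touches the previous range: extend it.
            merged.back().second = std::max(merged.back().second, r.second);
        } else {
            merged.push_back(r);
        }
    }
    return merged;
}
```

With 125 queued requests that all end at the same sequence number, this collapses them into one range, so the messages go out once rather than once per request.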

However, note that with a single thread serving all FIX sessions there
is still the possibility of a massive resend request on one FIX session
starving the other FIX sessions. I have seen some cases in production in
which a catastrophic sequence number problem on one side of a FIX
connection results in a resend request for thousands of messages. If the
FIX engine on the other end servicing that massive resend request does
so in a tight loop, it will still take just one misbehaving
counterparty to affect the other FIX sessions. Obviously this kind of a problem
doesn't occur very often, and would also not be a serious problem for
installations with moderate to light volume of FIX messages.
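
One way to bound the damage without going to a thread per session is to service resends in small slices and rotate across sessions. The following is only a sketch of that idea under my own assumptions: PendingResend, serviceSlice and serviceAll are hypothetical names, not part of QuickFIX, and the send callback stands in for whatever actually writes the message.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical per-session state: sequence numbers still waiting to be resent.
struct PendingResend {
    std::deque<int> seqNums;
};

// Resend at most `budget` messages for one session, then return so the
// caller can move on to the next session. Returns how many were sent.
template <typename SendFn>
std::size_t serviceSlice(PendingResend& p, std::size_t budget, SendFn send) {
    assert(budget > 0);
    std::size_t sent = 0;
    while (sent < budget && !p.seqNums.empty()) {
        send(p.seqNums.front());
        p.seqNums.pop_front();
        ++sent;
    }
    return sent;
}

// One pass of a single-threaded loop over all sessions.
// Returns true if any session still has messages left to resend.
template <typename SendFn>
bool serviceAll(std::vector<PendingResend>& sessions, std::size_t budget,
                SendFn send) {
    bool workLeft = false;
    for (std::size_t i = 0; i < sessions.size(); ++i) {
        serviceSlice(sessions[i], budget, [&](int seq) { send(i, seq); });
        workLeft = workLeft || !sessions[i].seqNums.empty();
    }
    return workLeft;
}
```

A session with thousands of messages to replay then takes many passes to finish, but the other sessions get a turn on every pass.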

Regards,

- Ajay

-----Original Message-----
From: Oren Miller [mailto:or...@qu...]
Sent: Tuesday, March 07, 2006 9:05 PM
To: Sean Kirkpatrick
Cc: Ajay Kamdar; qui...@li...
Subject: Re: [Quickfix-users] Session resync problem

Although Ajay's analysis is correct, and under other circumstances
moving to a threaded model might be appropriate, it is actually a red
herring in this case.  I know you guys are running an older version of
the engine, and it is the resend logic in there where the fault really
lies.  Older versions of QuickFIX did not handle this sort of resend
scenario very gracefully.  The old implementation wasn't technically
incorrect, but it wasn't especially smart either.  Newer versions of the
engine can detect these sorts of recursive resend request scenarios and
avoid them, so you would only send 125 instead of 125! messages.

The relevant code (in newer versions) that protects against this
scenario is implemented with a resendRange in the session class.
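
As a rough illustration of the idea only (this is not the actual QuickFIX source, and GapTracker and its members are hypothetical names), a session can remember the range it has already asked for and decline to issue another ResendRequest while that request is outstanding. The sketch assumes the caller queues too-high messages and replays them once the gap is filled.

```cpp
#include <cassert>
#include <utility>

// Hypothetical receiver-side gap tracker; not QuickFIX's Session class.
class GapTracker {
public:
    explicit GapTracker(int nextExpected) : expected_(nextExpected) {
        assert(nextExpected >= 1);
    }

    // Called for each inbound sequence number. Returns true only when a new
    // ResendRequest should be sent; the range is then in requestedRange().
    bool onInbound(int seqNum) {
        if (seqNum < expected_) return false;  // too low: out of scope here
        if (seqNum == expected_) {
            ++expected_;
            // The outstanding request is satisfied once its end is reached.
            if (outstanding() && seqNum >= range_.second)
                range_ = std::make_pair(0, 0);
            return false;
        }
        // Too high. If a request already covers the gap, do not ask again.
        if (outstanding() && seqNum >= range_.first) return false;
        range_ = std::make_pair(expected_, seqNum - 1);
        return true;
    }

    bool outstanding() const { return range_.first != 0 || range_.second != 0; }
    std::pair<int, int> requestedRange() const { return range_; }
    int nextExpected() const { return expected_; }

private:
    int expected_;
    std::pair<int, int> range_{0, 0};  // (0, 0) means nothing outstanding
};
```

Fed 125 consecutive too-high messages, a tracker like this asks for the missing range once instead of once per message.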

--oren

Sean Kirkpatrick wrote:

> Thanks Ajay, I appreciate the response.
>
> We had considered that as an option, but I believe the ThreadedSocket
> classes spawn a thread per session.  Having hundreds of threads wasn't
> a desirable approach for us.  A thread pool would probably work, but I
> don't think that is implemented...perhaps I am mistaken?
>
> --Sean
>
>     -----Original Message-----
>     *From:* Ajay Kamdar [mailto:Aja...@tr...]
>     *Sent:* Tuesday, March 07, 2006 11:29 AM
>     *To:* Sean Kirkpatrick; qui...@li...
>     *Subject:* RE: [Quickfix-users] Session resync problem
>
>     You might want to consider using the
>     ThreadedSocketAcceptor/ThreadedSocketInitiator classes that will
>     place each Session on its own thread, which I expect should
>     prevent the other sessions getting starved while the engine is
>     busy servicing the resend requests in a tight loop. Obviously your
>     application would need to be thread safe to go this route. Caveat
>     emptor: Not having actually used these classes myself yet, this
>     suggestion is based upon theoretical analysis. YMMV.
>
>     - Ajay
>
>         -----Original Message-----
>         *From:* Sean Kirkpatrick
>         [mailto:sea...@pi...]
>         *Sent:* Tuesday, March 07, 2006 9:05 AM
>         *To:* qui...@li...
>         *Subject:* [Quickfix-users] Session resync problem
>
>         Hello All,
>
>         We had an issue in our production environment that boiled down
>         to the following:
>
>         1. Client has hard disconnect
>         2. We send some messages prior to detecting the session is
>         down
>         3. Client logs back in with higher than expected seq num and
>         immediately starts sending some messages
>         4. We send resend reqs for each message we receive until they
>         are handled, which the client does by
>             sending us seq reset messages.
>         5. The client heartbeats.
>         6. At this point, we do some message resending.
>         -- this is where the trouble began --
>         7. Since the client did not sync its seq nums after the logon,
>         when we start sending these messages they
>             have higher than expected seq nums.
>         8. Client sends a resend request for each of the messages we
>         sent (125).
>
>         When processing the resend requests, the engine sits in a
>         tight loop processing its queue.  The trouble
>         here is that the resend requests took approx. 5 minutes to get
>         through and all other connections were
>         starved.
>
>         Has anyone come across this problem, or have a suggestion for
>         dealing with it gracefully?
>
>         Regards,
>
>         Sean Kirkpatrick
>
