Re: [Quickfix-developers] Incorrect handling of SequenceReset
From: Caleb E. <cal...@gm...> - 2006-04-29 12:10:10
On 4/28/06, Ajay Kamdar <Aja...@tr...> wrote:

> I am testing QuickFIX 1.11.0 to see if I can use it for a new project
> instead of our current production FIX engine. During my tests I saw
> incorrect behavior in QuickFIX's handling of a SequenceReset message. It
> looks like (a) QuickFIX failed to reset its next expected target number in
> response to a SequenceReset, and (b) subsequently refused to send another
> resend request forever because it had already sent a previous resend
> request. I saw this happen once, but unfortunately can't reproduce it
> consistently. Is this something that rings a bell with anyone?

Given the event log you included, the application clearly received and
processed the SequenceReset, and looking at the code, there's nothing else
that could happen but it changing the state's next expected sequence number.

What sort of Store are you using for this NullApplication? Perhaps it's not
behaving correctly w/r/t updating sequence numbers. Given that there are
extensive tests to exercise SequenceResets, it seems likely that there is
something in your application or configuration that's leading to this
behavior.

> There are a couple of things that might be problematic:
>
> 1) QuickFIX's handling of the SequenceReset is split across two methods. The
> Session::nextSequenceReset method that processes the incoming sequence calls
> m_state.setNextTargetMsgSeqNum(), but the
> m_state.resendRange() is reset in Session::verify only when the next message
> arrives. If the connection were to drop for whatever reason after the
> SequenceReset and between the next message's arrival, it seems to me that
> the m_state.m_resendRange would not reflect the correct state that the
> resend request had been satisfied. Wouldn't it be better to reset
> m_state.m_resendRange within Session::nextSequenceReset right where the next
> target seqnum is updated?

Look at the end of Session::disconnect, which is called whenever the Session
disconnects.
m_resendRange is reset to (0,0).

> 2) There is no timeout associated with how long QuickFIX waits for a resend
> request to be satisfied. If the counter party's resent messages or
> SequenceReset are queued up behind a bunch of other messages, or if
> QuickFIX does not correctly process the SequenceReset (as the log below
> shows), then the session can potentially loop for a very long time
> complaining about MsgSeqNum too high and keep queuing up new messages. FWIW,
> the Appia FIX engine (I used to manage its development team) in this type of
> a situation dropped the connection after a configurable interval waiting for
> the resend request to be satisfied. A dropped connection typically gets
> picked up quickly by monitoring systems, thereby alerting someone that there
> is a problem that may need intervention. IMHO QuickFIX should have something
> similar in Session.doTargetTooHigh to break out in case a resend request is
> not satisfied for a long time.

This is an interesting idea. Perhaps this could be made configurable in the
Session. In practice, the Session should eventually disconnect due to
heartbeat timeouts if the SequenceReset isn't processed properly. Did you
wait long enough for this to happen in your tests?
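To make the timeout idea a bit more concrete, here is a rough sketch of what
such a check might look like. This is purely illustrative and is not QuickFIX
code: the ResendWatchdog class, its method names, and the use of plain seconds
for time are all assumptions made up for this example. The only detail taken
from the discussion above is that a cleared resend range is represented as
(0,0).

```cpp
#include <cassert>
#include <utility>

// Hypothetical sketch, NOT part of QuickFIX: tracks one outstanding
// ResendRequest and reports when it has been pending for too long.
class ResendWatchdog
{
public:
  // timeoutSeconds <= 0 disables the check entirely.
  explicit ResendWatchdog( int timeoutSeconds )
  : m_timeout( timeoutSeconds ), m_range( 0, 0 ), m_sentAt( 0 ) {}

  // Record that a ResendRequest for [begin, end] went out at time `now`
  // (seconds). end == 0 means "everything after begin", as in the log.
  void requested( int begin, int end, long now )
  {
    assert( begin > 0 );
    m_range = std::make_pair( begin, end );
    m_sentAt = now;
  }

  // The gap was filled (resent messages or a SequenceReset arrived), or
  // the session disconnected: put the range back to (0,0).
  void clear() { m_range = std::make_pair( 0, 0 ); }

  // A range of (0,0) means no request is outstanding.
  bool pending() const { return m_range.first != 0; }

  // True when a request is outstanding and the timeout has elapsed,
  // i.e. the caller would drop the connection at this point.
  bool expired( long now ) const
  {
    return m_timeout > 0 && pending() && now - m_sentAt >= m_timeout;
  }

private:
  int m_timeout;
  std::pair<int, int> m_range;
  long m_sentAt;
};
```

A session could call requested() when it sends the ResendRequest, clear()
when the gap is filled or on disconnect, and poll expired() from whatever
timer it already uses for heartbeats. A timeout of 0 keeps today's behavior.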
> Here's the relevant excerpt from the FIX message and event log:
>
> 20060428-15:41:27 Incoming FIX :
> 8=FIX.4.4|9=56|35=0|34=5|49=CAPISIM|52=20060428-15:41:27.349|56=CAPIGW|10=166|
> 20060428-15:41:27 : MsgSeqNum too high, expecting 1 but received 5
> 20060428-15:41:27 Outgoing FIX :
> 8=FIX.4.4|9=65|35=2|34=6|49=CAPIGW|52=20060428-15:41:27.349|56=CAPISIM|7=1|16=0|10=036|
> 20060428-15:41:27 : Sent ResendRequest FROM: 1 TO: 0
> 20060428-15:41:27 Incoming FIX :
> 8=FIX.4.4|9=98|35=4|34=1|43=Y|49=CAPISIM|52=20060428-15:41:27.349|56=CAPIGW|122=20060428-15:41:27.349|36=6|123=Y|10=192|
> 20060428-15:41:27 : Received SequenceReset FROM: 1 TO: 6
> 20060428-15:41:57 Incoming FIX :
> 8=FIX.4.4|9=56|35=0|34=6|49=CAPISIM|52=20060428-15:41:57.349|56=CAPIGW|10=170|
> 20060428-15:41:57 : MsgSeqNum too high, expecting 1 but received 6

This makes me suspect something wrong with your Application's Store.
Clearly QF handled the SequenceReset above, so it should be expecting
SeqNum 6 here.

--
Caleb Epstein
caleb dot epstein at gmail dot com
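P.S. To spell out the sequence-number bookkeeping the log should have
produced, here is a small stand-alone model of the incoming side of a
session. It is a simplified sketch, not QuickFIX's implementation:
TargetSeqState and its members are hypothetical names, and it ignores the
MsgSeqNum of the SequenceReset itself, PossDup handling, and persistence to
a Store.

```cpp
#include <cassert>

// Hypothetical model of the next expected incoming sequence number.
// NOT QuickFIX code; a real engine would persist this value in its Store.
struct TargetSeqState
{
  int nextExpected;

  TargetSeqState() : nextExpected( 1 ) {}

  enum Result { ACCEPTED, TOO_HIGH, TOO_LOW };

  // Ordinary message: accepted only when it carries the expected
  // MsgSeqNum (tag 34); the expectation then advances by one.
  Result onMessage( int msgSeqNum )
  {
    if( msgSeqNum > nextExpected ) return TOO_HIGH;
    if( msgSeqNum < nextExpected ) return TOO_LOW;
    ++nextExpected;
    return ACCEPTED;
  }

  // SequenceReset: NewSeqNo (tag 36) may only move the expectation
  // forward. Returns true if the expectation actually changed.
  bool onSequenceReset( int newSeqNo )
  {
    assert( newSeqNo > 0 );
    if( newSeqNo <= nextExpected ) return false;
    nextExpected = newSeqNo;
    return true;
  }
};
```

Replaying the log against this model: onMessage(5) reports TOO_HIGH while
expecting 1, onSequenceReset(6) moves the expectation to 6, and the
following onMessage(6) is accepted. If the value written by the reset is
lost or not read back, the second "expecting 1" line in the log is what you
would see.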