Re: [Quickfix-developers] MsgSeqNum too high problem
From: Alexey Z. <ale...@in...> - 2005-06-29 13:24:52
Oren,
I didn't send the ResendRequest messages manually.
One of the Globex (CME) certification steps is not to send a resend request
for more than 2500 messages.
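
A minimal sketch of one way to respect such a limit, assuming the gap is split into several requests of at most 2500 messages each. The thread does not show how the limit was actually enforced; `splitResendRange` is a hypothetical helper and not part of QuickFIX.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper (not QuickFIX API): split the inclusive gap
// [beginSeqNo, endSeqNo] into ranges of at most maxPerRequest messages.
std::vector<std::pair<int, int>>
splitResendRange(int beginSeqNo, int endSeqNo, int maxPerRequest = 2500)
{
    std::vector<std::pair<int, int>> ranges;
    if (maxPerRequest <= 0 || beginSeqNo > endSeqNo)
        return ranges;  // nothing to request

    int from = beginSeqNo;
    while (true) {
        // Messages left after 'from', capped so each range holds
        // at most maxPerRequest sequence numbers.
        int remaining = endSeqNo - from;
        int to = from + std::min(remaining, maxPerRequest - 1);
        ranges.emplace_back(from, to);
        if (to == endSeqNo)
            break;
        from = to + 1;
    }
    return ranges;
}
```

Each returned pair would become the BeginSeqNo/EndSeqNo of one ResendRequest, with the next request sent only after the previous range has been filled.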
QF sets the endSeqNo field to 0 (or 999999) in generateResendRequest, and
the only way I found was
to comment out that code and set the field to 0 in my toAdmin callback.
I set it to 0 because the initiator was written for FIX4.2 and I didn't
expect to use it for other versions.
The problem occurred because I used the initiator for version FIX4.0 and
completely forgot about this patch.
Now I set the field to 999999 for versions <FIX4.2 and that solved the problem.
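
As a rough sketch of the version-dependent choice described here: FIX 4.0 and 4.1 use 999999 in EndSeqNo (tag 16) to mean "everything after BeginSeqNo", while FIX 4.2 and later use 0. The helper name `infiniteEndSeqNo` is hypothetical, and the BeginString values are assumed to be the wire strings such as "FIX.4.2".

```cpp
#include <string>

// Hypothetical helper (not QuickFIX API): the EndSeqNo value that means
// "all messages after BeginSeqNo" for a given BeginString.
int infiniteEndSeqNo(const std::string& beginString)
{
    // "FIX.4.0" < "FIX.4.1" < "FIX.4.2" under lexicographic comparison,
    // the same kind of string comparison the quoted Session code uses.
    return beginString >= "FIX.4.2" ? 0 : 999999;
}
```

A toAdmin callback that overrides EndSeqNo could call something like this instead of hard-coding 0, so the same initiator works for sessions older than FIX 4.2.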
One could say that the resend messages were corrupted, but QF's reaction
wasn't correct anyway.
I think the version checks need to be eliminated in the nextResendRequest
function:
    if ( endSeqNo == 0 ||
         endSeqNo == 999999 ||
         endSeqNo >= getExpectedSenderNum() )
    { endSeqNo = getExpectedSenderNum() - 1; }
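
To make the difference concrete, here is a small stand-alone sketch comparing the version-checked condition quoted further down the thread with the version-independent one proposed above. Both functions are hypothetical free-function stand-ins: in QuickFIX the logic lives inside Session::nextResendRequest and reads getExpectedSenderNum(), which is passed in here as a plain parameter.

```cpp
#include <string>

// Stand-in for the existing check: the sentinel is only recognised when
// it matches the session's FIX version.
int versionCheckedEndSeqNo(const std::string& beginString,
                           int endSeqNo, int expectedSenderNum)
{
    if ((beginString >= "FIX.4.2" && endSeqNo == 0) ||
        (beginString <= "FIX.4.2" && endSeqNo == 999999) ||
        endSeqNo >= expectedSenderNum)
        return expectedSenderNum - 1;
    return endSeqNo;
}

// Stand-in for the proposed check: either sentinel, or any value past
// the last sent message, is clamped to the last sent message.
int normalizeEndSeqNo(int endSeqNo, int expectedSenderNum)
{
    if (endSeqNo == 0 || endSeqNo == 999999 ||
        endSeqNo >= expectedSenderNum)
        return expectedSenderNum - 1;
    return endSeqNo;
}
```

With a FIX.4.0 session, EndSeqNo 0 and an expected sender number of 20, the first function leaves the end at 0 (so the requested range runs backwards and nothing is resent), while the second clamps it to 19.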
Regards,
Alexey Zubko
Infinium Capital Corporation
(416) 360-7000 ext. 305
Oren Miller wrote:
> Were you manually sending ResendRequest messages?
>
> --oren
>
> On Jun 28, 2005, at 4:15 PM, Alexey Zubko wrote:
>
>> Hi,
>>
>> I found the problem.
>> The acceptor didn't resend messages because my endSeqNo field is 0 but
>> my beginString is FIX4.0.
>> So, in the nextResendRequest function, QF tried to get messages from,
>> say, 15 to 0.
>> It's my mistake that I put 0 in my initiator instead of 999999, of
>> course,
>> but maybe we need to change the code anyway?
>>
>>
>> void Session::nextResendRequest( const Message& resendRequest )
>> ........
>>   std::string beginString = m_sessionID.getBeginString();
>>   if ( beginString >= FIX::BeginString_FIX42 && endSeqNo == 0 ||
>>        beginString <= FIX::BeginString_FIX42 && endSeqNo == 999999 ||
>>        endSeqNo >= getExpectedSenderNum() )
>>   { endSeqNo = getExpectedSenderNum() - 1; }
>>
>>   std::vector < std::string > messages;
>>   m_state.get( beginSeqNo, endSeqNo, messages );
>> ..........
>>
>>
>> Regards,
>> Alexey Zubko
>> Infinium Capital Corporation
>> (416) 360-7000 ext. 305
>>
>> Oren Miller wrote:
>>
>>> Any idea why the acceptor is not responding to the
>>> ResendRequest? Those acceptance tests are passing. Do you have
>>> the incoming/outgoing log files for both processes?
>>>
>>> --oren
>>>
>>> On Jun 28, 2005, at 2:49 PM, Alexey Zubko wrote:
>>>
>>>> Hello Brian,
>>>>
>>>> Your fix prevents the disconnect, but I still have a
>>>> sequencing problem.
>>>>
>>>> Actually, I think there are two problems:
>>>> 1. The acceptor sends nothing back.
>>>> 2. The initiator ignores incoming application messages.
>>>>
>>>> I hit the problem while debugging (terminating the initiator's
>>>> process), but it's easy to reproduce:
>>>> if you increase the NextSenderMsgSeqNum number in the acceptor's file,
>>>> then after a restart the initiator sends resend requests, the acceptor
>>>> sends nothing back and ignores all incoming application messages
>>>> (there is no fromApp() call).
>>>> Both acceptor and initiator are QF (CVS version + your patch).
>>>>
>>>>
>>>> Initiator:
>>>>
>>>> 20050628-19:38:55 : Created session
>>>> 20050628-19:38:55 : Connecting to eng-server on port 57258
>>>> 20050628-19:38:55 : Connection succeeded
>>>> 20050628-19:38:56 : Initiated logon request
>>>> 20050628-19:38:59 : Received logon response
>>>> 20050628-19:40:50 : Created session
>>>> 20050628-19:40:50 : Connecting to eng-server on port 57258
>>>> 20050628-19:40:50 : Connection succeeded
>>>> 20050628-19:40:51 : Initiated logon request
>>>> 20050628-19:40:51 : Received logon response
>>>> 20050628-19:40:51 : MsgSeqNum too high, expecting 5 but received 7
>>>> 20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 6
>>>> 20050628-19:40:51 : MsgSeqNum too high, expecting 5 but received 8
>>>> 20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 7
>>>> 20050628-19:41:06 : MsgSeqNum too high, expecting 5 but received 9
>>>> 20050628-19:41:06 : Sent ResendRequest FROM: 5 TO: 8
>>>> .........
>>>>
>>>> Acceptor:
>>>>
>>>> 20050628-19:38:48 : Created session
>>>> 20050628-19:38:56 : Received logon request
>>>> 20050628-19:38:56 : Responding to logon request
>>>> 20050628-19:39:41 : Socket Error
>>>> 20050628-19:39:41 : Disconnecting
>>>> 20050628-19:40:42 : Created session
>>>> 20050628-19:40:51 : Received logon request
>>>> 20050628-19:40:51 : Responding to logon request
>>>> 20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
>>>> 20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
>>>> 20050628-19:41:06 : Received ResendRequest FROM: 5 TO: 0
>>>> 20050628-19:41:21 : Received ResendRequest FROM: 5 TO: 0
>>>> 20050628-19:41:36 : Received ResendRequest FROM: 5 TO: 0
>>>> 20050628-19:41:51 : Socket Error
>>>> 20050628-19:41:51 : Disconnecting
>>>>
>>>>
>>>> Regards,
>>>> Alexey Zubko
>>>> Infinium Capital Corporation
>>>> (416) 360-7000 ext. 305
>>>>
>>>> Brian Erst wrote:
>>>>
>>>>> Oren - I believe the following change to Session.cpp will fix
>>>>> the timeout problem when receiving out-of-sequence messages
>>>>> while awaiting a sequence reset/gap-fill:
>>>>>
>>>>> Move the following lines in Session::verify(...)
>>>>>
>>>>>   UtcTimeStamp now;
>>>>>   m_state.lastReceivedTime ( now );
>>>>>
>>>>> from their current position up to a spot just before the
>>>>> preceding if/else clause:
>>>>>
>>>>>   -->>Place the code here<<--
>>>>>   if ( checkTooHigh && isTargetTooHigh ( msgSeqNum ) )
>>>>>
>>>>> This will increment the "lastReceivedTime" in the SessionState
>>>>> object even when the sequence number is wrong. This appears to
>>>>> solve a whole bunch of interrelated timeouts that could occur in
>>>>> this scenario (test request, heartbeat, logon response, etc.) in
>>>>> one quick hit.
>>>>>
>>>>> Sequence too low is largely unaffected by this change, as it will
>>>>> cause a disconnect when hit, so that part of the code shouldn't
>>>>> need to be rearranged.
>>>>>
>>>>> It was somewhat unclear to me whether the following line:
>>>>>
>>>>>   m_state.testRequest( 0 );
>>>>>
>>>>> also needed to be moved, but it seemed like it did not.
>>>>>
>>>>> - Brian Erst
>>>>> Thynk Software, Inc.
>>>>>
>>>>> --- Oren Miller <or...@qu...> wrote:
>>>>>
>>>>>> QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
>>>>>> QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
>>>>>> QuickFIX Support: http://www.quickfixengine.org/services.html
>>>>>>
>>>>>> Seems to me this shouldn't be happening. I'm guessing that the
>>>>>> engine isn't processing the message because the sequence number
>>>>>> is too high, hence no heartbeat is processed. I believe that
>>>>>> out-of-sequence messages should probably count as a keep-alive
>>>>>> even if their contents aren't processed.
>>>>>>
>>>>>> --oren
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: "Alexey Zubko" <ale...@in...>
>>>>>> To: <quickfix-dev...@li...>
>>>>>> Sent: Monday, June 27, 2005 1:44 PM
>>>>>> Subject: [Quickfix-developers] MsgSeqNum too high problem
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'd like to ask if QF initiator acts correctly in the following
>>>>>>> scenario.
>>>>>>>
>>>>>>> The initiator detects that seqnum of a server is too high and
>>>>>>> sends a resend message. The server doesn't resend requested
>>>>>>> messages, but there are heartbeat messages. The initiator
>>>>>>> disconnects because of timeout on heartbeat. :-)
>>>>>>>
>>>>>>> Below is the initiator's log.
>>>>>>>
>>>>>>> 20050627-17:52:50 : Created session
>>>>>>> 20050627-17:52:51 : Connecting to eng-server on port 5725
>>>>>>> 20050627-17:52:51 : Connection succeeded
>>>>>>> 20050627-17:52:52 : Initiated logon request
>>>>>>> 20050627-17:53:15 : Received logon response
>>>>>>> 20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received 15
>>>>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14
>>>>>>> 20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received 16
>>>>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15
>>>>>>> 20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received 17
>>>>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16
>>>>>>> 20050627-17:53:22 : Sent test request TEST
>>>>>>> 20050627-17:53:22 : MsgSeqNum too high, expecting 13 but received 18
>>>>>>> 20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17
>>>>>>> 20050627-17:53:37 : MsgSeqNum too high, expecting 13 but received 19
>>>>>>> 20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18
>>>>>>> 20050627-17:53:40 : Timed out waiting for heartbeat
>>>>>>> 20050627-17:53:40 : Disconnecting
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Alexey Zubko
>>>>>>> Infinium Capital Corporation
>>>>>>> (416) 360-7000 ext. 305
>>>>>>> -------------------------------------------------------
>>>>>>> SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
>>>>>>> from IBM. Find simple to follow Roadmaps, straightforward articles,
>>>>>>> informative Webcasts and more! Get everything you need to get up to
>>>>>>> speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
>>>>>>> _______________________________________________
>>>>>>> Quickfix-developers mailing list
>>>>>>> Quickfix-dev...@li...
>>>>>>> https://lists.sourceforge.net/lists/listinfo/quickfix-developers
>>>>>>
>>>>>
>>>
>>>
>
>