RE: [Quickfix-developers] Bug in Session::sendRaw.
From: Timothy Y. <ty...@pa...> - 2004-02-27 20:01:24
After further analysis of problems we were experiencing at one of our client sites, it is now clear exactly what was happening.

The client buy-side application lost the FIX connection (probably due to an abnormal termination of the client). After about 5 minutes, the client logged on again. Our sell-side application takes a long time to process the logon (about 20 seconds). During this time it sends any executions that occurred while the client was logged off. In the cases in point, one or two executions were sent. However, since these messages were sent before the logon process was completed, we encountered the sendRaw bug. This bug results in a hole in the message store: a message string with zero length, stored under a sequence number that has already been consumed.

When the client application received our logon reply, they noticed the missing executions and (correctly) asked the server to resend them. Since these were never successfully committed to the message store, the session was then broken irretrievably.

Clearly, there is a question mark over whether it should be legal in quickfix to send messages when logged out, or (as we did) during the logon process. I believe it should be possible, though with the sendRaw bug it does not work in the current quickfix release.

We have fixed the sendRaw bug, and the problem goes away. Business messages sent during the logon process are still suppressed, but they are now written to the message store, so the client can successfully ask for them to be resent after the logon has completed.
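The failure mode described above can be sketched with a toy model. This is not QuickFIX code: ToySession, its serialize helper and the serializeBeforeStore flag are hypothetical names invented for illustration, and the "wire format" is a made-up string. The sketch only assumes what the message states: a suppressed send still consumes a sequence number, and the store entry is empty unless the message is serialized before it is stored.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a FIX session (not the QuickFIX API).
class ToySession {
public:
    // serializeBeforeStore == false models the bug, true models the fix.
    explicit ToySession(bool serializeBeforeStore)
        : serializeBeforeStore_(serializeBeforeStore) {}

    void setLoggedOn(bool on) { loggedOn_ = on; }
    int nextSeq() const { return nextSeq_; }

    // Returns true only if the message actually went out on the wire.
    bool sendRaw(const std::string& body) {
        std::string wire;  // stays empty unless explicitly serialized
        bool sent = false;
        if (loggedOn_) {
            wire = serialize(nextSeq_, body);
            sent = true;   // pretend the socket write succeeded
        } else if (serializeBeforeStore_) {
            wire = serialize(nextSeq_, body);  // suppressed, but text is kept
        }
        store_[nextSeq_] = wire;  // buggy variant stores "" when logged off
        ++nextSeq_;               // the sequence number is consumed either way
        return sent;
    }

    // Models a resend lookup: an empty stored string cannot be replayed.
    bool resend(int seq, std::string& out) const {
        std::map<int, std::string>::const_iterator it = store_.find(seq);
        if (it == store_.end() || it->second.empty()) return false;
        out = it->second;
        return true;
    }

private:
    static std::string serialize(int seq, const std::string& body) {
        return "34=" + std::to_string(seq) + "|58=" + body + "|";
    }

    bool serializeBeforeStore_;
    bool loggedOn_ = false;
    int nextSeq_ = 1;
    std::map<int, std::string> store_;
};

// Usage: both variants suppress the send while logged off, but only the
// fixed variant can answer a later resend request for that message.
inline void demo() {
    ToySession buggy(false), fixed(true);
    std::string out;
    assert(!buggy.sendRaw("fill") && !buggy.resend(1, out));
    assert(!fixed.sendRaw("fill") && fixed.resend(1, out));
}
```

In both variants the outbound sequence number advances to 2 after the suppressed send, which is why the counterparty detects a gap at logon; the difference is only whether sequence number 1 can be replayed from the store.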
Here's our modified version of sendRaw:

--------
bool Session::sendRaw( Message& message, int num )
{ QF_STACK_PUSH(Session::sendRaw)

  Locker l( m_mutex );

  try
  {
    bool result = false;
    Header& header = message.getHeader();

    MsgType msgType;
    header.getField( msgType );

    fill( header );
    std::string messageString;

    if ( num )
      header.setField( MsgSeqNum( num ) );

    if ( Message::isAdminMsgType( msgType ) )
    {
      m_application.toAdmin( message, m_sessionID );

      // Logon (A), Logout (5), ResendRequest (2) and SequenceReset (4)
      // are sent even when the session is not logged on.
      if ( msgType == "A" || msgType == "5" || msgType == "2" || msgType == "4" || isLoggedOn() )
      {
        result = send( message.toString(messageString) );
      }
      else
      {
        // Serialize anyway so the store never receives an empty string.
        message.toString(messageString);
        m_state.onEvent("Suppressed send of administrative message: " + messageString);
      }
    }
    else
    {
      try
      {
        m_application.toApp( message, m_sessionID );
        if ( isLoggedOn() )
        {
          result = send( message.toString(messageString) );
        }
        else
        {
          // Serialize anyway so the message can be resent later.
          message.toString(messageString);
          m_state.onEvent("Suppressed send of application message: " + messageString );
        }
      }
      catch ( DoNotSend& ) { return false; }
    }

    if ( !num )
    {
      MsgSeqNum msgSeqNum;
      header.getField( msgSeqNum );
      m_state.set( msgSeqNum, messageString );
      m_state.incrNextSenderMsgSeqNum();
    }
    return result;
  }
  catch ( IOException& ) { return false; }

  QF_STACK_POP
}
--------

-----Original Message-----
From: Miller, Oren [mailto:OM...@ri...]
Sent: Friday, February 27, 2004 9:36 AM
To: Timothy Yates; qui...@li...
Cc: William Todd; Lenny Shleymovich
Subject: RE: [Quickfix-developers] Bug in Session::sendRaw.

I think what we have to determine is whether it is acceptable to send a logout if a valid logon hasn't been received. I'm not sure that it is. I think this is a good question to post at fixprotocol.org

--oren

-----Original Message-----
From: Timothy Yates [mailto:ty...@pa...]
Sent: Thu 2/26/2004 10:53 AM
To: 'qui...@li...'
Cc: William Todd; Lenny Shleymovich
Subject: [Quickfix-developers] Bug in Session::sendRaw.
There is a bug in Session::sendRaw which may cause a sequence number to be consumed without any message actually being sent. For example, session-level reject messages are not sent if the session is not logged in. In such cases, an empty message (empty string) is committed to the message store and the outbound sequence number is incremented.

This causes the session to be irretrievably broken. If the client asks for the missing message, an exception is thrown on attempting to parse the empty message from the message store. In the current quickfix release, this will cause the client's resend request to be ignored.

The attached acceptance test definition illustrates this problem by sending a logon message with a bad sending time. This seems like a slightly artificial test case. However, we have seen a large number of instances of this problem which were caused by something other than bad sending time. We are still trying to ascertain exactly which outbound messages were being suppressed.

I am not certain what the correct behaviour should be for this test case. Should the reject still be sent, or should it be suppressed? I think it should be suppressed, because the normal way to reject a logon is either with a logout or a disconnect. Ideally, the logout text should include the reason for the problem (i.e. sending time inaccuracy), otherwise it could be very difficult for the client to ascertain.

<<5000_STA.def>>

Tim Yates
Lead Developer
Patsystems (US) LLC
141 West Jackson Boulevard
Chicago 60604, USA
Tel +1 (312) 542-1336
www.patsystems.com

DISCLAIMER: This e-mail is confidential and may also be legally privileged. If you are not the intended recipient, use of the information contained in this e-mail (including disclosure, copying or distribution) is prohibited and may be unlawful. Please inform the sender and delete the message immediately from your system.
This e-mail is attributed to the sender and may not necessarily reflect the views of the Patsystems Group and no member of the Patsystems Group accepts any liability for any action taken in reliance on the contents of this e-mail (other than where it has a legal or regulatory obligation to do so) or for the consequences of any computer viruses which may have been transmitted by this e-mail. The Patsystems Group comprises Patsystems plc and its subsidiary group of companies.