Re: [Quickfix-developers] RE: Intermittent disconnect problem
From: Oren M. <or...@qu...> - 2005-02-01 18:31:24
Yeah, for the first scenario we will be implementing a global logger. Right now, as you know, all loggers are associated with a specific session, so if there is something of interest that cannot be associated with a session, there is no place to report it. The second scenario will be easy to implement, since we can place the duplicate logon attempt into the original session's log. Start/end time logging will also be easy to implement.

--oren

----- Original Message -----
From: "Yihu Fang" <Yih...@re...>
To: "Oren Miller" <or...@qu...>; "Bishop, Barry" <Bar...@gs...>; <qui...@li...>
Cc: "Caleb Epstein" <cal...@gm...>; "Perez, John" <jp...@Cr...>
Sent: Tuesday, February 01, 2005 11:50 AM
Subject: RE: [Quickfix-developers] RE: Intermittent disconnect problem

Hi,

I understand that this discussion is about the initiator getting disconnected. However, as a separate issue regarding the QuickFIX acceptor, the acceptor silently drops the connection without any error message in at least the following scenarios (see ThreadedSocketConnection.cpp::setSession()):

(1) The incoming message does not have a correct session header.
(2) The acceptor has already established a session, and a second connection from the same client tries to connect to the same port.

It also silently drops the connection if the incoming connection is outside the session window (startTime/endTime etc.).

It would be very helpful if QuickFIX could provide appropriate error messages for these cases too.

Thanks.
-Yihu

-----Original Message-----
From: qui...@li... [mailto:qui...@li...] On Behalf Of Oren Miller
Sent: Tuesday, February 01, 2005 12:25 PM
To: Bishop, Barry; qui...@li...
Cc: 'Caleb Epstein'; 'Perez, John'
Subject: Re: [Quickfix-developers] RE: Intermittent disconnect problem

QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html

Well, not necessarily. "Dropped Connection" right now just means that the connection was dropped outside of a logout sequence. If QuickFIX knows the reason for this, it will be preceded in the log by the reason, such as "Timed out waiting for heartbeat." The only time I can think of where QF would not know the exact reason for a disconnect is if the socket is either broken somehow or closed by the counterparty. You can see in the SocketInitiator and SocketAcceptor onDisconnect methods that Session::disconnect is being called. This is the only place where there is not an additional message that provides a disconnect reason.

What we can do is start logging the error codes of the socket calls to get a more detailed analysis of what is happening with the socket. For instance, calling close can set the global error code to one of the following:

EBADF          The s argument is not an active descriptor.
ECONNABORTED   The connection was aborted by the remote endpoint.
ECONNREFUSED   The remote endpoint refused to continue the connection.
ECONNRESET     The remote endpoint reset the connection request.
EDESTUNREACH   Remote destination is now unreachable.
EHOSTUNREACH   Remote host is now unreachable.
ENETDOWN       Local network interface is down.
ETIMEDOUT      The connection timed out.

I think that would get you the information you need to figure out the source of the disconnect.

--oren

----- Original Message -----
From: "Bishop, Barry" <Bar...@gs...>
To: "'Oren Miller'" <or...@qu...>; <qui...@li...>
Cc: "'Caleb Epstein'" <cal...@gm...>; "'Perez, John'" <jp...@Cr...>
Sent: Tuesday, February 01, 2005 10:30 AM
Subject: RE: [Quickfix-developers] RE: Intermittent disconnect problem

> Hello Oren,
>
> I'm afraid there is no consistency as to when this happens. Sometimes it
> doesn't happen for a week, whereas it could be 8 times a week anywhere
> from 7:00AM to 9:00 PM.
>
> The amount of traffic is very low, maybe 1 or 2 messages per second at
> most.
>
> The outage doesn't last long and it's not a very big deal, but it would be
> nice to get it fixed.
>
> Can you confirm that if quickfix logs the message 'Dropped Connection'
> then it was quickfix that disconnected? I believe this to be the case, but
> it seems that the code only tests to see if a logout has been sent.
>
> Should quickfix have logged another message if the above is true?
>
> In the meantime, I will take John Perez's advice and have a good look
> through the last received message just in case this is related. However,
> the disconnect often occurs many seconds after the last message is sent or
> received.
>
> Thanks again,
> barry
>
> -----Original Message-----
> From: Oren Miller [mailto:or...@qu...]
> Sent: Tuesday, February 01, 2005 4:12 PM
> To: Bishop, Barry; qui...@li...
> Cc: 'Caleb Epstein'
> Subject: Re: [Quickfix-developers] RE: Intermittent disconnect problem
>
> Barry,
>
> Is there anything common about the times in which these disconnects occur?
> Is this a high frequency line? Is it possible you are overloading the
> socket buffer?
>
> --oren
>
> ----- Original Message -----
> From: "Bishop, Barry" <Bar...@gs...>
> To: <qui...@li...>
> Cc: "Oren Miller" <or...@qu...>; "'Caleb Epstein'" <cal...@gm...>
> Sent: Tuesday, February 01, 2005 9:37 AM
> Subject: RE: [Quickfix-developers] RE: Intermittent disconnect problem
>
>> Hello all,
>>
>> This is a follow up on a problem that I was having last year:
>>
>> quickfix seemingly disconnects from its peer without indicating why.
>>
>> We've upgraded our system to quickfix 1.9.4 in the hope of getting
>> more useful messages, but we don't appear to. I've had a look through
>> the code and I can't see quite how this could happen. However it does.
>>
>> To reiterate, at some random time quickfix disconnects the TCP session
>> from its peer and logs this message in the event log:
>>
>> 20050201-13:33:27 : Dropped Connection
>>
>> This message indicates that quickfix initiated the disconnect, but it
>> does not say why. The inbound and outbound messages all look fine and
>> usually there has been a few seconds since the last message was sent
>> anyway.
>>
>> What happens next is the usual reconnect, logon and resend request.
>> Everything continues after this. Incidentally, since upgrading from
>> quickfix 1.4.0 to 1.9.4 this reconnect/resync is a whole order of
>> magnitude better behaved.
>>
>> However, the mysterious disconnect still occurs.
>>
>> Has anyone else seen anything like this?
>> Can anyone give me any suggestions as to how to track down the
>> problem?
>>
>> We are running quickfix 1.9.4 on solaris 5.8
>> quickfix was built with GCC 3.2.2
>> We connect using SocketInitiator
>>
>> Thanks in advance,
>> barry
>>
>> Here are some excerpts from our logs:
>>
>> EVENT LOG
>> =========
>> 20050201-13:33:27 : Dropped Connection
>> 20050201-13:33:29 : Connecting to XXX.XXX.XXX.XXX on port YYYY
>> 20050201-13:33:29 : Connection succeeded
>> 20050201-13:33:29 : Initiated logon request
>> 20050201-13:33:31 : Received logon response
>>
>> INCOMING
>> ========
>> The last message before disconnecting
>> 8=FIX.4.2|9=0183|35=R|115=2126|34=8349|49=CCCCCC|56=BBBBBB|52=20050201-13:32:49|122=20050201-13:33:23|116=10101010101010101|144=ZZZZZZZ|131=200502017287|146=1|55=BBBBBB|48=773670|22=108|38=100|10=146|
>>
>> The logon response
>> 8=FIX.4.2|9=0067|35=A|34=8351|49=CCCCCC|56=BBBBBB|52=20050201-13:32:54|98=0|108=30|10=004|
>>
>> OUTGOING
>> ========
>> The last message before disconnecting
>> 8=FIX.4.2|9=290|35=S|34=8232|49=BBBBBB|52=20050201-13:33:23.582|56=CCCCCC|128=2126|129=10101010101010101|145=ZZZZZZZ|22=108|48=773670|55=GSAMFFT|107=description|117=id|131=txn|132=1|133=2|134=50000|
>>
>> The logon after the disconnect
>> 135=50000|167=OPT|200=200101|201=1|202=1.1|205=20|206=L|231=0.01|10=181|
>> 8=FIX.4.2|9=71|35=A|34=8233|49=GSAMFFT|52=20050201-13:33:29.210|56=CATSOS|98=0|108=30|10=098|
>>
>> APPLICATION LOG
>> ===============
>> Tue Feb 1 13:33:23:528 GMT+00:00 2005|Received: quickfix.fix42.QuoteRequest
>> Tue Feb 1 13:33:23:528 GMT+00:00 2005|toApp, SessionID=FIX.4.2:BBBBBB->CCCCCC, Message=quickfix.fix42.Quote
>> Tue Feb 1 13:33:27:117 GMT+00:00 2005|onLogout, SessionID=FIX.4.2:BBBBBB->CCCCCC
>>
>> -----Original Message-----
>> From: Bishop, Barry
>> Sent: Tuesday, November 30, 2004 08:11 AM
>> To: or...@qu... [mailto:or...@qu...]
>> Cc: 'qui...@li...'
>> Subject: RE: [Quickfix-developers] Intermittent disconnect problem
>>
>> Hello Oren,
>>
>> Thanks for the reply.
>>
>> Sounds to me like I should try version 1.9.2 or later in our
>> production environment. I have been unable to reproduce the mysterious
>> disconnect in our QA system to the same client, but this is not
>> surprising as it is so infrequent. I have been simulating it by
>> breaking something else in the chain (which would appear as a client
>> disconnect) so this would explain the lack of an explanation from
>> quickfix.
>>
>> I will try this over the next few days and report back.
>>
>> Thanks again,
>> barry
>>
>> -----Original Message-----
>> From: or...@qu... [mailto:or...@qu...]
>> Sent: Monday, November 29, 2004 7:56 PM
>> To: Bishop, Barry
>> Cc: 'qui...@li...'
>> Subject: RE: [Quickfix-developers] Intermittent disconnect problem
>>
>> Barry,
>>
>> For every disconnect that QuickFIX initiates, there should be a reason
>> provided (not with 1.4.0, but with the new releases). With 1.9.4
>> (available now), QuickFIX also displays a "Dropped Connection" message
>> if the disconnect is initiated by the peer (1.9.2, does not
>> differentiate).
>> That should help you to verify if it is QuickFIX that
>> is initiating the disconnect. I don't think there are any more cases
>> where QuickFIX initiates a disconnect without providing a reason. If the
>> counterparty drops the connection, then unless they provide information
>> in the form of a reject or logoff text, there is little QuickFIX can do
>> to determine the cause. The best that we can probably do is report
>> whether the socket was dropped gracefully, and therefore intentionally,
>> or if it was an abnormal disconnect of some sort.
>>
>> Is there anything significantly different about this new client? Do
>> their logs reveal anything about the nature of the disconnect?
>>
>> --oren
>>
>>> 1) Anyone have any idea what's going on?
>>> 2) Is there a way to increase the amount of detail in log messages,
>>> especially those to do with disconnection events?
>>> 3) What sort of thing would cause quickfix to disconnect without
>>> saying why?
>>>
>>> Thanks in advance,
>>> barry
>>
>> _______________________________________________
>> Quickfix-developers mailing list
>> Qui...@li...
>> https://lists.sourceforge.net/lists/listinfo/quickfix-developers
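
The error-code logging suggested in the thread (record the global error code left by a failed socket call) could be sketched roughly as follows in C++ against plain POSIX calls. This is only an illustration, not QuickFIX code: the helper names describeSocketError and closeAndDescribe are hypothetical, and which errno values a given call can actually set is platform-specific, so the list quoted in the thread should be checked against the local man pages.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <unistd.h>

// Hypothetical helper (not part of QuickFIX): turns the errno left by a
// failed socket call into one line of text suitable for an event log.
// Note: std::strerror is not guaranteed to be thread-safe.
std::string describeSocketError(const char* call, int err)
{
  std::string text(call);
  text += " failed: errno=";
  text += std::to_string(err);
  text += " (";
  text += std::strerror(err);
  text += ")";
  return text;
}

// Hypothetical wrapper: closes a descriptor and reports why close() failed.
// Returns an empty string when close() succeeds.
std::string closeAndDescribe(int fd)
{
  if (::close(fd) == 0)
    return std::string();
  const int err = errno; // capture immediately, before another call overwrites it
  return describeSocketError("close", err);
}
```

A caller would write the returned string to the session's event log just before the "Dropped Connection" entry, and skip the extra line when the string is empty. The same pattern applies to recv and send, which are more likely than close to surface a reset or timeout from the peer.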