Re: [Quickfix-developers] possible deadlocking/freeze with routing implementation ?
From: <or...@qu...> - 2008-05-07 16:05:26
SynchronizedApplication probably isn't the best thing to use in a scenario
like this. You should probably lock only the resources that need protecting
and release the locks when you no longer need them. You are probably
deadlocking between the synchronized application mutex and the session mutex
when sending an order. Unlocking the application mutex before making the send
call will probably resolve your problem.

--oren

> -------- Original Message --------
> Subject: [Quickfix-developers] possible deadlocking/freeze with routing
> implementation ?
> From: quickfixer <li...@ch...>
> Date: Tue, May 06, 2008 6:32 pm
> To: qui...@li...
>
> Hi,
>
> I've been using quickfix for about three months in a routing kind of
> situation (similar to this poster:
> http://www.nabble.com/ThreadedSocketAcceptor---Message-resend-tc17067577.html)
> where I have both an initiator and an acceptor. I am using
> ThreadedSocketAcceptor, ThreadedSocketInitiator, and SynchronizedApplication.
> There are n sessions coming in through the acceptor and 1 out via the
> initiator, and it is a comparatively low message throughput application.
>
> We experienced a situation today where quickfix tried to send a message to
> its outgoing connection while processing a message received from one of the
> incoming connections. Somewhere in Session::sendToTarget my quickfix server
> got "stuck": I know I called ::sendToTarget, but I never got to the part
> where the message was actually sent, as I did not see the offending message
> in the filestore (I am using the FileStoreFactory). I am inclined to believe
> this is a deadlocking situation, but I am not sure on what resource I am
> deadlocking or which two (or more!) threads are contending for it.
>
> During this period of time (it was about 5 minutes before someone noticed -
> as I said, "low volume"!), several clients timed out because they failed to
> receive heartbeat responses, and began the reconnect process. For each of
> these clients I see the incoming heartbeat in the individual message store
> for each client and then later the following in the global log:
>
> "Accepted connection from x.x.x.x on port yyyy"
>
> but there is no corresponding response from my server to either the
> heartbeat or the new connection, presumably because those messages are
> queued up waiting for the thread to finish processing the original
> sendToTarget which is "stuck". I cannot, however, figure out why the
> original sendToTarget is "stuck" nor, if it is deadlocked, what it is
> waiting on.
>
> The quickfix server was bounced, everybody recovered, and the offending
> message was resent, so everything was fine in the end. I was wondering if
> someone could help figure out what I am doing in the code that caused this
> issue. Any thoughts or pointers as to where to look for a possible issue
> would be _much_ appreciated.
>
> Regards,
> Liz
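
To make the suggestion in the reply concrete (hold the application-level lock only around shared state, and release it before the send call), here is a minimal C++ sketch. It does not use the QuickFIX headers: Router, sendToTargetStub, g_sessionMutex, and g_sent are hypothetical stand-ins, and the stub only models the fact, stated in the reply, that the send path takes a separate session mutex. Treat it as an illustration of the lock scoping, not as the engine's actual behavior.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-ins, NOT QuickFIX API. g_sessionMutex models the
// per-session mutex that the real send path is said to take.
std::mutex g_sessionMutex;
std::vector<std::string> g_sent;  // records what the stub "sent"

// Models a send call that acquires the session mutex internally.
void sendToTargetStub(const std::string& msg) {
    assert(!msg.empty());  // precondition: nothing to send otherwise
    std::lock_guard<std::mutex> sessionLock(g_sessionMutex);
    g_sent.push_back(msg);
}

class Router {
public:
    // Would be called from an acceptor thread for each inbound order.
    void onInboundOrder(const std::string& clOrdId,
                        const std::string& sourceSession) {
        std::string outbound;
        {
            // Hold the application lock only while touching shared state.
            std::lock_guard<std::mutex> appLock(m_appMutex);
            m_origin[clOrdId] = sourceSession;
            outbound = "NewOrderSingle:" + clOrdId;
        }  // application lock is released here, before the send

        // The session mutex is taken with no application lock held, so this
        // thread never holds both locks at once.
        sendToTargetStub(outbound);
    }

    // Looks up which inbound session an order came from ("" if unknown).
    std::string originOf(const std::string& clOrdId) {
        std::lock_guard<std::mutex> appLock(m_appMutex);
        auto it = m_origin.find(clOrdId);
        return it == m_origin.end() ? std::string() : it->second;
    }

private:
    std::mutex m_appMutex;                        // guards m_origin only
    std::map<std::string, std::string> m_origin;  // clOrdId -> source session
};
```

The point of the inner scope is ordering: a thread that already holds the session mutex and then calls back into the application can still take m_appMutex, because no thread ever waits on the session mutex while holding m_appMutex. With a wrapper that locks around every callback, that guarantee is lost.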