Re: [Quickfix-developers] crashes while under moderate message load
From: Caleb E. <cal...@gm...> - 2006-03-21 17:18:38
On 3/21/06, James Reed <jam...@gm...> wrote:
>
> We've experienced unexplained crashes where no core dumps are generated
> nor is there anything in the logs to indicate what the problem is. We have
> only noticed that the crashes tend to occur when a large number of clients
> attempt to connect at once or if a significant portion of the connected
> clients try to send messages in a short span of time.
>
> We are using QuickFIX 1.10.2, with g++ 3.2.3, on RHEL AS Release 3. Our
> application uses one Initiator and one Acceptor. The Acceptor is configured
> with 288 Sessions. Our application is also configured to use a StdOutLogger
> to minimize usage of file descriptors. The fdlimit for the user account
> running the process is 2048. I know there is not much to go on from the
> information presented, but does anyone have any ideas about what could
> possibly happening here?
>
> Thanks.
>
> We've run the application in a debugger and found the following upon
> crashing:
>
> Stack Traces
>
> Incident #1:
>
> #0 0x00d22b60 in pthread_detach () from /lib/tls/libpthread.so.0
> #1 0x00a4dc3b in FIX::thread_detach (thread=0) at Utility.cpp:344
> #2 0x00a07334 in FIX::ThreadedSocketAcceptor::removeThread
> (this=0x837d210, s=1216)
> at stl_map.h:221
> #3 0x00a07453 in FIX::ThreadedSocketAcceptor::socketThread (p=0x6ea61a10)
> at ThreadedSocketConnection.h:51
> #4 0x00d21dec in start_thread () from /lib/tls/libpthread.so.0
> #5 0x005a8a2a in clone () from /lib/tls/libc.so.6
>

Judging by this call stack and the second one, you've hit a bug that has
been fixed in 1.11.1. There are resources in the ThreadedSocketAcceptor and
T.S.Connector classes that were being modified without holding a mutex in
all versions of QuickFIX prior to 1.11.1. Try updating to the latest version
of QuickFIX and I believe this problem will disappear (though I really
dislike the use of pthread_detach...)

--
Caleb Epstein
caleb dot epstein at gmail dot com
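
For illustration only, here is a minimal C++11 sketch of the general idea described above: every access to a shared socket-to-thread map is made while holding a mutex. ThreadRegistry and run_workers are hypothetical names invented for this example, not QuickFIX classes, and this is not the actual change made in 1.11.1.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical registry (not QuickFIX code): maps a socket number to the
// id of the worker thread that services it.
class ThreadRegistry {
public:
    // Record which thread services `socket`.
    void add(int socket, std::thread::id id) {
        std::lock_guard<std::mutex> lock(mutex_);
        ids_[socket] = id;
    }

    // Remove the entry for `socket`. Returns false if it was not registered,
    // so the caller never acts on a default-constructed handle.
    bool remove(int socket) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::map<int, std::thread::id>::iterator it = ids_.find(socket);
        if (it == ids_.end()) {
            return false;
        }
        ids_.erase(it);
        return true;
    }

    // Number of registered sockets.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return ids_.size();
    }

private:
    mutable std::mutex mutex_;             // guards ids_
    std::map<int, std::thread::id> ids_;   // shared between all workers
};

// Starts `count` workers that each register and then unregister themselves
// concurrently, waits for them, and returns how many entries remain.
std::size_t run_workers(ThreadRegistry& registry, int count) {
    assert(count >= 0);
    std::vector<std::thread> workers;
    for (int s = 0; s < count; ++s) {
        workers.emplace_back([&registry, s] {
            registry.add(s, std::this_thread::get_id());
            registry.remove(s);
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    return registry.size();
}
```

In this sketch remove() uses find() rather than operator[], so a lookup for a socket that is missing returns false and does not insert an empty entry. The workers are joined and not detached, which keeps their lifetime explicit.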