Re: [Quickfix-developers] Stability problems
From: <OM...@th...> - 2002-10-15 02:22:08
I'm wondering if we can create an autoconf script to detect this. Due to the nature of the problem, it probably can't be 100% deterministic, but I imagine we can create a test that will make catching it extremely likely. If the problem is detected, at that point we can either define __STL_PTHREADS and display a warning, or stop the compilation altogether and recommend the options you outlined.

I think it is important to try to automate this as much as possible because everything *seems* ok, but really is not at all. And a warning in the documentation, no matter how prominent, is likely to be glossed over. Loic's broker thankfully had a test that exposed this for him, but a lot of brokers don't have such tests, so I think we should. It would also be a good idea to start adding load tests to the automated test suite.

--oren

Gene Gorokhovsky <musor102@yahoo.com>
10/14/2002 08:21 PM

To: OM...@th...
cc: Loic Guezennec <loi...@sw...>, qui...@li..., qui...@li...
Subject: Re: [Quickfix-developers] Stability problems

After some digging, it appears that libstdc++ snapshot 9 (libstdc++-2.90.8.tar.gz) is the last "official" gcc libstdc++ which works with 2.95.x. Judging from its documentation, strings are MT safe in that release on many platforms including Solaris. For gcc 2.95.3 it requires a patch (http://gcc.gnu.org/libstdc++/libstdc++-2.90.8-compat-gcc-2.95.3.diff).

I am not sure defining _PTHREADS out of the box is a great idea -- string performance (and QuickFIX depends heavily on strings) does suffer tremendously. This link http://gcc.gnu.org/ml/libstdc++/2001-05/msg00384.html has some interesting discussion, and also results which show that map-of-strings performance can decrease three-fold with _PTHREADS defined.
So the Solaris options could be:

1) Using gcc 3.2 (requires recompilation of all C++ libraries)
2) Staying with gcc 2.95.x and
   a) Using STLPort (4.0 and higher work with gcc 2.95.x)
   b) Upgrading libstdc++ to v3 release 9 with patch and building it with configure --enable-threads
   c) Defining _PTHREADS (least effort, performance penalty)

Gene

--- OM...@th... wrote:
>
> Gene,
>
> I've done some tests and it appears your analysis is dead on. Thanks for
> the lead. I am thinking of having autotools turn on _PTHREADS by default
> so that everything will work right out of the box, but recommend that
> people either upgrade their compiler or move to STLPort if using Solaris.
>
> --oren
>
> Gene Gorokhovsky <mus...@ya...>
> Sent by: qui...@li...urceforge.net
> 10/10/2002 08:20 PM
>
> To: Loic Guezennec <loi...@sw...>, qui...@li...
> cc:
> Subject: Re: [Quickfix-developers] Stability problems
>
> Although Quickfix code may be partly to blame, the stack that you have
> shown could also be caused by a subtle misconfiguration of gcc and STL
> (part of libstd). Some people have reported that, despite documentation
> saying that it matters only for Objective-C, gcc itself should be compiled
> with configure --enable-threads, and that the C++ compiler is affected by
> this setting.
> I cannot vouch for this, though, since I have always preferred Sun's own
> cc, and have had significantly fewer headaches with it (for some $$).
>
> Also, in gcc 2.95.x + STL, defining _PTHREADS turns on a more robust (and
> significantly slower) implementation of locking in the STL allocator
> (exactly where your stack shows the crash). This apparently has been fixed
> in 3.2, and the flag no longer has any effect on the code. Try defining
> this and rerunning your tests.
>
> Also, some older implementations of STL had reference-counted std::string,
> which made strings not thread-safe even for reading. This certainly has
> been fixed with the gcc 3.2 release, but I am not sure about older
> versions.
>
> Yet another option would be to switch to the STLPort implementation of
> STL. It has had thread-safe strings from the get-go.
>
> Gene
>
> --- Loic Guezennec <loi...@sw...> wrote:
> > I have implemented a buy-side with Quickfix which I hope to use in prod
> > soon.
> >
> > The platform is Solaris 8 sparc multi-processor.
> > The compiler is gcc 2.95.3.
> >
> > The application runs well when heartbeating and under light load.
> >
> > I have severe instability problems when I apply a load test of 50 orders
> > in one go. This happens systematically.
> >
> > I believe I am experiencing the problems described by Gene Gorokhovsky
> > with the threading issues. The results so far are segmentation faults,
> > bus errors and also perhaps a deadlock... The latter being hard for me
> > to troubleshoot as I am not an expert on threads.
> >
> > An alarming point for me is the following:
> > At times that the engine crashes, I can lose messages. This also seems
> > to go along with the message from Constantin about crash scenarios.
> >
> > Now my questions are:
> >
> > - Is quickfix known to be unstable on some platforms (e.g. Sun)?
> > - Is there a preferred platform / architecture to use it?
> >   (OS / single or multi-proc / threaded or non-threaded...)
> >   I have tried both threaded and non-threaded socket initiators
> >   with no luck.
> >
> > Any feedback on what to do would be great.
> >
> > An example from attaching gdb to the process:
> >
> > Reading symbols from /usr/lib/libpthread.so.1...done.
> > Reading symbols from /usr/lib/librt.so.1...done.
> > Reading symbols from /usr/local/lib/libxml2.so.2...done.
> > Reading symbols from /usr/lib/libz.so...done.
> > Reading symbols from /usr/lib/libsocket.so.1...done.
> > Reading symbols from /usr/lib/libnsl.so.1...done.
> > Reading symbols from /usr/local/lib/libstdc++.so.2.10.0...done.
> > Reading symbols from /usr/lib/libm.so.1...done.
> > Reading symbols from /usr/lib/libc.so.1...done.
> > Reading symbols from /usr/lib/libaio.so.1...done.
> > Reading symbols from /usr/lib/libdl.so.1...done.
> > Reading symbols from /usr/lib/libmp.so.2...done.
> > Reading symbols from /usr/platform/SUNW,Ultra-80/lib/libc_psr.so.1...done.
> > Reading symbols from /usr/lib/libthread.so.1...done.
> > sol-thread active.
> > Symbols already loaded for /usr/lib/libpthread.so.1
> > Symbols already loaded for /usr/lib/librt.so.1
> > Symbols already loaded for /usr/local/lib/libxml2.so.2
> > Symbols already loaded for /usr/lib/libz.so
> > Symbols already loaded for /usr/lib/libsocket.so.1
> > Symbols already loaded for /usr/lib/libnsl.so.1
> > Symbols already loaded for /usr/local/lib/libstdc++.so.2.10.0
> > Symbols already loaded for /usr/lib/libm.so.1
> > Symbols already loaded for /usr/lib/libc.so.1
> > Symbols already loaded for /usr/lib/libaio.so.1
> > Symbols already loaded for /usr/lib/libdl.so.1
> > Symbols already loaded for /usr/lib/libmp.so.2
> > Symbols already loaded for /usr/platform/SUNW,Ultra-80/lib/libc_psr.so.1
> > Symbols already loaded for /usr/lib/libthread.so.1
> > 0xff0194a0 in door_restart () from /usr/lib/libc.so.1
> > (gdb) continue
> > Continuing.
> > [New Thread 4 (LWP 5)]
> > [Switching to Thread 4 (LWP 5)]
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x142130 in __default_alloc_template<false, 0>::allocate (__n=32)
> >     at /usr/local/lib/gcc-lib/sparc-sun-solaris2.8/2.95.3/../../../../include/g++-3/stl_alloc.h:422
> > 422         *__my_free_list = __result -> _M_free_list_link;
> > (gdb) bt
> > #0  0x142130 in __default_alloc_template<false, 0>::allocate (__n=32)
> >     at
>
=== message truncated ===
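
For reference, here is a rough sketch of the kind of probe oren describes at the top of this thread: several threads building and tearing down small strings and maps of strings at once, which is the allocator path where the gdb trace above crashes. It is only an illustration under stated assumptions. It is written against a modern C++11 toolchain (std::thread does not exist in gcc 2.95, so a real configure check for that compiler would have to use pthread_create directly), and the names churn_strings and run_allocator_stress are made up here, not part of QuickFIX. As oren notes, such a test cannot be fully deterministic; on a broken toolchain the likely symptom is a crash rather than a nonzero return value.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: repeatedly builds a small map of strings, checks that
// every value reads back as written, then destroys the map. Returns the
// number of values that did not match.
std::size_t churn_strings(int worker_id, int iterations) {
    std::size_t mismatches = 0;
    for (int i = 0; i < iterations; ++i) {
        std::map<std::string, std::string> fields;
        const std::string prefix =
            std::to_string(worker_id) + "-" + std::to_string(i) + "-";
        for (int tag = 0; tag < 16; ++tag) {
            const std::string key = std::to_string(tag);
            fields[key] = prefix + key;
        }
        for (int tag = 0; tag < 16; ++tag) {
            const std::string key = std::to_string(tag);
            if (fields[key] != prefix + key) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}

// Hypothetical driver: runs churn_strings on several threads at once and
// returns the total number of mismatches seen across all of them.
std::size_t run_allocator_stress(int threads, int iterations) {
    std::vector<std::size_t> results(threads);  // one slot per thread
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&results, t, iterations] {
            results[t] = churn_strings(t, iterations);
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    std::size_t total = 0;
    for (std::size_t r : results) {
        total += r;
    }
    return total;
}
```

A configure-time check could call run_allocator_stress from main with a thread count above the number of CPUs and a large iteration count, and treat either a crash or a nonzero result as "STL is not thread-safe here", at which point the build could define _PTHREADS and warn, or stop and point to the options Gene listed.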