Re: [Quickfix-developers] Stability problems
From: <OM...@th...> - 2002-10-15 02:22:08
I'm wondering if we can create an autoconf script to detect this. Due to
the nature of the problem, it probably can't be 100% deterministic, but I
imagine we can create a test that will make catching it extremely likely.
If the problem is detected, at that point we can either define
__STL_PTHREADS and display a warning, or stop the compilation altogether
and recommend the options you outlined.
I think it is important to try to automate this as much as possible because
everything *seems* ok, but really is not at all. And a warning in the
documentation, no matter how prominent, is likely to be glossed over.
Loic's broker thankfully had a test that exposed this for him, but a lot of
brokers don't have such tests, so I think we should. It would also be a
good idea to start adding load tests to the automated test suite.
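
A minimal sketch of what such a check might look like, assuming POSIX
threads are available. Several threads repeatedly copy one shared
std::string and append to the copy, which exercises both the string copy
path and the allocator. The names (runStringStress, copyAndCheck, kThreads,
kIterations) and the sizes are made up for illustration and are not part of
QuickFIX. On an affected toolchain the more likely symptom is a crash than
a reported mismatch, and a clean run does not prove the library is safe.

```cpp
#include <pthread.h>
#include <string>

// Illustrative names and sizes; none of this is QuickFIX code.
static const int kThreads = 8;
static const int kIterations = 200000;
static const std::string kShared("8=FIX.4.2|35=D|49=SENDER|56=TARGET|");
static const std::string kSuffix("10=000|");

// Each worker copies the shared string, appends to the copy, and checks
// that the copy still starts with the shared text and has the right size.
extern "C" void* copyAndCheck(void* arg) {
    int* mismatches = static_cast<int*>(arg);
    for (int i = 0; i < kIterations; ++i) {
        std::string copy(kShared);  // shares a buffer on ref-counted strings
        copy += kSuffix;            // forces an allocation for the copy
        if (copy.size() != kShared.size() + kSuffix.size() ||
            copy.compare(0, kShared.size(), kShared) != 0) {
            ++*mismatches;
        }
    }
    return 0;
}

// Returns the total number of mismatches, or -1 if a thread could not start.
int runStringStress() {
    pthread_t threads[kThreads];
    int mismatches[kThreads];
    int started = 0;
    for (; started < kThreads; ++started) {
        mismatches[started] = 0;
        if (pthread_create(&threads[started], 0, copyAndCheck,
                           &mismatches[started]) != 0) {
            break;
        }
    }
    int total = 0;
    for (int t = 0; t < started; ++t) {
        pthread_join(threads[t], 0);
        total += mismatches[t];
    }
    return started == kThreads ? total : -1;
}
```

A configure-time run test (autoconf's AC_TRY_RUN, for example) could wrap
this in a main() that exits non-zero unless runStringStress() returns 0; a
crash would fail the check as well.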
--oren
From: Gene Gorokhovsky <musor102@yahoo.com>
To: OM...@th...
cc: Loic Guezennec <loi...@sw...>, qui...@li..., qui...@li...
Date: 10/14/2002 08:21 PM
Subject: Re: [Quickfix-developers] Stability problems
After some digging, it appears that libstdc++ snapshot
9 (libstdc++-2.90.8.tar.gz) is the last "official" gcc
libstdc++ which works with 2.95.x. Judging from its
documentation, strings are MT-safe in that release on
many platforms, including Solaris.
For gcc 2.95.3 it requires a patch
(http://gcc.gnu.org/libstdc++/libstdc++-2.90.8-compat-gcc-2.95.3.diff)
I am not sure defining _PTHREADS out of the box is a
great idea: string performance suffers tremendously,
and QuickFIX depends heavily on strings.
This link
http://gcc.gnu.org/ml/libstdc++/2001-05/msg00384.html
has some interesting discussion, and also results
which show that map of strings performance can
decrease three-fold with _PTHREADS defined.
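
For anyone who wants to measure the cost on their own box, here is a rough
sketch of a map-of-strings timing loop. It is not the benchmark from the
linked post; the function names and the key format are invented for
illustration. The idea would be to build it twice, once with and once
without -D_PTHREADS, and compare the timings.

```cpp
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <map>
#include <string>

// Inserts `count` string keys into a map, then looks each one up again.
// Returns how many lookups succeeded (should equal `count`).
std::size_t exerciseStringMap(int count) {
    std::map<std::string, std::string> fields;
    char key[32];
    for (int i = 0; i < count; ++i) {
        std::snprintf(key, sizeof key, "tag%d", i);
        fields[key] = "value";  // allocates key and value strings
    }
    std::size_t found = 0;
    for (int i = 0; i < count; ++i) {
        std::snprintf(key, sizeof key, "tag%d", i);
        if (fields.find(key) != fields.end()) {
            ++found;
        }
    }
    return found;
}

// Returns the CPU time in seconds spent in one exerciseStringMap() call.
double timeStringMap(int count) {
    std::clock_t start = std::clock();
    volatile std::size_t sink = exerciseStringMap(count);  // keep the result live
    (void)sink;
    return static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
}
```

Calling timeStringMap() with a few hundred thousand keys in each build
should be enough to see whether the slowdown is in the same ballpark as
the one reported there.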
So the options on Solaris could be:
1) Using gcc 3.2 (requires recompilation of all C++
   libraries)
2) Staying with gcc 2.95.x and
   a) Using STLPort (4.0 and higher work with gcc 2.95.x)
   b) Upgrading libstdc++ to v3 release 9 with the patch and
      building it with configure --enable-threads
   c) Defining _PTHREADS (least effort, performance
      penalty)
Gene
--- OM...@th... wrote:
>
> Gene,
>
> I've done some tests and it appears your analysis is
> dead on. Thanks for
> the lead. I am thinking of having autotools turn on
> _PTHREADS by default
> so that everything will work right out of the box,
> but recommend that
> people either upgrade their compiler or move to
> STLPort if using Solaris.
>
> --oren
>
>
>
>
> From: Gene Gorokhovsky <mus...@ya...>
> Sent by: qui...@li...
> Date: 10/10/2002 08:20 PM
> To: Loic Guezennec <loi...@sw...>, qui...@li...
> cc:
> Subject: Re: [Quickfix-developers] Stability problems
>
>
>
>
> Although QuickFIX code may be partly to blame, the
> stack that you have shown could also be caused by a
> subtle misconfiguration of gcc and the STL (part of
> libstdc++). Some people have reported that, despite
> documentation saying that it matters only for
> Objective-C, gcc itself should be built with
> configure --enable-threads, and that the C++ compiler
> is affected by this setting. I cannot vouch for this
> though, since I have always preferred Sun's own cc,
> and have had significantly fewer headaches with it
> (for some $$).
> Also, in gcc 2.95.x + STL, defining _PTHREADS turns on
> a more robust (and significantly slower) implementation
> of locking in the STL allocator (exactly where your
> stack shows the crash). This apparently has been fixed
> in 3.2, and the flag no longer has any effect on the
> code. Try defining it and rerunning your tests.
> Also, some older implementations of the STL had a
> reference-counted std::string, which made strings not
> thread-safe even for reading. This certainly has been
> fixed with the gcc 3.2 release, but I am not sure
> about older versions.
> Yet another option would be to switch to the STLPort
> implementation of the STL. It has had thread-safe
> strings from the get-go.
>
> Gene
> --- Loic Guezennec <loi...@sw...> wrote:
> > I have implemented a buy-side with Quickfix which I
> > hope to use in prod soon.
> >
> > The platform is Solaris 8 sparc multi-processor.
> > The compiler is gcc 2.95.3.
> >
> > The application runs well when heartbeating and
> > under light load.
> >
> > I have severe instability problems when I apply a
> > load test of 50 orders in one go. This happens
> > systematically.
> >
> > I believe I am experiencing the problems described
> > by Gene Gorokhovsky with the threading issues. The
> > results so far are segmentation faults, bus errors
> > and also perhaps a deadlock... The latter being hard
> > for me to troubleshoot as I am not an expert on
> > threads.
> >
> >
> > An alarming point for me is the following:
> > When the engine crashes, I can lose messages. This
> > also seems to go along with the message from
> > Constantin about crash scenarios.
> >
> > Now my questions are:
> >
> > - Is quickfix known to be unstable on some platforms
> >   (e.g. Sun)?
> > - Is there a preferred platform / architecture to
> >   use it on (OS / single or multi-proc / threaded or
> >   non-threaded...)?
> >   I have tried both threaded and non-threaded
> >   socket initiators with no luck.
> >
> > Any feedback on what to do would be great.
> >
> >
> > An example from attaching gdb to the process:
> >
> > Reading symbols from /usr/lib/libpthread.so.1...done.
> > Reading symbols from /usr/lib/librt.so.1...done.
> > Reading symbols from /usr/local/lib/libxml2.so.2...done.
> > Reading symbols from /usr/lib/libz.so...done.
> > Reading symbols from /usr/lib/libsocket.so.1...done.
> > Reading symbols from /usr/lib/libnsl.so.1...done.
> > Reading symbols from /usr/local/lib/libstdc++.so.2.10.0...done.
> > Reading symbols from /usr/lib/libm.so.1...done.
> > Reading symbols from /usr/lib/libc.so.1...done.
> > Reading symbols from /usr/lib/libaio.so.1...done.
> > Reading symbols from /usr/lib/libdl.so.1...done.
> > Reading symbols from /usr/lib/libmp.so.2...done.
> > Reading symbols from /usr/platform/SUNW,Ultra-80/lib/libc_psr.so.1...done.
> > Reading symbols from /usr/lib/libthread.so.1...done.
> > sol-thread active.
> > Symbols already loaded for /usr/lib/libpthread.so.1
> > Symbols already loaded for /usr/lib/librt.so.1
> > Symbols already loaded for /usr/local/lib/libxml2.so.2
> > Symbols already loaded for /usr/lib/libz.so
> > Symbols already loaded for /usr/lib/libsocket.so.1
> > Symbols already loaded for /usr/lib/libnsl.so.1
> > Symbols already loaded for /usr/local/lib/libstdc++.so.2.10.0
> > Symbols already loaded for /usr/lib/libm.so.1
> > Symbols already loaded for /usr/lib/libc.so.1
> > Symbols already loaded for /usr/lib/libaio.so.1
> > Symbols already loaded for /usr/lib/libdl.so.1
> > Symbols already loaded for /usr/lib/libmp.so.2
> > Symbols already loaded for /usr/platform/SUNW,Ultra-80/lib/libc_psr.so.1
> > Symbols already loaded for /usr/lib/libthread.so.1
> > 0xff0194a0 in door_restart () from /usr/lib/libc.so.1
> > (gdb) continue
> > Continuing.
> > [New Thread 4 (LWP 5)]
> > [Switching to Thread 4 (LWP 5)]
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x142130 in __default_alloc_template<false, 0>::allocate (__n=32)
> >     at /usr/local/lib/gcc-lib/sparc-sun-solaris2.8/2.95.3/../../../../include/g++-3/stl_alloc.h:422
> > 422 *__my_free_list = __result -> _M_free_list_link;
> > (gdb) bt
> > #0 0x142130 in __default_alloc_template<false, 0>::allocate (__n=32)
> >     at
=== message truncated ===