From: Shen, G. <sh...@bn...> - 2012-02-07 19:02:17
Hi Matej,

Do we have any progress on this?

This problem is not directly related to masarService. In our case, it happens in any program that is wrapped by Python, for both the Python/masar and Python/gather clients.

In clientFactory.cpp under the pvAccessApp/ca directory, you have a Mutex defined globally, which means that when we call ClientFactory::stop(), we cannot guarantee the correct unlock order, or rule out the mutex being unlocked by another thread. This is exactly the case when we call ClientFactory::stop() from Python.

I am going to release masarService and need to clean this up before the release. Can you fix this problem soon? Since our first V4 service is about to be released to our users, we cannot give them a bad impression of its stability.

Guobao

On 1/6/12 8:42 AM, Marty Kraimer wrote:
> I am having a problem in masarService that you may be able to help solve.
>
> You will have to clone masarService (another SourceForge Mercurial repository).
>
> After cloning, read the Building and Testing sections in masarService/documentation/masarService.html
>
> Make sure the following works:
>
> In one window
>
> mrk> pwd
> /home/mrk/hg/masarService/cpp/bin/linux-x86
> mrk> ./masarServiceRun
>
> KEEP this running
>
> In another window
>
> mrk> pwd
> /home/mrk/hg/masarService/cpp/bin/linux-x86
> mrk> ./testezchannelRPC
>
> You should see the following:
> response
> structure NTTable
>     string function
>     structure timeStamp
>         long secondsPastEpoch 0
>         int nanoSeconds 0
>         int userTag 0
>     structure alarm
>         int severity 0
>         int status 0
>         string message
>     string[] label [position,alarms]
>     double[] position [1,2]
>     structure[] alarms
>         structure alarm
>             int severity 2
>             int status 7
>             string message test
>         structure alarm
>             int severity 2
>             int status 7
>             string message test
> channelRPC: totalConstruct 1 totalDestruct 1
> blockingTCPTransport: totalConstruct 1 totalDestruct 1
> channel: totalConstruct 1 totalDestruct 1
> blockingUDPTransport: totalConstruct 2 totalDestruct 2
> timerNode: totalConstruct 2 totalDestruct 2
> timer: totalConstruct 1 totalDestruct 1
> LinkedList: totalConstruct 1 totalDestruct 1
> LinkedListNode: totalConstruct 3 totalDestruct 3
> remoteClientContext: totalConstruct 1 totalDestruct 1
> event: totalConstruct 7 totalDestruct 7
> pvField: totalConstruct 29 totalDestruct 29
> field: totalConstruct 120 totalDestruct 120
>
>
> Now for the problem.
>
> In another window
>
> mrk> pwd
> /home/mrk/hg/masarService/python/test
> mrk> python testchannelRPC.py
>
>
> This works just fine but terminates with:
>
> terminate called after throwing an instance of 'epicsMutex::invalidMutex'
> what(): epicsMutex::invalidMutex()
> Aborted (core dumped)
>
>
> When I run it with gdb I get
>
> terminate called after throwing an instance of 'epicsMutex::invalidMutex'
> what(): epicsMutex::invalidMutex()
>
> Program received signal SIGABRT, Aborted.
> 0x00110416 in __kernel_vsyscall ()
> (gdb) bt
> #0 0x00110416 in __kernel_vsyscall ()
> #1 0x003bf2f1 in raise () from /lib/libc.so.6
> #2 0x003c0d5e in abort () from /lib/libc.so.6
> #3 0x05366c45 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
> #4 0x05364b16 in ?? () from /usr/lib/libstdc++.so.6
> #5 0x05364b53 in std::terminate() () from /usr/lib/libstdc++.so.6
> #6 0x05364cd2 in __cxa_throw () from /usr/lib/libstdc++.so.6
> #7 0x003365ec in epicsMutex::lock (this=0x27ae88) at ../../../src/libCom/osi/epicsMutex.cpp:238
> #8 0x001d56a9 in Lock () at /home/mrk/hg/pvDataCPP/include/pv/lock.h:26
> #9 epics::pvAccess::ClientFactory::stop () at ../../pvAccessApp//ca/clientFactory.cpp:41
> #10 0x00167e27 in epics::pvAccess::deleteStatic () at ../../../src/client/ezchannelRPC/ezchannelRPC.cpp:18
> #11 0x00331c91 in epicsExitCallAtExitsPvt () at ../../../src/libCom/misc/epicsExit.c:80
> #12 epicsExitCallAtExits () at ../../../src/libCom/misc/epicsExit.c:97
> #13 0x003c2cdf in exit () from /lib/libc.so.6
> #14 0x003aae3e in __libc_start_main () from /lib/libc.so.6
> #15 0x08048501 in _start ()
>
>
> The problem occurs in
>
> void ClientFactory::stop()
> {
>     Lock guard(m_mutex);
>
> This is really strange, since that same mutex must have been used in ClientFactory::start()
>
> Note that the Python code is implemented in
>
> masarService/python/src/client/channelRPC.py    This is the Python class that calls the extension module.
>
> masarService/cpp/src/python/channelRPCPy.cpp    This is a Python C++ extension, which in turn calls the next C++ class.
>
> masarService/cpp/src/client/ezchannelRPC    This is the C++ code for ezchannelRPC.
>
> Marty
>
>

--
Guobao Shen
Bldg. 902-B, 17 Cornell Avenue
National Synchrotron Light Source II
Brookhaven National Laboratory
Upton, New York 11973
Tel. : +1 (631) 344 7540
Fax. : +1 (631) 344 8085
http://www.bnl.gov/nsls2
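
For illustration only: one plausible reading of the backtrace above is that the globally defined Mutex has already been destroyed by the time the exit handler reaches ClientFactory::stop(). The sketch below is not pvAccessCPP code. It uses std::mutex instead of the EPICS Mutex, and every name in it (sketch::factoryMutex, sketch::start, sketch::stop, sketch::registerExitHook) is hypothetical. It shows the construct-on-first-use idiom, in which the mutex is allocated once and never destroyed, so a handler that runs late during exit() can still lock it.

```cpp
#include <cstdlib>
#include <mutex>

// Hypothetical stand-in for a client factory; NOT the real pvAccessCPP class.
namespace sketch {

// Construct-on-first-use: the mutex is heap-allocated on the first call and
// intentionally never deleted, so no static destructor can invalidate it
// before an exit handler tries to lock it.
inline std::mutex& factoryMutex() {
    static std::mutex* m = new std::mutex;  // leaked on purpose
    return *m;
}

// Plain bool: constant-initialized and trivially destructible, so it stays
// usable for the whole lifetime of the process.
static bool started = false;

// Returns true only if this call changed the state to "started".
bool start() {
    std::lock_guard<std::mutex> guard(factoryMutex());
    if (started) return false;
    started = true;
    return true;
}

// Idempotent: returns true only if this call changed the state to "stopped".
bool stop() {
    std::lock_guard<std::mutex> guard(factoryMutex());
    if (!started) return false;
    started = false;
    return true;
}

// Exit handler with the void() signature that std::atexit expects.
void stopAtExit() { stop(); }

// Registers the exit handler; std::atexit returns 0 on success.
int registerExitHook() { return std::atexit(stopAtExit); }

}  // namespace sketch
```

With this layout, calling sketch::registerExitHook() before or after sketch::start() should make no difference to whether the lock in stop() is valid at exit, because the mutex is never torn down. The cost is a small, deliberate leak of one mutex object per process.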