[Opalvoip-devel] SIP_PDU_Thread and PTimerList competition deadlock
From: fan z. <kui...@ya...> - 2008-04-06 12:59:10
Running Environment:
CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
OS: Red Hat Enterprise Linux Server release 5.1 (Tikanga)
Kernel: 2.6.18-53.el5 #1 SMP Wed Oct 10 16:34:02 EDT 2007 i686 i686 i386 GNU/Linux
SIP UAS: Grandstream BudgeTone 100

Our UAC is based on Opal 3.2.0 and ptlib 2.2.0. When the UAS (BudgeTone 100) is misconfigured, calls from the UAC fail. As our requirements dictate, the UAC retries, fails again, and so on, again and again. After dozens or hundreds of retries, the UAC deadlocks. In gdb, the backtraces of SIP_PDU_Thread and PHouseKeepingThread are as follows:

Backtrace1 (SIP_PDU_Thread):
#0 0x00ee5402 in __kernel_vsyscall ()
#1 0x00c08236 in pthread_cond_wait@@GLIBC_2.3.2 ()
#2 0x083dc2ff in PSyncPoint::Wait (this=0xb76feb78) at tlibthrd.cxx:1418
#3 0x083fdaa6 in PTimerList::QueueRequest (this=0x9db259c, action=PTimerList::RequestType::Stop, timer=0xb7411ff8, _isSync=true) at ../common/osutils.cxx:736
#4 0x083fdb48 in PTimer::Stop (this=0xb7411ff8, wait=true) at ../common/osutils.cxx:612
#5 0x0814d4f9 in SIPTransaction::OnReceivedResponse (this=0xb7411cc0, response=@0x9df07a8) at /home/van/opal/src/sip/sippdu.cxx:2030
#6 0x081580a6 in SIPInvite::OnReceivedResponse (this=0xb7411cc0, response=@0x9df07a8) at /home/van/opal/src/sip/sippdu.cxx:2253
#7 0x0814448f in SIPConnection::OnReceivedPDU (this=0xb7417ba8, pdu=@0x9df07a8) at /home/van/opal/src/sip/sipcon.cxx:1332
#8 0x0813533c in SIPEndPoint::SIP_PDU_Thread::Main (this=0x9df0038) at /home/van/opal/src/sip/sipep.cxx:1130

Backtrace2 (PHouseKeepingThread):
#0 0x00ee5402 in __kernel_vsyscall ()
#1 0x00c0a23e in sem_wait@GLIBC_2.0 ()
#2 0x083dcd25 in PSemaphore::Wait (this=0xb741b920) at tlibthrd.cxx:1041
#3 0x083f9eec in PReadWriteMutex::StartWrite (this=0xb741b86c) at ../common/osutils.cxx:2081
#4 0x08403b34 in PSafeObject::LockReadWrite (this=0xb7417ba8) at ../common/safecoll.cxx:122
#5 0x08403bf7 in PSafeLockReadWrite (this=0xb7f5a09c, object=@0xb7417ba8) at ../common/safecoll.cxx:194
#6 0x0807730e in OpalConnection::Release (this=0xb7417ba8, reason=OpalConnection::EndedByConnectFail) at /home/van/opal/src/opal/connection.cxx:372
#7 0x08144242 in SIPConnection::OnTransactionFailed (this=0xb7417ba8, transaction=@0xb7411cc0) at /home/van/opal/src/sip/sipcon.cxx:1282
#8 0x0814d4bb in SIPTransaction::SetTerminated (this=0xb7411cc0, newState=SIPTransaction::Terminated_Timeout) at /home/van/opal/src/sip/sippdu.cxx:2191
#9 0x0814ca8b in SIPTransaction::OnTimeout (this=0xb7411cc0) at /home/van/opal/src/sip/sippdu.cxx:2140
#10 0x0815c6c5 in SIPTransaction::OnTimeout_PNotifier::Call (this=0xb7417530, note=@0xb7411ff8, extra=0) at /home/van/opal/include/sip/sippdu.h:718
#11 0x0806e3ec in PNotifier::operator() (this=0xb7412004, notifier=@0xb7411ff8, extra=0) at /home/van/ptlib/include/ptlib/notifier.h:99
#12 0x083fe419 in PTimer::OnTimeout (this=0xb7411ff8) at ../common/osutils.cxx:643
#13 0x083fde0f in PTimer::Process (this=0xb7411ff8, delta=@0xb7f5a29c, minTimeLeft=@0xb7f5a290) at ../common/osutils.cxx:677
#14 0x083fe238 in PTimerList::Process (this=0x9db259c) at ../common/osutils.cxx:790
#15 0x083e0133 in PHouseKeepingThread::Main (this=0x9dbd0f8) at tlibthrd.cxx:123

Both threads try to lock the same SIPConnection object (this=0xb7417ba8). The SIP_PDU_Thread acquired the lock at the beginning of SIPConnection::OnReceivedPDU() and is now waiting in PTimer::Stop() (sippdu.cxx:2030) for a signal from PTimerList. But PTimerList::Process(), driven by PHouseKeepingThread, is in the middle of running the timeout notifier and is trying to lock the same SIPConnection in OpalConnection::Release(). Neither thread can proceed.

My current workaround is to remove the statement "PSafeLockReadWrite safeLock(*this);" in OpalConnection::Release(). It works, but it is not a proper fix.
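
The cycle described above can be reproduced in miniature with standard C++ threads. This is only a sketch under stated assumptions, not PTLib or Opal code: connectionLock, timeoutCallback and callbackGotLock are hypothetical names, a std::timed_mutex stands in for the SIPConnection lock, and a bounded try-lock replaces the real blocking lock so that the demonstration terminates instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>

// Hypothetical stand-in for the lock on the shared SIPConnection object.
static std::timed_mutex connectionLock;

// Models the timer thread's path (OnTimeout -> ... -> Release): it needs
// the connection lock. The 200 ms bound is an assumption of this sketch;
// the real code blocks without a timeout.
static bool timeoutCallback() {
    if (!connectionLock.try_lock_for(std::chrono::milliseconds(200)))
        return false;  // in the real code this thread would block here
    connectionLock.unlock();
    return true;
}

// Models the PDU thread: it takes the connection lock, then waits for the
// timer thread to finish its callback (a synchronous stop).
// Returns true if the callback managed to obtain the connection lock.
bool callbackGotLock(bool waitWhileHoldingLock) {
    std::unique_lock<std::timed_mutex> held(connectionLock);
    assert(held.owns_lock());  // the PDU thread owns the connection here

    // The timer thread starts its callback while the lock is held.
    std::future<bool> timerThread =
        std::async(std::launch::async, timeoutCallback);

    if (!waitWhileHoldingLock)
        held.unlock();         // give up the lock before waiting

    // Synchronous wait for the callback to complete.
    return timerThread.get();
}
```

Called from a main function, callbackGotLock(true) returns false: the waiting thread holds the lock that the callback needs, which is the same shape as the two backtraces. callbackGotLock(false) returns true, because the lock is released before the wait. This suggests that either releasing the connection lock before the synchronous stop, or stopping the timer without waiting (the backtrace shows a wait argument to PTimer::Stop), would break the cycle. Whether either change is safe inside Opal has not been verified here.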