I sometimes get a deadlock in dbus-c++ while debugging my application under gdb.
It happens when I interrupt the program after a DBus request has been sent but before the reply has been received. If I then spend 25 seconds or more debugging (so the DBus timeout expires) and continue execution, the dispatcher thread deadlocks.
Backtrace:
(gdb) thread 4
[Switching to thread 4 (Thread 0xab721b40 (LWP 20192))]
#0 0xb7fd9d39 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7fd9d39 in __kernel_vsyscall ()
#1 0xb3e0bde2 in __lll_lock_wait () from /lib/i386-linux-gnu/libpthread.so.0
#2 0xb3e0592e in pthread_mutex_lock () from /lib/i386-linux-gnu/libpthread.so.0
#3 0xb72917c8 in DBus::DefaultMutex::lock (this=0x5deb2d2c) at eventloop.cpp:104
#4 0xb7291c8a in DBus::DefaultTimeout::~DefaultTimeout (this=0x7a74a9e8, __in_chrg=<optimized out>) at eventloop.cpp:59
#5 0xb72933ce in DBus::BusTimeout::~BusTimeout (this=0x7a74a9e0, __in_chrg=<optimized out>) at ../include/dbus-c++/eventloop-integration.h:44
#6 DBus::BusTimeout::~BusTimeout (this=0x7a74a9e0, __in_chrg=<optimized out>) at ../include/dbus-c++/eventloop-integration.h:44
#7 0xb724dc43 in _dbus_timeout_list_remove_timeout (timeout_list=0x5df1b710, timeout=0x7a73e170) at ../../../dbus/dbus-timeout.c:347
#8 0xb723814a in protected_change_timeout (enabled=0, toggle_function=0x0, remove_function=<optimized out>, add_function=0x0, timeout=<optimized out>, connection=0x5df1cef8) at ../../../dbus/dbus-connection.c:841
#9 _dbus_connection_remove_timeout_unlocked (timeout=<optimized out>, connection=0x5df1cef8) at ../../../dbus/dbus-connection.c:888
#10 reply_handler_timeout (data=0x7a73ddd0) at ../../../dbus/dbus-connection.c:3344
#11 0xb7290bbb in DBus::Timeout::handle (this=0x7a74a9e0) at dispatcher.cpp:58
#12 0xb729288d in DBus::BusDispatcher::timeout_expired (this=0x5deb2ce0, et=...) at eventloop-integration.cpp:201
#13 0xb7291b52 in DBus::Slot<void, DBus::DefaultTimeout&>::operator() (param=..., this=<optimized out>) at ../include/dbus-c++/util.h:240
#14 DBus::DefaultMainLoop::dispatch (this=0x5deb2d20) at eventloop.cpp:221
#15 0xb7292610 in DBus::BusDispatcher::enter (this=0x5deb2ce0) at eventloop-integration.cpp:100
...
DBus requests are sent from another thread, but that thread is not involved in the deadlock.
The deadlock is in the dispatcher thread itself, which tries to lock a non-recursive mutex that it already holds: DefaultMainLoop::_mutex_t.
It happens in DBus::DefaultMainLoop::dispatch(), in the following code:
  _mutex_t.lock();
  ti = _timeouts.begin();
  while (ti != _timeouts.end())
  {
    DefaultTimeouts::iterator tmp = ti;
    ++tmp;
    if ((*ti)->enabled() && now_millis >= (*ti)->_expiration)
    {
      (*ti)->expired(*(*ti));
      if ((*ti)->_repeat)
      {
        (*ti)->_expiration = now_millis + (*ti)->_interval;
      }
    }
    ti = tmp;
  }
  _mutex_t.unlock();
dispatch() locks _mutex_t and then calls "(*ti)->expired(*(*ti));". In this case (see frames #4 to #12 of the backtrace) that call ends up deleting the timeout, and the DefaultTimeout destructor tries to lock _mutex_t again:
DefaultTimeout::~DefaultTimeout()
{
  _disp->_mutex_t.lock();
  _disp->_timeouts.remove(this);
  _disp->_mutex_t.unlock();
}
The result is a self-deadlock: a single thread blocks forever on a non-recursive mutex that it already owns.
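To illustrate the difference between the mutex types outside of dbus-c++, here is a minimal sketch using plain pthreads. The helper name relock_result is hypothetical and not part of dbus-c++. Instead of a default mutex (where a second lock from the owning thread may simply hang, as in the backtrace above), it uses an error-checking mutex, so the relock attempt is reported as EDEADLK rather than blocking. It then shows that a recursive mutex accepts the second lock.

```cpp
#include <pthread.h>
#include <cerrno>

// Hypothetical helper: locks a mutex of the given type twice from the
// calling thread and returns the result of the second lock attempt.
int relock_result(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, type);

    pthread_mutex_t m;
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);           // first lock, as in dispatch()
    int rc = pthread_mutex_lock(&m);  // second lock, as in ~DefaultTimeout()

    if (rc == 0)
        pthread_mutex_unlock(&m);     // undo the second (recursive) lock
    pthread_mutex_unlock(&m);         // undo the first lock
    pthread_mutex_destroy(&m);
    return rc;
}
```

relock_result(PTHREAD_MUTEX_ERRORCHECK) returns EDEADLK, while relock_result(PTHREAD_MUTEX_RECURSIVE) returns 0.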
Exactly the same problem is also described here: https://sourceforge.net/p/dbus-cplusplus/mailman/message/28745127/
I tried making DefaultMainLoop::_mutex_t recursive, and that fixed the problem.
Patch:
--- eventloop.cpp 2020-04-06 12:26:06.902743512 +0300
+++ NEW_eventloop.cpp 2020-04-06 12:26:00.838743134 +0300
@@ -110,6 +110,7 @@
 }
 DefaultMainLoop::DefaultMainLoop() :
+  _mutex_t(true),
   _mutex_w(true)
 {
 }
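For reference, here is a simplified model of why a recursive mutex lets this pattern complete. It is only a sketch: Loop and Timeout are hypothetical stand-ins for DefaultMainLoop and DefaultTimeout, std::recursive_mutex stands in for the recursive DefaultMutex, and "delete *it" stands in for an expired() callback that ends up destroying the timeout.

```cpp
#include <list>
#include <mutex>

struct Loop;

// Hypothetical stand-in for DefaultTimeout: registers itself with the loop
// on construction and unregisters itself on destruction.
struct Timeout {
    explicit Timeout(Loop* loop);
    ~Timeout();
    Loop* loop_;
};

// Hypothetical stand-in for DefaultMainLoop.
struct Loop {
    std::recursive_mutex mutex;   // recursive, like _mutex_t(true) in the patch
    std::list<Timeout*> timeouts;
    int dispatch();
};

Timeout::Timeout(Loop* loop) : loop_(loop)
{
    std::lock_guard<std::recursive_mutex> guard(loop_->mutex);
    loop_->timeouts.push_back(this);
}

Timeout::~Timeout()
{
    // Re-entrant lock: the dispatching thread already owns the mutex.
    std::lock_guard<std::recursive_mutex> guard(loop_->mutex);
    loop_->timeouts.remove(this);
}

int Loop::dispatch()
{
    std::lock_guard<std::recursive_mutex> guard(mutex);
    int handled = 0;
    std::list<Timeout*>::iterator it = timeouts.begin();
    while (it != timeouts.end())
    {
        std::list<Timeout*>::iterator next = it;
        ++next;        // saved before the callback, like tmp in dispatch()
        delete *it;    // "callback" destroys the timeout and erases its node
        ++handled;
        it = next;     // the erased element is not touched again
    }
    return handled;
}
```

With two heap-allocated Timeout objects registered, dispatch() returns 2 and leaves the list empty. With a plain std::mutex in place of std::recursive_mutex, the second lock in the destructor would be undefined behavior and would typically hang. The sketch deliberately does not access the element after the callback has run.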