#317 Thread outlives SimulationIface object causing seg fault

Nathan Koenig
gazebo (59)

Using trunk revision 8482.

I'm trying to drive gazebo from an external program through the libgazebo interface. When I call SimulationIface::Go I get a seg fault. I believe the seg fault is caused by a thread (goAckWait) that is still running after the SimulationIface object that created it has been destroyed (see the more detailed explanation below).

The attached tar contains:
testlibgazebo.cc : test program causing the seg fault.
my_testworld.world : the model used by gazebo
pioneer2dx.world : model of pioneer loaded by my_testworld.world (exact copy of model in the trunk).
fix_stop_blockthread.patch : patch with a quick fix which I made to get further.

The following terminal output shows the gazebo server being started with the model, followed by testlibgazebo, which ends in the seg fault:

[barry@harold build]$ gazebo -u ../worlds/my_test.world &
[1] 13934
[barry@harold build]$ Gazebo multi-robot simulator, version 0.9.0

Part of the Player/Stage Project [http://playerstage.sourceforge.net].
Copyright (C) 2003 Nate Koenig, Andrew Howard, and contributors.
Released under the GNU General Public License.

Gazebo Path[/usr/local/share/gazebo]
Ogre Path[/usr/lib/OGRE]
directory [/tmp/gazebo-barry-0] already exists (previous crash?)
but the owner gazebo server (pid=13271) is not running.
deleting the old information of the directory [/tmp/gazebo-barry-0]
Gazebo successfully initialized

[barry@harold build]$ ./testlibgazebo
opening /tmp/gazebo-barry-0/simulation.default 112 33842332
opening /tmp/gazebo-barry-0/simulation.default +112 33842332
opening /tmp/gazebo-barry-0/position.pioneer2dx_model::position_iface_0 +112 156
sending go request : +0

I think the following is causing the seg fault:
The program testlibgazebo connects to the gazebo server using SimulationIface::ConnectWait (Client.cc). In SimulationIface::ConnectWait a SimulationIface is created and opened to check that we are not connecting to stale leftovers from a previous gazebo crash. While the simulation interface is being opened (in SimulationIface::Open), a thread goAckWait (blockthread) is created and starts waiting for a goAckPost from the server.
After SimulationIface::ConnectWait has determined that the connection is valid, it closes the SimulationIface, and the object is deleted when the method exits. However, the goAckWait thread is still alive and waiting for a goAckPost, while the data it acts on is now invalid!

Later in the program we send a SimulationIface::Go request, which results in a goAckPost from the server. The goAckWait thread wakes up and operates on the invalid data, resulting in a seg fault.

I made the following changes to get around the problem (fix_stop_blockthread.patch):

Defined a SimulationIface::Close() that overrides Iface::Close. SimulationIface::Close stops the goAckWait thread: an interrupt request is sent to the thread, we wait for it to join, and then we call the base-class Iface::Close.
To let the goAckWait thread react to an interrupt request, I changed SimulationIface::GoAckWait to use semtimedop so that it checks for an interrupt request every 10 ms. If there is one, a boost::thread_interrupted exception is thrown.
I changed SimulationIface::BlockThread to catch this exception. When the exception is caught, the thread simply exits.
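
For readers without the attachment, the idea behind the fix can be sketched roughly as follows. This is not the contents of fix_stop_blockthread.patch and not libgazebo code: AckWaiter, PostAck and AcksHandled are hypothetical names made up for illustration. The sketch also swaps boost threads and semtimedop for std::thread and a std::condition_variable with a 10 ms timeout, and signals the stop request through a return value rather than a boost::thread_interrupted exception.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for SimulationIface (assumption, not the real class).
class AckWaiter {
public:
  AckWaiter() = default;
  // Joining in the destructor guarantees the thread never outlives the object.
  ~AckWaiter() { Close(); }

  AckWaiter(const AckWaiter&) = delete;
  AckWaiter& operator=(const AckWaiter&) = delete;

  // Plays the role of Open(): start the thread that waits for acks.
  void Open() {
    if (worker_.joinable()) return;  // already open
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = false;
    }
    worker_ = std::thread(&AckWaiter::BlockThread, this);
  }

  // Simulates the server posting an acknowledgement (goAckPost).
  void PostAck() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ++pending_;
    }
    cv_.notify_one();
  }

  // Plays the role of Close(): request a stop, then wait for the join.
  // Safe to call more than once.
  void Close() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    if (worker_.joinable()) worker_.join();
  }

  int AcksHandled() const { return handled_.load(); }

private:
  // Waits for an ack, waking at least every 10 ms to look for a stop request.
  // Returns false when no ack is pending and a stop was requested.
  bool GoAckWait() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (pending_ == 0) {
      if (stop_) return false;
      cv_.wait_for(lock, std::chrono::milliseconds(10));
    }
    --pending_;
    return true;
  }

  // Thread body: handle acks until GoAckWait reports a stop request.
  void BlockThread() {
    while (GoAckWait()) ++handled_;
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  int pending_ = 0;        // guarded by mutex_
  bool stop_ = false;      // guarded by mutex_
  std::atomic<int> handled_{0};
  std::thread worker_;
};
```

The important property is the ordering in Close(): the stop request is made and the thread is joined before any of the members the thread touches are destroyed. In the scenario above, that is the step missing when the temporary SimulationIface created during ConnectWait goes away.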


  • Barry

    tar file with patch, program and world model

  • Nathan Koenig

    • status: open --> closed-fixed