From: Edward d'A. <tru...@gm...> - 2015-09-04 08:02:18
On 4 September 2015 at 08:47, Rebecca N. Palmer <reb...@zo...> wrote:
> On 04/09/15 02:29, wki...@gm... wrote:
>> what would be very good to see would be the ability to build a debug version
>> that will output the address and maybe even the source code line of the fault...
>> there may be an option for this in the download_and_compile.sh script i'm using
>> but i've not found it...
>
> Build with -DCMAKE_BUILD_TYPE=RelWithDebInfo (as this means "enable
> optimization but include debug information" it could simply be made the
> default; in Debian it already is), then run in a debugger (e.g. gdb
> --args fgfs --launcher).

Debugging GUI problems can be quite hard, as GUI toolkits are, by design,
multi-threaded and many issues are due to what is known as racing
(https://en.wikipedia.org/wiki/Race_condition). For such debugging, I would
suggest researching 'helgrind'
(http://valgrind.org/docs/manual/hg-manual.html,
https://www.kdab.com/~dfaure/helgrind.html). You'll quickly work out where
the 'hell' in this comes from ;) It is not referring to itself!

Sometimes the racing will not be caught by a debugger and you need to do
some detective work, and here copying and using lots of SG_LOG() function
calls can be useful. Note that for the GUI bugs, nothing in the STDOUT and
STDERR terminal printouts or in the FG log will likely be of help. They
might help a little with the detective work, but more often than not they
will send you in the wrong direction.

One other issue is that racing sometimes only happens in a non-debugging
version of the program. The debug build can execute slower in the parts
where the racing happens, or the order of threaded operations can be
different, so you will sometimes only see the bug in the non-debugging fgfs
binary. In this case, the detective work is all you have.

For the termination error though, I have seen this often when an
application closes before all the GUI elements have had their destructor
function properly called.
My main experience is with the wxWidgets toolkit, which provides OS native
widgets, and there is a Mac OS X vs. Linux issue there. You can use the
Close() or Destroy() functions on Windows and Linux, but on Mac OS X, the
Destroy() function tends to destroy the whole program rather than just the
GUI element. So maybe something like this is at play.

Note that as the QtLauncher transitions into a full GUI to replace the PUI
GUI, many of these issues will disappear, and new ones will appear. In any
case, debugging a modern GUI is much harder than debugging the non-GUI
parts. A basic knowledge of threading, locks, and racing is quite useful
for understanding and catching these bugs.

Regards,

Edward


P. S. The relevant helgrind messages for this problem, which are quite
informative about where the failure occurred, are:

catalog download failure:http://fgfs.goneabitbursar.com/pkg/3.7.0/default-catalog.xml
==7939==
==7939== Process terminating with default action of signal 11 (SIGSEGV)
==7939==  Access not within mapped region at address 0x41
==7939==    at 0x9E493DB: operator<<(QDataStream&, QString const&) (in /usr/lib64/libQt5Core.so.5.4.0)
==7939==    by 0x1F944C3: AircraftItem::toDataStream(QDataStream&) const (AircraftModel.cxx:109)
==7939==    by 0x1F969C2: AircraftScanThread::writeCache() (AircraftModel.cxx:224)
==7939==    by 0x1F964CB: AircraftScanThread::run() (AircraftModel.cxx:181)
==7939==    by 0x9DB027E: ??? (in /usr/lib64/libQt5Core.so.5.4.0)
==7939==    by 0x4C2DCB9: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==7939==    by 0x5C6C5BC: start_thread (in /usr/lib64/libpthread-2.20.so)
==7939==    by 0xAAED5CC: clone (in /usr/lib64/libc-2.20.so)
==7939==  If you believe this happened as a result of a stack
==7939==  overflow in your program's main thread (unlikely but
==7939==  possible), you can try to increase the size of the
==7939==  main thread stack using the --main-stacksize= flag.
==7939==  The main thread stack size used in this run was 8388608.
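
For anyone unfamiliar with the threading side of this: the trace above ends inside a worker thread (AircraftScanThread::run() calling writeCache()), and one way a crash like that can arise is the worker still running while the data it uses is being torn down. That is only a guess, not a diagnosis of this bug. Below is a minimal sketch of the usual defence in plain standard C++, with no Qt or FlightGear code involved. The class name Scanner and all of its members are hypothetical and made up for illustration. It shows two things: shared data guarded by a lock, and the owner joining its worker in the destructor before any member goes away.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a background scanner. This is NOT the
// FlightGear AircraftScanThread, only an illustration of the pattern.
class Scanner {
public:
    // The worker starts as soon as the object is constructed.
    Scanner() : worker_(&Scanner::run, this) {}

    // Ask the worker to stop and wait for it, so that run() can never
    // touch items_ or mutex_ after they have been destroyed.
    ~Scanner() {
        stop_ = true;
        if (worker_.joinable()) {
            worker_.join();
        }
    }

    // Read the shared data under the same lock the worker uses.
    std::size_t count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    // Worker body: append up to 1000 entries, checking the stop flag
    // before each one.
    void run() {
        for (int i = 0; i < 1000 && !stop_; ++i) {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back("item " + std::to_string(i));
        }
    }

    std::mutex mutex_;
    std::vector<std::string> items_;
    std::atomic<bool> stop_{false};
    // Declared last so the thread only starts once the members above exist.
    std::thread worker_;
};
```

Creating a Scanner and letting it go out of scope straight away is safe here because the destructor blocks until run() has returned. Without the join (or with the lock removed from either side) this is the sort of code helgrind tends to complain about.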