#230 Unhandled exceptions escape event handlers

6.34.1
closed
1
2014-04-27
2012-12-25
Vidar Hasfjord
No

OWLNext, based on 32-bit OWL 5, assumes that the implementation of C++ exceptions interacts with Windows system exceptions (Structured Exception Handling) in such a manner that unhandled C++ exceptions in event handlers are transported across the Windows API boundary and rethrown in the message loop.

This behaviour is non-standard, compiler-dependent and unreliable (e.g. see "OWL application crashes on system shutdown on Windows 8 64-bit").

The usual advice is to ensure that C++ exceptions do not escape callbacks from a C API. This was solved in 16-bit OWL 5 code by an exception firewall in TWindow::ReceiveMessage and the usage of TApplication::SuspendThrow and ResumeThrow to transport unhandled exceptions to the message loop. See ReceiveMessage in 6.20.

1 Attachments

Related

Bugs: #231
Bugs: #249
Bugs: #324
Discussion: OWL application crashes on system shutdown on windows8 64 bits
Discussion: Planning for 6.34
Discussion: New handling for unhandled exceptions in event handlers
Discussion: New handling for unhandled exceptions in event handlers
Discussion: New handling for unhandled exceptions in event handlers
Discussion: New handling for unhandled exceptions in event handlers
Discussion: New handling for unhandled exceptions in event handlers
Discussion: Problem with 64-bit DialogDataTransfer
Discussion: 6.32.5 for Christmas?
Discussion: 6.32.5 for Christmas?
Discussion: 6.32.7 and 6.34.1 milestones (bug-fix updates)
News: 2013/02/roadmap-refresh
Wiki: OWLNext_Stable_Releases

Discussion

  • Vidar Hasfjord
    Vidar Hasfjord
    2012-12-29

    I've added a preliminary patch applicable to revision 1677 of the trunk. The patch also includes a new test project "examples/Classes/exceptiontransport" (only a Visual Studio 2012 solution so far).

    Remarks about the patch:

    • The patch restores the functions TApplication::SuspendThrow and ResumeThrow used by the 16-bit OWL 5 implementation (still present in OWLNext 6.20). These functions are used to implement an exception firewall in TWindow::ReceiveMessage and TDialog::StdDlgProc.
    • ResumeThrow is inserted in all the places where it occurred in 16-bit OWL, but note that this may not cover all the places it needs to be called in the current code. So careful review is required. ResumeThrow is necessary after any call into the Windows API, where that call may lead to a callback to C++ code (OWLNext or client code), e.g. after every ::SendMessage call (see TWindow::SendMessage).
    • Instead of the more complex and less general implementation in OWL, the patch uses std::exception_ptr (C++11) to transport the unhandled exception. This feature is supported on VC++ 2010 and later.
    • Support for C++Builder needs to be added. If std::exception_ptr is not supported, we can use boost::exception_ptr, which should be similar.
    • If we want to support BC++ 5.5 and VC++ 2008, we may have to use an alternative transport mechanism (e.g. based on the 16-bit code).
     
  • Vidar Hasfjord
    Vidar Hasfjord
    2012-12-30

    The proposed patch above is not binary compatible (since it adds TApplication::CurrentException), and it is not obvious to me how a binary compatible solution can be implemented.

    For this reason, I've created a more specific ticket [bugs:#231] to deal with the shutdown issue in particular.

     

    Related

    Bugs: #231

  • Support for C++Builder needs to be added. If std::exception_ptr is not supported, we can use boost::exception_ptr, which should be similar.

    Yes, the 32-bit C++ Builder compiler does not support std::exception_ptr.
    The 64-bit compiler in XE3 supports it.

     
  • Vidar Hasfjord
    Vidar Hasfjord
    2013-01-06

    I've now applied the patch to the trunk as a starting point [r1692]. As noted in the revision log and above, the code will need further test, review and work to support compilers without std::exception_ptr.

     

    Related

    Commit: [r1692]

  • Vidar Hasfjord
    Vidar Hasfjord
    2014-03-13

    • status: closed --> open
    • assigned_to: Ognian Tchernokojev --> Vidar Hasfjord
    • Group: 6.34 --> 6.34.1

    I have reopened this ticket, since the solution still has issues.

    In particular, the solution does not yet work for dialogs (as of 6.34.0). The problem occurs with dialogs that are executed using TDialog::Execute, which in turn calls TDialog::DoExecute, which calls ::DialogBoxParam or ::DialogBoxIndirectParam. The latter Windows API functions run an internal message loop that is not broken by suspended exceptions (suspended by calls to TApplication::SuspendThrow, e.g. in TWindow::ReceiveMessage). Consequently, an exception in a dialog is silently ignored until the dialog is closed and execution returns to the TApplication message loop, at which point the exception is rethrown (causing application shutdown if not handled).

    Note that this problem affects OWLNext diagnostics macros PRECONDITION and CHECK. When checks fail these macros cause a message box to be shown in which "Abort" is an option. This option throws an exception. But, due to the problem stated, this does not have an immediate effect within dialogs.

    Also note that the application may be left in an unstable state while the exception is in limbo within the dialog message loop. Further exceptions may occur, overwriting the previous.

    Finally, note that this (i.e. having exceptions in limbo within a message loop) is a problem whenever there is a message loop outside the control of OWLNext, i.e. user-defined message loops, in which case it may affect more than just dialogs.

     
    Last edit: Vidar Hasfjord 2014-03-13
  • Vidar Hasfjord
    Vidar Hasfjord
    2014-03-13

    A fix for the dialog exception problem was committed to the trunk in [r1971] and merged to the 6.34 branch in [r1972].

    To solve the issue, I added code within TDialog::StdDlgProc to check whether any exception has been suspended. If this is the case, EndDialog is called to break the message loop. The exception is then rethrown by the existing call to ResumeThrow within TDialog::Execute.

    Note that, to make this work in my test case, I had to modify the code in TWindow::DefWindowProc so that the call to ResumeThrow here is not performed if an exception already was suspended when DefWindowProc was called, i.e. the code now does not rethrow a pre-existing exception. Why this change was necessary, and whether it is sufficient to do this only here, I still do not fully understand.

    Please review.

     

    Related

    Commit: [r1971]
    Commit: [r1972]

  • Vidar Hasfjord
    Vidar Hasfjord
    2014-03-20

    Unfortunately, the fix for the dialog exception problem does not work for Property Sheets. Property Sheets does not support the call to EndDialog in the dialog procedure:

    pfnDlgProc, Type: DLGPROC; Pointer to the dialog box procedure for the page. Because the pages are created as modeless dialog boxes, the dialog box procedure must not call the EndDialog function.

    PROPSHEETPAGE structure documentation

    The constructor of TPropertyPage sets pfnDlgProg to TPropertyPage::PropDlgProc, which in turn calls TDialog::StdDlgProc. In the case of an unhandled exception, the new call to EndDialog within StdDlgProc causes the property sheet to freeze (due to endless recursive calls).

    So we need another way to force property sheets to close in the face of unhandled exceptions.

     
  • Vidar Hasfjord
    Vidar Hasfjord
    2014-03-22

    • status: open --> pending

    I have now made a major review of the exception transport [r1981]. It solves the problem with suspended exceptions in private message loops. TApplication::SuspendThrow now calls PostQuitMessage to force all message loops to break, and thus ensure, along with further code changes, that the suspended exception propagates to, and out of, the main message loop. This solution replaces the need for EndDialog calls and other hacks.

    But these changes have a major implication: The user can no longer choose to ignore an unhandled OWL exception and resume the application. See the revision log for more details.

    In the end I think this is a good thing. The application is in a undefined and possibly corrupt state after an unhandled exception. So it was bad programming to rely on this brittle feature. For example, choosing to ignore an unhandled exception within a dialog in 6.32 will just cause the application to become unresponsive.

    Edit: The changes have now been merged to the 6.34 branch [r1982].

     

    Related

    Commit: [r1981]
    Commit: [r1982]


    Last edit: Vidar Hasfjord 2014-03-22
  • Vidar Hasfjord
    Vidar Hasfjord
    2014-04-27

    • Status: pending --> closed