#97 QuickFix.Message.addGroup() can crash process

open
nobody
None
5
2012-10-26
2012-10-26
Laconicus
No

When adding repeating groups to a message in QuickFix for .NET, it appears that in some rare scenarios a call to QuickFix.Message.addGroup() can result in memory corruption (often manifesting as an AccessViolationException or StackOverflowException, neither of which can be caught since .NET 4 and .NET 2 respectively).

The following code exhibits the issue (although as the underlying cause is a race condition, it may be necessary to execute it a significant number of times before the error occurs:

Message CreateMessage()
{
var message = new Message();
var group = new Group(NoPartyIDs.FIELD, PartyID.FIELD);
message.addGroup(group);
return message;
}

When the error manifests, the following stack trace is typical:

Exception Info: System.AccessViolationException
Stack:
at <Module>.FIX.FieldMap.addGroup(FIX.FieldMap*, Int32, FIX.FieldMap*, Boolean)
at QuickFIXTest.Program.CreateMessage()
at QuickFIXTest.Program.Main()

Having investigated the issue appears to be a race condition as follows:

The QuickFix::Message::addGroup() method (.NET\Message.h) is implemented as:

void addGroup( Group* group )
{
unmanaged().addGroup( group->unmanaged() );
}

Once addGroup() calls group->unmanaged(), the "group" parameter is no longer used, either within addGroup() itself, or within the example caller above. This leaves the managed Group instance eligible for garbage collection before the unmanaged addGroup() method has begun executing. If a GC occurs at this point, the Group instance will be placed in the Finalizer queue, and since the finalizer for Group deletes the unmanaged group, it is possible for the unmanaged addGroup() to attempt to add a now-destroyed group to the message.

Whilst it is possible to workaround this by explicitly Dispose()ing the group after adding it to the message, this should not be required and should certainly not cause the application to crash (however infrequently) if omitted.

Having reviewed some of the code it appears that there is the possibility for the same issue to occur elsewhere as well (e.g. in Message::replaceGroup() and the equivalents in QuickFix::Message::Header and QuickFix::Message::Trailer).

Discussion