I've been building a .NET Core wrapper for 7z. In .NET, all objects are managed automatically and garbage collected lazily in one or more garbage collector threads.
7z's COM objects use non-atomic methods for AddRef
and Release
. In this scenario, during an ::Expand()
operation, 7z generates a ton of CLimitedSequentialInStream
's all referencing the same parent stream. When these streams are lazily disposed by multiple garbage collector threads, they also release their reference to their parent stream.
Since they all share the same parent stream, when all these objects are disposed in a multi-threaded Release, the AddRef
/Release
exhibits unpredictable behavior resulting usually in a crash due to use-after-free or a heap corruption error.
C/Threads.h
and Threads.c
declare a cross-platform method for InterlockedIncrement
, but no InterlockedDecrement
. I added an InterlockedDecrement
in here, and modified MyCom.h
to perform proper atomic addref's and release's, which ultimately resolved the problem.
CLimitedSequentialInStream
is internal class inside 7z library. that object is created and destoyed in 7z code.7-zip calls
Release()
inCMyComPtr
class destructors and in some another code.So why it doesn't work in NET?
How
CMyComPtr
destructor work in .NET?Do you compile whole 7-zip code in some
.NET
mode?Maybe you can you compile 7-zip code in usual
C++
mode, and then call that C++ code from .net code?I don't have to compile the 7z code at all, I can use your binaries from the website and have the same result.
The problem is that in .NET, the garbage collector runs in a separate thread. The thread that caused 7z to create an object is not the same thread that will call
Release
on that object. Because of that, there is a race condition where 7z can be executingAddRef
at the same time it is executingRelease
on the same object.A sample scenario:
1. Main .NET thread A calls
IInArchiveGetStream::GetStream()
. 7z internally createsCLimitedSequentialInStream
, which in turn calls::SetStream
with a shared parentIInStream
(triggering anAddRef
on that pointer) and returns the interface forISequentialInstream
back to the caller.2. Thread A uses the interface to do something and then the object goes out of scope.
3. Thread A loops back to step 1 to get another stream.
4. .NET background Thread B then garbage collects the object from step 2.
5. Thread A is inside 7z code, which has created a new
CLimitedSequentialInsStream
and internally calls::SetStream
and passes in a parentIInStream
object to it, which triggersAddRef
to be called on the parentIInStream
object.6. Thread B has called
Release
on the out of scopeISequentialInStream
which inside 7z triggers adelete
which triggers a call toRelease
on the shared parentIInStream
reference.7. Thread A in step 5 is executing at the same time as thread B in step 6. One thread is calling
AddRef
on the sharedIInStream
, and the other thread is callingRelease
on the same exact pointer. SinceAddRef
andRelease
are not atomic, the_m_RefCount
gets stomped and is no longer accurate.I can replicate this scenario in C++ without using .NET easily by having one thread call
GetStream
repeatedly and having another thread callRelease
on the returned objects in parallel. Eventually the shared parent stream gets destroyed by 7z because the reference count gets messed up and 7z will crash.In an ideal world it would be great if .NET didn't work this way and I could deterministically destroy objects in the same thread that created them, but unfortunately in .NET Core you can't call
Release
directly. Technically there is a call that can do it, but it only works in Windows and I'm not running in Windows.7-zip code doesn't allow to work with multiple GetStream() objects simultaneously from same archive handler object.
So the caller must close GetStream() object before requesting another GetStream() object.
Technically it's not "working with", it's just calling Release from the other thread which triggers 7z to delete the object. So if you're saying 7z can't delete an object created by another thread, then yeah there's a serious problem using 7z from .NET.
In the testing I've done with 7z, all I've needed to do is make AddRef/Release atomic. Once I did that, everything in .NET seems to work perfectly. The only call .NET will ever make to a 7z object in a different thread is
Release
.If
Release
is atomic, then the subsequentdelete
is also going to be atomic and only ever called once. So unless 7z is using thread-static member variables, deleting an object from another thread shouldn't be a problem I would think?7-zip uses such scheme:
GetStream
use stream
Release
...
GetStream
use stream
Release
For example, it's not allowed to use 2 streams from same archive simultaneously.
We don't need atomic Release in 7zip code now.
Maybe there is way in .net to call Release in same places as done in C++ code of 7-Zip.
Unfortunately there's no way to explicitly release a com object from .net core that works on all platforms. The garbage collector always runs in a separate thread, so objects will always have their
Release
called from that thread.The only methods .NET core has to release a com object are tagged as Windows-only, and not supported from Linux or OSX:
https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.marshal.finalreleasecomobject?view=net-8.0
https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.marshal.releasecomobject?view=net-8.0
I tested the code with
Z7_COM_USE_ATOMIC
.But I will not use
Z7_COM_USE_ATOMIC
by default.Last edit: Igor Pavlov 2024-03-28
This probably goes without saying, but I assume you tested this in Linux? I had to add definitions for
InterlockedDecrement
inThreads.h
andThreads.c
for GCC.Incidentally, what's wrong with using the Interlocked atomic functions by default for the reference count? Are you concerned about performance implications?
I tested both Linux and Windows.
InterlockedDecrement
in Release by default gives slightly larger code and maybe slower, if we have very big calling rate. Maybe__sync_sub_and_fetch
andInterlockedDecrement
are fast. But I didn't test it.Also we need some code stubs for single thread compiling.
Actually because I don't need that feature in my C++ 7-Zip code, I don't want to use
InterlockedDecrement
by default. I'll think it more later.Last edit: Igor Pavlov 2024-03-28